Methods For Evaluating The Attainment Of Cleanup Standards Volume 2 Groundwater


United States                Policy, Planning,        EPA 230-R-92-014
Environmental Protection     And Evaluation           July 1992
Agency                       (PM-222)
Methods For Evaluating The
Attainment Of Cleanup
Standards

Volume 2: Ground  Water

-------
    Methods for Evaluating the
Attainment of Cleanup Standards
     Volume 2: Ground Water
   Environmental Statistics and Information Division (PM-222)
         Office of Policy, Planning, and Evaluation
         U.S. Environmental Protection Agency
               401 M Street, S.W.
              Washington, DC 20460

                  July, 1992

-------
                                   DISCLAIMER
This report was prepared under contract to an agency of the United States Government.
Neither  the United States Government nor  any of  its employees,  contractors,
subcontractors, or their employees makes any warranty, expressed or implied, or assumes
any legal liability or responsibility for any third party's  use or the results  of such use of any
information, apparatus, product, model, formula, or process disclosed in this report, or
represents that its use by such third party would not infringe on privately owned rights.


Publication of the data in this document does not signify that the contents necessarily reflect
the joint or separate views and policies of each co-sponsoring agency. Mention of trade
names or commercial products does not constitute endorsement or recommendation for  use.

-------
                           TABLE OF  CONTENTS

                                                                     Page

EXECUTIVE SUMMARY	xxi

1.     INTRODUCTION	1-1

       1.1     General Scope and Features of the Guidance Document	 1-1
              1.1.1    Purpose	  1-1
              1.1.2    Intended Audience and Use	  1-3
              1.1.3    Bibliography, Glossary, Boxes, Worksheets,
                      Examples, and References to "Consult a
                      Statistician"	  1-4
       1.2     Use of this  Guidance  in Ground-Water Remediation
              Activities	  1-5
              1.2.1    Pump-and-Treat   Technology	  1-5
              1.2.2    Barrier  Methods  to Protect Ground Water	 1-6
              1.2.3    Biological  Treatment	  1-6
       1.3     Organization of this Document	  1-7
       1.4     Summary	 1-8

2.     INTRODUCTION TO  STATISTICAL CONCEPTS AND
       DECISIONS	2-1

       2.1     A Note on Terminology	  2-2
       2.2     Background for the Attainment Decision	2-2
              2.2.1    A  Generic Model of Ground-Water Cleanup
                      Progress 	  2-3
              2.2.2    The Contaminants to be Tested	  2-5
              2.2.3    The Ground-Water System to be Tested	2-6
              2.2.4  The  Cleanup Standard	2-6
              2.2.5    The Definition of Attainment	  2-7
       2.3     Introduction  to Statistical  Issues  For  Assessing Attainment	2-8
              2.3.1    Specification of the Parameter to be Compared to
                      the Cleanup Standard	2-8
              2.3.2    Short-term Versus Long-term Tests	2-13
              2.3.3    The Role of Statistical  Sampling and Inference in
                      Assessing  Attainment	 2-15

-------
             2.3.4    Specification of Precision and  Confidence Levels
                      for Protection Against Adverse Health and
                      Environmental Risks	2-17
             2.3.5    Attainment Decisions Based on Multiple Wells	2-20
             2.3.6    Statistical Versus Predictive Modeling	2-24
             2.3.7    Practical Problems with the Data Collection and
                      Their Resolution	2-25
       2.4    Limitations and Assumptions of the Procedures Addressed
             in this Document	2-28
       2.5    Summary	2-28

3.      SPECIFICATION OF ATTAINMENT OBJECTIVES	3-1

       3.1    Data Quality  Objectives	  3-3
       3.2    Specification of the Wells to be Sampled	  3-3
       3.3    Specification of Sample  Collection and Handling
             Procedures	  3-3
       3.4    Specification of the Chemicals to  be  Tested and Applicable
             Cleanup Standards	  3-4
       3.5    Specification of the Parameters to Test	  3-4
             3.5.1    Selecting the  Parameters  to Investigate	3-5
             3.5.2    Multiple Attainment Criteria	  3-8
       3.6    Specification of Confidence Levels for Protection  Against
             Adverse Health  and  Environmental Risks	  3-8
       3.7    Specification of the Precision to be Achieved	3-9
       3.8    Secondary Objectives	  3-10
       3.9    Summary	3-10

4.      DESIGN OF THE SAMPLING AND ANALYSIS PLAN	4-1

       4.1    The  Sample Design	  4-1
             4.1.1    Random Sampling	  4-2
             4.1.2    Systematic Sampling	4-2
             4.1.3    Fixed versus Sequential Sampling	4-4
       4.2    The Analysis Plan	4-5
       4.3    Other Considerations  for  Ground Water Sampling and
             Analysis Plans	  4-6
       4.4    Summary	 4-7

5.      DESCRIPTIVE  STATISTICS AND HYPOTHESIS  TESTING	5-1

       5.1    Calculating the Mean, Variance, and Standard Deviation of
             the  Data	  5-6
       5.2    Calculating the  Standard  Error of the Mean	5-7
             5.2.1    Treating the Systematic Observations as  a Random
                      Sample	  5-8

-------
             5.2.2    Estimates From Differences Between Adjacent
                      Observations	5-9
             5.2.3    Calculating the Standard Error After Correcting for
                      	5-10
             5.2.4    Calculating the Standard Error After Correcting for
                      Serial Correlation	5-13
       5.3    Calculating Lag 1 Serial Correlation	5-14
       5.4    Statistical Inferences: What can be Concluded from Sample
             Data	5-16
       5.5    The Construction and Interpretation of Confidence Intervals
             about Means	5-18
       5.6    Procedures for Testing for Significant Serial Correlation	5-21
             5.6.1    Durbin-Watson Test	5-21
             5.6.2    An Approximate Large-Sample Test	5-23
       5.7    Procedures for Testing the Assumption of Normality	5-23
             5.7.1    Formal Tests for Normality	5-24
             5.7.2    Normal Probability Plots	5-24
       5.8    Procedures for Testing Percentiles Using Tolerance
             Intervals	5-25
             5.8.1    Calculating a Tolerance Interval	5-25
             5.8.2    Inference: Deciding if the True Percentile is Less
                      than the Cleanup Standard	5-26
       5.9    Procedures for Testing Proportions	5-27
             5.9.1    Calculating Confidence Intervals for Proportions	5-28
             5.9.2    Inference: Deciding Whether the Observed
                      Proportion Meets the Cleanup Standard	5-29
             5.9.3    Nonparametric Confidence Intervals Around a Median	5-30
       5.10   Determining Sample Size for Short-Term Analysis and Other
             Data Collection Issues	5-33
             5.10.1   Sample Sizes for Estimating a Mean	5-34
             5.10.2   Sample Sizes for Estimating a Percentile Using
                      Tolerance Intervals	5-38
             5.10.3   Sample Sizes for Estimating Proportions	5-39
             5.10.4   Collecting the Data	5-40
             5.10.5   Making Adjustments for Values Below the
                      Detection Limit	5-41
       5.11   Summary	5-41

6.      DECIDING TO TERMINATE TREATMENT USING
       REGRESSION ANALYSIS	6-1

-------
             6.1.3    Assessing the Fit of the Model	6-11
             6.1.4    Inferences in Regression	6-13
       6.2    Using Regression to Model the Progress of Ground Water
             Remediation 	  6-26
             6.2.1    Choosing a Linear or  Nonlinear Regression	6-29
             6.2.2  Fitting  the  Model	 6-32
             6.2.3    Regression  in  the Presence  of Nonconstant
                      Variances	  6-32
             6.2.4    Correcting for Serial Correlation	6-33
       6.3    Combining  Statistical Information with Other Inputs to  the
             Decision Process	  6-38
       6.4    Summary	6-39

7.      ISSUES TO BE CONSIDERED BEFORE STARTING
       ATTAINMENT SAMPLING	  7-1

       7.1    The Notion of "Steady State"	7-2
       7.2    Decisions to be Made in Determining When a Steady State is
             Reached	  7-3
       7.3    Determining When  a Steady State Has Been Achieved	7-3
             7.3.1    Rough Adjustment  of Data for Seasonal Effects	7-5
       7.4    Charting  the  Data	  7-6
             7.4.1    A Test for  Change of Levels Based on Charts	7-7
             7.4.2    A Test  for Trends Based  on Charts	7-7
             7.4.3    Illustrations  and  Interpretation	 7-8
             7.4.4    Assessing  Trends via Statistical Tests	7-13
             7.4.5    Considering the Location of Wells	7-14
       7.5    Summary	7-14
8.      ASSESSING ATTAINMENT USING FIXED SAMPLE SIZE
       TESTS 	    8-1

       8.1    Fixed Sample Size Tests	   8-4
       8.2    Determining  Sample Size  and  Sampling Frequency	8-4
             8.2.1     Sample Size for  Testing Means	  8-5
             8.2.2    Sample Size for  Testing Proportions	8-9
             8.2.3    An Alternative Method  for Determining Maximum
                      Sampling Frequency	  8-11
       8.3    Assessing Attainment of the Mean Using Yearly Averages	8-12
       8.4    Assessing Attainment of the Mean After Adjusting for
             Seasonal  Variation	   8-20
       8.5    Fixed Sample Size Tests for Proportions	  8-25
       8.6    Checking for Trends in Contaminant Levels  After Attaining
             the Cleanup Standard	   8-26
       8.7    Summary	8-26

-------
9.     ASSESSING ATTAINMENT USING SEQUENTIAL TESTS	9-1
      9.1   Determining Sampling Frequency for Sequential Tests	9-5
      9.2   Sequential Procedures for Sample Collection and Data
            Handling	9-7
      9.3   Assessing Attainment of the Mean Using Yearly Averages	9-7
      9.4   Assessing Attainment of the Mean After Adjusting for
            Seasonal Variation	 9-18
      9.5   Sequential Tests for Proportions	 9-22
      9.6   A Further Note on Sequential Testing	9-23
      9.7   Checking for Trends  in Contaminant Levels After Attaining
            the Cleanup Standard	9-23
      9.8   Summary	9-24
BIBLIOGRAPHY	BIB-1

APPENDIX A: STATISTICAL TABLES	A-1

APPENDIX B: EXAMPLE WORKSHEETS	B-1

APPENDIX C: BLANK WORKSHEETS	C-1

APPENDIX D: MODELING THE DATA	D-1

APPENDIX E: CALCULATING RESIDUALS AND SERIAL
      CORRELATIONS USING SAS	E-1

APPENDIX F: DERIVATIONS AND EQUATIONS	F-1
      F.1   Derivation of Tables A.4 and A.5	F-1
      F.2   Derivation of Equation (F.6)	F-4
            F.2.1   Variance of zt	F-5
            F.2.2   Variance of z	F-6
      F.3   Derivation of the Sample Size Equation	F-8
      F.4   Effective Df for the Mean from an AR1 Process	F-10
      F.5   Sequential Tests for Assessing Attainment	F-11
      Assessing Attainment of Ground Water Cleanup Standards Using
            Modified  Sequential t-Tests	F-12
            1.      Introduction	F-12
            2.      Fixed Versus Sequential Tests	F-13
            3.      Power and Sample Sizes for the Sequential t-Test
                    with Normally Distributed Data	F-16

-------
             4.      Modifications to Simplify the Calculations and
                    Improve the Power	  F-18
             5.      Application  to Ground Water Data from  Superfund
                         	  F-19
             6.      Conclusions and Discussion	  F-22
             Bibliography	   F-23

APPENDIX G: GLOSSARY	G-1

-------

                              LIST OF FIGURES
                                                                     Page
Figure 1.1     Steps in Evaluating Whether a Ground Water Well Has
              Attained the Cleanup Standard	1-2
Figure 2.1     Example scenario for contaminant measurements in one well
              during successful remediation action	2-3
Figure 2.2     Measures of location: Mean, median, 25th percentile, 75th
              percentile, and 95th percentile for three hypothetical
              distributions	2-10
Figure 2.3     Illustration of the difference between a short- and long-term
              mean concentration	2-14
Figure 2.4     Hypothetical power curve	2-20
Figure 3.1     Steps in defining the attainment objectives	3-2
Figure 5.1     Example scenario for contaminant measurements during
              successful remedial action	5-2
Figure 5.2     Example of data from a monitoring well exhibiting a
              seasonal pattern	5-11
Figure 6.1     Example Scenario for Contaminant Measurements During
              Successful Remedial Action	6-1
Figure 6.2     Example of a Linear Relationship Between Chemical
              Concentration Measurements and Time	6-3
Figure 6.3     Plot of data from Table 6.1	6-10
Figure 6.4     Plot of data and predicted values from Table 6.1	6-10
Figure 6.5     Examples of Residual Plots (source: adapted from figures in
              Draper and Smith, 1966, page 89)	6-12
Figure 6.6     Plot of residuals from Table 6.1	6-13
Figure 6.7     Examples of R-Square for Selected Data Sets	6-15
Figure 6.8     Plot of Mercury Measurements as a Function of Time (See
              Box 6.16)	6-21
Figure 6.9     Comparison of Observed Mercury Measurements and
              Predicted Values under the Fitted Model (See Box 6.16)	6-22

-------
Figure 6.10   Plot of Residuals Against Time for Mercury Example  (see
              Box 6.17)	6-24

Figure 6.11   Plot of Mercury Concentrations Against x = 1/√t, and
              Alternative Fitted Model (see Box 6.17)	6-24

Figure 6.12   Plot of Residuals Based on Alternative  Model (see Box
              6.17)	6-25

Figure 6.13   Plot of Ordered Residuals Versus Expected Values for
              Alternative Model (see Box 6.17)	6-25

Figure 6.14   Examples  of Contaminant Concentrations that Could Be
              Observed  During  Cleanup	6-27

Figure 6.15   Steps for Implementing Regression Analysis at Superfund
              Sites	6-28

Figure 6.16   Example  of a  Nonlinear Relationship Between Chemical
              Concentration  Measurements and Time	6-30

Figure 6.17   Examples  of  Nonlinear  Relationships	6-30

Figure 6.18   Plot of Benzene Data and Fitted Model (see  Box 6.22)	6-37

Figure 7.1     Example  Scenario for Contaminant Measurements During
              Successful Remedial Action	7-1

Figure 7.2    Example of Time Chart for Use in Assessing Stability	7-6

Figure 7.3    Example  of Apparent Outliers	 7-10

Figure 7.4    Example  of a Six-point Upward Trend in the  Data	7-10

Figure 7.5    Example  of a Pattern in the Data that May Indicate an
              Upward  Trend	7-11

Figure 7.6    Example  of a Pattern in the Data that May Indicate a
              Downward Trend	7-11

Figure 7.7    Example of Changing Variability in the Data Over Time	7-12

Figure 7.8    Example  of a  Stable Situation with Constant Average and
              Variation	7-12

Figure 8.1     Example  Scenario for Contaminant Measurements During
              Successful Remedial Action	8-1

Figure 8.2    Steps in  the Cleanup Process  When Using a  Fixed Sample
              Size Test	8-3

-------
Figure 8.3     Plot of Arsenic Measurements for 16 Ground Water Samples
              (see Box 8.21)	8-24

Figure 9.1     Example Scenario for  Contaminant Measurements During
              Successful  Remedial Action	9-1

Figure 9.2     Steps in the Cleanup Process  When Using a Sequential
              Statistical  Test	9-3

Figure D.1     Theoretical Autocorrelation Function Assumed in the Model
              of the Ground  Water  Data	D-4

Figure D.2    Examples of Data with Serial Correlations of 0, 0.4, and
              0.8. The higher the serial correlation, the more the
              distribution dampens out	D-5

Figure F.1     Differences in Sample Size Using Equations Based on a
              Normal Distribution (Known Variance) or a t Statistic,
              Assuming α = .10 and β = .10	F-9

-------
                              LIST OF TABLES
Table 2.1     False  positive and negative decisions	2-18

Table 3.1     Points to consider when trying to choose among the mean,
             upper proportion/percentile, or median	3-6

Table 3.2     Recommended parameters to test when comparing the
             cleanup standard to the concentration of a chemical with
             chronic effects	3-7

Table 4.1     Locations in this document of discussions of sample designs
             and analysis for ground water sampling	4-6

Table 5.1     Summary of notation used in Chapters 5  through 9	5-4

Table 5.2     Alternative formulas for the standard error of the mean	5-20

Table 5.3     Values of M and N+l-M and confidence  coefficients for
             small  samples	5-32

Table 5.4     Example contamination data used in Box 5.19 to generate
             nonparametric confidence interval	5-33

Table 6.1     Hypothetical Data for the Regression Example in Figure 6.3	6-9

Table 6.2     Hypothetical concentration measurement for mercury (Hg) in
             ppm for 20 ground water samples taken at monthly intervals	6-21

Table 6.3     Benzene concentrations in 15 quarterly samples (see Box
             6.22)	6-37

Table 8.1     Arsenic measurements (ppb) for 16 ground water samples
             (see Box 8.21)	8-25

Table A.1     Tables of t for selected alpha and degrees of freedom	A-1

Table A.2    Tables of z for selected alpha	A-2

Table A.3    Tables of k for selected alpha, P0, and sample size for use in
             a tolerance interval test	A-3

Table A.4    Recommended number of samples per seasonal period (np)
             to minimize total cost for assessing attainment	A-6

Table A.5    Variance factors F for determining sample size	A-7

Table D.1     Decision criteria for determining whether the ground water
             concentrations attain the cleanup standard	D-7

-------
Table F.1     Coefficients for the terms at, at-1, etc., in the sum of three
             successive correlated observations	F-7

Table F.2     Differences between the calculated sample sizes using a t
             distribution and a normal distribution when the sample size
             based on the t distribution is 20, for selected values of α
             (Alpha) and β (Beta)	F-10

-------




                              LIST  OF BOXES


                                                                    Page

Box 2.1      Construction of Confidence Intervals Under Assumptions of
             Normality	2-16

Box 4.1      Example of Procedure for Specifying a Systematic Sample
             Design	4-4

Box 5.1      Calculating Sample Mean, Variance, and Standard Deviation	5-6

Box 5.2      Calculating the Standard Error Treating the Sample 	5-9

Box 5.3      Calculating the Standard Error Using Estimates Between
             Adjacent Observations	5-10

Box 5.4      Calculating Seasonal Averages and Sample Residuals	5-12

Box 5.5      Calculating the Standard Error After Removing Seasonal
             Averages	5-13

Box 5.6      Calculating the Standard Error After Removing Seasonal
             Averages	5-14

Box 5.7      Calculating the Serial Correlation from the Residuals After
             Removing Seasonal Averages	5-15

Box 5.8      Estimating the Serial Correlation Between Monthly
             Observations	5-16

Box 5.9      General Construction of Two-sided Confidence Intervals	5-18

Box 5.10     General Construction of One-sided Confidence Intervals	5-19

Box 5.11     Construction of Two-sided Confidence Intervals	5-19

Box 5.12     Comparing the Short Term Mean to the Cleanup Standard
             Using Confidence Intervals	5-21

Box 5.13     Example: Calculation of Confidence Intervals	5-22

Box 5.14     Calculation of the Durbin-Watson Statistic	5-22

Box 5.15     Large Sample Confidence Interval for the Serial Correlation	5-23

Box 5.16     Tolerance Intervals: Testing for the 95th Percentile with
             Lognormal Data	5-27

Box 5.17     Calculation of Confidence Intervals	5-30

-------
Box 5.18     Calculation of M	5-32
Box 5.19     Example of Constructing Nonparametric Confidence
             Intervals	5-34
Box 5.20     Estimating σ from Data Collected Prior to Remedial Action	5-36
Box 5.21     Example of Sample Size Calculations	5-38
Box 5.22     Calculating Sample Size for Tolerance Intervals	5-39
Box 5.23     Sample Size Determination for Estimating Proportions	5-40
Box 6.1      Simple Linear Regression Model	6-4
Box 6.2      Calculating Least Square Estimates	6-6
Box 6.3      Estimated Regression Line	6-6
Box 6.4      Calculation  of Residuals 	6-7
Box 6.5      Sum of Squares Due to Error and the Mean Square Error 	6-7
Box 6.6      Five Basic Quantities for Use in Simple Linear Regression
             Analysis	6-8
Box 6.7      Calculation of the Estimated Model Parameters and SSE	6-8
Box 6.8      Example of Basic Calculations for Linear Regression	6-9
Box 6.9      Coefficient of Determination	6-14
Box 6.10     Calculating the Standard Error of the Estimated Slope	6-16
Box 6.11     Calculating a Confidence Interval Around the Slope	6-16
Box 6.12     Using the Confidence Interval for the Slope to Identify a
             Significant Trend	6-18
Box 6.13     Calculating the Standard Error and Confidence Intervals for
             Predicted Values	6-18
Box 6.14     Using the Simple Regression Model to Predict Future
             Values	6-19
Box 6.15     Calculating the Standard Error and Confidence Interval for a
             Predicted Mean	6-20
Box 6.16     Example of Basic Regression Calculations	6-22
Box 6.17     Analysis of Residuals for Mercury Example	6-23

-------
Box 6.18      Suggested  Transformations	 6-31
Box 6.19      Transformation to "New" Model	 6-34
Box 6.20      "New" Fitted Model  for  Transformed Variables	6-34
Box 6.21      Slope and Intercept of Fitted Regression Line in Terms of
              Original Variables	 6-35
Box 6.22      Correcting  for  Serial  Correlation	 6-36
Box 6.23      Constructing  Confidence Limits  around an Expected
              Transformed Value	 6-38
Box 7.1       Adjusting for Seasonal Effects	 7-5
Box 8.1       Steps for Determining Sample Size for Testing the Mean	8-6
Box 8.2       Example of Sample Size Calculations for Testing the Mean	8-9
Box 8.3       Determining Sample Size for Testing Proportions	8-10
Box 8.4       Choosing a Sampling Interval Using the Darcy Equation	8-11
Box 8.5       Steps  for Assessing Attainment  Using Yearly  Averages	8-13
Box 8.6       Calculation of the Yearly  Averages	 8-14
Box 8.7       Calculation of the Mean and Variance of the Yearly
              Averages	 8-14
Box 8.8       Calculation of Seasonal Averages and the Mean of the
              Seasonal  Averages	 8-15
Box 8.9       Calculation of Upper  One-sided Confidence Limit for the
              Mean	 8-16
Box 8.10      Deciding if the Tested Ground Water Attains the  Cleanup
              Standard	8-16
Box 8.11      Example of Assessing Attainment of the Mean  Using Yearly
              Averages	 8-17
Box 8.12      Steps for Assessing Attainment Using the Log Transformed
              Yearly  Averages	8-18
Box 8.13      Calculation of the Natural  Logs  of the Yearly  Averages	8-18
Box 8.14      Calculation of the Mean and Variance of the Natural Logs of
              the Yearly Averages	 8-19

-------
Box 8.15     Calculation of the Upper Confidence Limit for the Mean
             Based on Log Transformed Yearly Averages	8-20
Box 8.16     Steps for Assessing Attainment Using the Mean After
             Adjusting for Seasonal Variation	8-21
Box 8.17     Calculation of the Residuals	8-22
Box 8.18     Calculation of the Variance  of the Residuals	8-22
Box 8.19     Calculating the Serial Correlation from the Residuals After
             Removing Seasonal Averages	8-22
Box 8.20     Calculation of the Upper Confidence Limit for the Mean
             After Adjusting for Seasonal Variation	8-23
Box 8.21     Example Calculation of Confidence Intervals	8-24
Box 9.1      Steps for Determining Sample Frequency for Testing the
             Mean	9-5
Box 9.2      Steps for Determining Sample Frequency for Testing a
             Proportion	9-6
Box 9.3      Example of Sample Frequency Calculations	9-6
Box 9.4      Steps for Assessing Attainment Using Yearly Averages	9-9
Box 9.5      Calculation of the  Yearly Averages	 9-10
Box 9.6      Calculation of the Mean and Variance of the Yearly
             Averages	9-10
Box 9.7      Calculation of Seasonal Averages and the Mean of the
             Seasonal Averages	9-11
Box 9.8      Calculation of t and δ When Using the Untransformed
             Yearly  Averages	9-12
Box 9.9      Calculation of the Likelihood Ratio for the Sequential Test	9-12
Box 9.10     Deciding if the Tested Ground Water Attains the Cleanup
             Standard	9-13
Box 9.11     Example Attainment Decision Based on a Sequential Test	9-14
Box 9.12     Steps for Assessing Attainment Using the Log Transformed
             Yearly  Averages	9-15
Box 9.13     Calculation of the Natural Logs of the Yearly Averages	9-16

-------
Box 9.14     Calculation of the Mean and Variance of the Natural Logs of
             the Yearly Averages	9-16
Box 9.15     Calculation of t and δ When Using the Log Transformed
             Yearly  Averages	9-17
Box 9.16     Steps for Assessing Attainment Using the Mean After
             Adjusted for Seasonal Variation	9-19
Box 9.17     Calculation of the Residuals	9-19
Box 9.18     Calculation of the Variance  of the Residuals	9-20
Box 9.19     Calculating the Serial Correlation from the Residuals After
             Removing Seasonal Averages	9-20
Box 9.20     Calculation of t and δ When Using the Mean Corrected for
             Seasonal Variation	9-21
Box 9.21     Calculation of the Likelihood Ratio for the Sequential Test
             When Adjusting for Serial Correlation	9-21
Box 9.22     Example Calculation of Sequential Test Statistics after
             Adjustments for Seasonal Effects and Serial Correlation	9-22
Box D.1      Modeling the Data	D-2
Box D.2      Autocorrelation  Function	D-3
Box D.3      Revised Model for Ground Water Data	D-7

-------
AUTHORS AND CONTRIBUTORS
This manual represents the combined effort of several organizations and many
individuals. The names of the primary contributors, along with the role of each
organization, are summarized below.
Westat, Inc., 1650 Research Boulevard, Rockville, MD 20850, Contract No. 68-01-7359,
Task 11 -- research, statistical procedures, draft and final draft report. Key Westat
staff included:
John Rogers Adam Chu
Ralph DiGaetano Ed Bryant
Contract Coordinator: Robert Clickner
Dynamac Corporation, 11140 Rockville Pike, Rockville, MD 20852 (subcontractor to
Westat) -- consultation on the sampling of ground water, treatment alternatives, and
chemical analysis. Key Dynamac staff included:
David Lipsky Richard Dorrler
Wayne Tusa
EPA, OPPE, Statistical Policy Branch - Project management, technical input, peer
review. Key EPA staff included:
Barnes Johnson Herbert Lacayo
SRA Technologies, Inc., 4700 King Street, Suite 300, Alexandria, VA 22302,
Contract No. 68-01-7379, Delivery Order 16 -- editorial and graphics support, technical
review, and preliminary draft preparation. Key SRA staff included:
Marcia Gardner Karl Held
Lori Hidinger Alex Polymenopoulos
Mark Ernstmann Jocelyn Smith
-------
                           EXECUTIVE SUMMARY

          This document provides regional project managers, on-site coordinators, and
their contractors with sampling and analysis methods for evaluating whether ground water
remediation has met pre-established cleanup standards for one or more chemical
contaminants at a hazardous waste site. The verification of cleanup by evaluating a site
relative to a cleanup standard or an applicable or relevant and appropriate requirement
(ARAR) is mandated in Section 121 of the Superfund Amendments and Reauthorization
Act (SARA). This document, the second in a series, provides sampling and data analysis
methods for the purpose of verifying attainment of a cleanup standard in ground water.
The first volume addresses evaluating attainment in soils and solid media.

          This document presents statistical methods which can be used to address the
uncertainty of whether a site has met a cleanup standard. Superfund managers face the
uncertainty of having to make a decision about the entire site based only on samples of the
ground water at the site, often collected for only a limited time period.

          The methods in this document approach cleanup standards as having three
components that influence the overall stringency of the standard: first, the magnitude,
level, or concentration deemed to be protective of public health and the  environment;
second, the sampling performed to evaluate whether a site is above or below the standard;
and third, the method of comparing sample data to the standard to  decide whether the
remedial action was successful. All three of these components are important. Failure to
address any one of these components can result in insufficient levels of cleanup. Managers
must look beyond the cleanup level and explore the sampling and analysis methods which
will allow confident assessment of the site relative to the cleanup standard.

          A site manager is likely to confront two major questions in evaluating the
attainment of the cleanup standard: (1) is the site really  contaminated because a few
samples are above the cleanup standard?   and (2) is the site really "clean" because the
sampling shows the majority of samples to be below the cleanup standard?  The statistical
methods demonstrated  in this guidance  document allow for decision making under
uncertainty and permit valid extrapolation of information that can be defended and used
with confidence to determine whether the site meets  the  cleanup standard.

The presentation of concepts and solutions to potential problems in assessing
ground water attainment begins with an introduction to the statistical reasoning required to
implement these methods. Next, the planning activities, requiring input from both
statisticians and nonstatisticians, are described. Finally, a series of methodological
chapters are presented to address statistical procedures applicable to successive stages in the
remediation effort. Each chapter will now be considered in detail.

Chapter 1 provides a brief introduction to the document, including its
organization, intended use, and applications for a variety of treatment technologies. A
model for the sequence of ground water remediation activities at the site is described.
Many areas of expertise must be involved in any remedial action process. This document
attempts to address only statistical procedures relevant to evaluating the attainment of
cleanup goals.

The cleanup activities at the site will include site investigation, ground water
remediation, a post-treatment period allowing the ground water to reach steady state,
sampling and analysis to assess attainment, and possible post-cleanup monitoring.
Different statistical procedures are applicable at different stages in the cleanup process. The
statistical procedures used must account for the changes in the ground water system over
time due to natural or man-induced causes. As a result, the discussion makes a distinction
between short-term estimates which might be used during remediation and long-term
estimates which are used to assess attainment. Also, a slack period of time after treatment
and before assessing attainment is strongly recommended to allow any transient effects of
treatment to dissipate.

Chapter 2 addresses statistical concepts as they might relate to the evaluation of
attainment. The chapter discusses the form of the null and alternative hypotheses, types of
errors, statistical power curves, the handling of outliers and values below detection limits,
short- versus long-term tests, and assessing wells individually or as a group. Due to the
cost of developing new wells, the assessment decision is assumed to be based on
established wells. As a result, the statistical conclusions strictly apply only to the water in
the sampling wells rather than the ground water in general. The expertise of a
hydrogeologist can be useful for making conclusions about the ground water at the site
based on the statistical results from the sampled wells.

The procedures in this document favor protection of the environment and
human health. If uncertainty is large or the sampling inadequate, these methods conclude
that the sample area does not attain the cleanup standard. Therefore, the null hypothesis, in
statistical terminology, is that the site does not attain the cleanup standard until sufficient
data are acquired to prove otherwise.
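
As a rough illustration of this decision rule, the sketch below declares a well clean only when a one-sided upper confidence limit on the mean concentration falls below the cleanup standard. It is a minimal sketch and not the procedure of Chapter 8: the function name, the sample values, the 1.0 ppm standard, and the use of a normal quantile in place of a Student's t quantile are all assumptions made to keep the example self-contained.

```python
from statistics import NormalDist, mean, stdev

def attains_standard(concentrations, cleanup_standard, alpha=0.05):
    """Return True only if the data reject the null hypothesis 'contaminated'.

    Hypothetical helper: uses a normal approximation for the one-sided
    upper confidence limit on the mean (an assumption; small samples
    would call for a Student's t quantile instead).
    """
    n = len(concentrations)
    if n < 2:
        return False  # too little data: the null hypothesis stands
    z = NormalDist().inv_cdf(1 - alpha)
    upper_limit = mean(concentrations) + z * stdev(concentrations) / n ** 0.5
    return upper_limit < cleanup_standard

# Hypothetical measurements (ppm) against a hypothetical standard of 1.0 ppm.
# Both data sets have nearly the same mean; only their variability differs.
precise = [0.70, 0.75, 0.72, 0.68, 0.74, 0.71, 0.73, 0.69]
noisy = [0.1, 1.6, 0.2, 1.5, 0.3, 1.4, 0.1, 0.5]
print(attains_standard(precise, 1.0))
print(attains_standard(noisy, 1.0))
```

In this sketch the two data sets have similar means, but the noisy one fails because its uncertainty is too large to rule out contamination, which is the behavior described above.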

Procedures used to combine data from separate wells or contaminants to
determine whether the site as a whole attains all relevant cleanup standards are discussed.
How the data from separate wells are combined affects the interpretation of the results and
the probability of concluding that the overall site attains the cleanup standard. Testing the
samples from individual wells or groups of wells is also discussed.
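
To see why the method of combining wells matters, consider a simple worked calculation. Suppose, purely as an assumption for illustration, that each well is tested separately, that the tests are statistically independent, and that the site is declared clean only if every well passes. The probability of declaring the site clean is then the per-well probability raised to the number of wells. Neighboring wells are often correlated, so independence is a simplification.

```python
def prob_all_wells_pass(per_well_prob, n_wells):
    """Probability that every well passes, assuming independent tests."""
    return per_well_prob ** n_wells

# Hypothetical per-well probability of passing: 0.95.
for n_wells in (1, 5, 10):
    print(n_wells, round(prob_all_wells_pass(0.95, n_wells), 3))
```

Under these assumptions, a per-well probability of 0.95 drops to roughly 0.77 for five wells and roughly 0.60 for ten wells.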

Chapter 3 considers the steps involved in specifying the attainment objectives.
Attainment objectives must be specified before the evaluation of whether a site has attained
the cleanup standard can be made. Attainment objectives are not specified by statisticians
but rather must be provided by a combination of risk assessors, engineers, project
managers, and hydrogeologists. Specifying attainment objectives includes specifying the
chemicals of concern, the cleanup standards, the wells to be sampled, the statistical criteria
for defining attainment, the parameters to be tested, and the precision and confidence level
desired.

Chapter 4 discusses the specification of the sampling and analysis plans. The
sampling and analysis plans are prerequisites for the statistical methods presented in the
following chapters. A discussion of common sampling plan designs and approaches to
analysis is presented. The sample designs discussed include simple random sampling,
systematic sampling, and sequential sampling. The analysis plan is developed in
conjunction with the sample design.
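
The difference between simple random and systematic sampling can be sketched in a few lines. The example below is hypothetical: it picks sampling days within a 360-day period, and the function names, the period length, and the number of samples are assumptions chosen for illustration rather than recommendations from this document.

```python
import random

def simple_random_days(n_days, n_samples, rng):
    """Pick n_samples distinct sampling days uniformly at random."""
    return sorted(rng.sample(range(n_days), n_samples))

def systematic_days(n_days, n_samples, rng):
    """Pick evenly spaced sampling days after a random start."""
    interval = n_days // n_samples
    start = rng.randrange(interval)
    return [start + i * interval for i in range(n_samples)]

rng = random.Random(1992)  # fixed seed so the sketch is repeatable
print(simple_random_days(360, 12, rng))
print(systematic_days(360, 12, rng))
```

The systematic design spaces the samples evenly over the period, while the simple random design may leave gaps or clusters.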

Chapter 5 provides methods which are appropriate for describing ground water
conditions during a specified period of time. These methods are useful for making a quick
evaluation of the ground water conditions, such as during remediation. Because the short-
term confidence intervals reflect only variation within the sampling period and not long-
term trends or shifts between periods, these methods are not appropriate for assessing
attainment of the cleanup standards after the planned remediation has been completed.
However, these descriptive procedures can be used to estimate means, percentiles,
confidence intervals, tolerance intervals and variability. Equations are also provided to
determine the sample size required for each statistical test and to adjust for seasonal
variation and serial correlation.
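
The sketch below illustrates one simple way such an adjustment might look: seasonal means are subtracted from the observations, and the lag-1 serial correlation of the resulting residuals is estimated. The function names, the quarterly data, and the particular estimator are assumptions for illustration; the equations in Chapter 5 should be used for actual analyses.

```python
from statistics import mean

def seasonal_residuals(values, period):
    """Subtract each season's own mean from the observations in that season."""
    season_means = [mean(values[s::period]) for s in range(period)]
    return [v - season_means[i % period] for i, v in enumerate(values)]

def lag1_serial_correlation(residuals):
    """One common estimator: sum of adjacent products over sum of squares."""
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        return 0.0
    numerator = sum(a * b for a, b in zip(residuals, residuals[1:]))
    return numerator / denominator

# Hypothetical quarterly concentrations (ppm) for three years.
quarterly = [1.2, 0.8, 0.6, 1.0, 1.1, 0.7, 0.5, 0.9, 1.0, 0.6, 0.4, 0.8]
residuals = seasonal_residuals(quarterly, period=4)
print(round(lag1_serial_correlation(residuals), 2))
```

A value near zero suggests that successive residuals are roughly independent; a large positive value suggests that each sample carries less new information than an independent sample would.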

Chapter 6 addresses statistical procedures which are useful during remediation,
particularly in deciding when to terminate treatment. Due to the complex dynamics of the
ground water flow in response to pumping, other remediation activity, and natural forces,
the decision to terminate treatment cannot easily be based on statistical procedures.
Deciding when to terminate treatment should be based on a combination of statistical
results, expert knowledge, and policy decisions. This chapter provides some basic
statistical procedures which can be used to help guide the termination decision, including
the use of regression methods for helping to decide when to stop treatment. In particular,
procedures are given for estimating the trend in contamination levels and predicting
contamination levels at future points in time. General methods for fitting simple linear
models and assessing the adequacy of the model are also discussed.
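
As an illustration of the kind of calculation involved, the following sketch fits a straight line to concentration measurements by ordinary least squares and extrapolates it to a future date. The data, the function names, and the choice of a straight-line model are assumptions; whether a linear model is adequate for a given well must be checked before any prediction is trusted.

```python
def fit_line(times, concentrations):
    """Ordinary least squares fit of concentration = intercept + slope * time."""
    n = len(times)
    t_bar = sum(times) / n
    c_bar = sum(concentrations) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (c - c_bar) for t, c in zip(times, concentrations))
    slope = sxy / sxx
    intercept = c_bar - slope * t_bar
    return intercept, slope

def predict(intercept, slope, time):
    """Predicted concentration at a given time from the fitted line."""
    return intercept + slope * time

# Hypothetical measurements: months since treatment began, concentration in ppm.
months = [0, 3, 6, 9, 12, 15]
ppm = [9.8, 8.9, 8.1, 7.0, 6.2, 5.1]
intercept, slope = fit_line(months, ppm)
print(f"slope per month: {slope:.3f}")
print(f"predicted at month 24: {predict(intercept, slope, 24):.2f}")
```

A negative slope indicates a decreasing trend. Extrapolation assumes the trend continues unchanged, which is one reason the termination decision should not rest on statistics alone.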

Chapter 7 discusses general statistical methods for evaluating whether the
ground water system has reached steady state and therefore whether sampling to assess
attainment can begin. As a result of the treatment used at the site, the ground water system
will be disturbed from its natural level of steady state. To reliably evaluate whether the
ground water can be expected to attain the cleanup standard after remediation, samples must
be collected under conditions similar to those which will exist in the future. Thus, the
sampling for assessing attainment can only occur when the residual effects of treatment on
the ground water are small compared to those of natural forces.

Finding that the ground water has returned to a steady state after terminating
remediation efforts is an essential step in establishing a meaningful test of whether or not
the cleanup standards have been attained. There are uncertainties in the process, and to
some extent it is judgmental. However, if an adequate amount of data is carefully gathered
prior to beginning remediation and after ceasing remediation, reasonable decisions can be
made as to whether or not the ground water can be considered to have reached a state of
stability. The decision on whether the ground water has reached steady state will be based
on a combination of statistical calculations, plots of data, ground water modeling using
predictive models, and expert advice from hydrogeologists familiar with the site.

          Chapters 8 and 9 present the statistical procedures which can be used to evaluate
whether the contaminant concentrations in the sampling wells attain the cleanup standards
after the ground water has reached steady state. The suggested methods use either a fixed
sample size test (Chapter 8) or a sequential statistical test (Chapter 9). The testing
procedures can be applied to either samples from individual wells or wells tested as a
group. Chapter 8 presents fixed sample size tests for assessing attainment of the mean:
using yearly averages or after adjusting for seasonal variation; using a nonparametric test
for proportions; and using a nonparametric confidence interval about the median.  Chapter
9 discusses sequential statistical tests for assessing attainment of the mean using yearly
averages,  assessing attainment of the mean after adjusting  for seasonal variation,  and
assessing attainment using a nonparametric test for proportions. In both fixed sample  size
tests and sequential tests, the ground water at the site is judged to attain the cleanup
standards if the contaminant levels are below the standard and are not increasing over time.
If the ground water at the site attains the cleanup standards, follow-up monitoring is
recommended  to ensure that the steady state  assumption holds.
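
To make the idea of a test for proportions concrete, the sketch below uses an exact binomial calculation: the well is declared clean only if the observed number of samples exceeding the standard would be unlikely were the true exceedance proportion as high as a specified limit. The function names, the limit of 0.10, and the sample counts are assumptions for illustration, and the procedures in Chapters 8 and 9 may take a different form.

```python
from math import comb

def prob_at_most(k, n, p):
    """P(X <= k) for X distributed Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def attains_by_proportion(n_exceed, n_samples, p0, alpha=0.05):
    """Reject 'contaminated' (true exceedance proportion >= p0) only when
    so few exceedances would be unlikely under that hypothesis."""
    return prob_at_most(n_exceed, n_samples, p0) <= alpha

# Hypothetical cases with an allowed exceedance proportion of 0.10.
print(attains_by_proportion(0, 40, 0.10))  # no exceedances in 40 samples
print(attains_by_proportion(2, 40, 0.10))  # two exceedances in 40 samples
print(attains_by_proportion(0, 10, 0.10))  # no exceedances, but only 10 samples
```

The third case shows that a clean record alone is not sufficient: with too few samples, the data cannot rule out the null hypothesis of contamination.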

          Although the  primary focus of the document  is the procedures presented in
Chapters 8 and 9 for evaluating attainment, careful consideration of when to terminate
treatment and how long to wait for steady state are important in the overall planning. If the
treatment is terminated prematurely, excessive time may be spent in evaluating attainment
only to have to restart treatment to complete the remediation, followed by a second period
of attainment sampling and  decision.  If the ground water is not at steady state, the
possibility  of incorrectly determining the attainment  status of the  site increases.

          As an aid to the reader, a glossary of commonly-used terms is provided in
Appendix G; calculations and examples are presented in  boxes within the text;  and
worksheets with examples  are  provided in Appendix B.

1. INTRODUCTION
Congress revised the Superfund legislation in the Superfund Amendments
and Reauthorization Act of 1986 (SARA). Among other provisions of SARA, section 121
on Cleanup Standards discusses criteria for selecting applicable or relevant and appropriate
requirements (ARARs) for cleanup and includes specific language that requires
EPA-mandated remedial action to attain the ARARs.

Neither SARA nor EPA regulations or guidances specify how to determine
whether the cleanup standards have been attained. This document offers procedures that
can be used to determine whether a site has attained the appropriate cleanup standard after a
remedial action.

1.1 General Scope and Features of the Guidance Document

1.1.1 Purpose

This document provides a foundation for decision-making regarding site
cleanup by providing methods that statistically compare risk standards with field data in a
scientifically defensible manner that allows for uncertainty. Statistical procedures can be
used for many different purposes in the process of a Superfund site cleanup. The purpose
of this document is to provide statistical procedures which can be used to determine if
contaminant concentrations measured in selected ground-water wells attain (i.e., are less
than) the cleanup standard. This evaluation requires specification of sampling protocols
and statistical analysis methods. Figure 1.1 shows the steps involved in the evaluation
process to determine whether the cleanup standard has been attained in a selected ground
water well.

Figure 1.1    Steps in Evaluating Whether a Ground Water Well Has Attained the
              Cleanup Standard

[Flowchart not reproduced. Steps: Start; Define Attainment Objectives (Chapter 3);
Specify Sampling and Analysis Plan (Chapters 4 and 5); Decide to Terminate Treatment
(Chapter 6); Determine Steady State (Chapter 7); Assess the Attainment of the Cleanup
Standard (Chapters 8 and 9); Declare that the Well Attains the Cleanup Standard and
Continue to Monitor as Necessary. Decision points: Is the Cleanup Standard Attained?
Do Concentrations Increase over Time?]

Consider the situation where several samples were taken and the results
indicated that one or two of the samples exceed the cleanup standard. How should this
information be used to decide whether the standard has been attained? The mean of the
samples might be compared with the standard. The magnitude of the measurements that are
larger than the standard might be taken into consideration in making a decision. The loca-
tion where large measurements occur might provide some insight.

When specifying how attainment is to be defined and deciding how statisti-
cal procedures can be used, the following factors are all important:

      The location of the sampling wells and the associated relationship
      between concentrations in neighboring wells;
      The number of samples to be taken;
      The sampling procedures for selecting and obtaining water samples; and
      The data analysis procedures used to test for attainment.

Appendix D lists relevant EPA guidance documents on sampling and
evaluating ground water. These documents address both the statistical and technical
components of a sampling and analysis program. This document is intended to extend the
methodologies they provide by addressing statistical issues in the evaluation of the
remediation process. This document does not attempt to suggest which standards apply or
when they apply (i.e., the "How clean is clean?" issue). Other Superfund guidance
documents perform that function.

1.1.2 Intended Audience and Use

This document is intended primarily for Agency personnel (primarily on-site
coordinators and regional project managers), responsible parties, and their contractors who
are involved with monitoring the progress of ground-water remediation at Superfund sites.
Although selected introductory statistical concepts are reviewed, this document is directed
toward readers who have had some prior training or experience applying quantitative
methods.

It must be emphasized that this document is intended to provide general
direction and assistance to individuals involved in the evaluation of the attainment of
cleanup standards. It is not a regulation nor is it formal guidance from the Superfund
Office. This manual should not be viewed as a "cookbook" or a replacement for good
engineering or statistical judgment.

1.1.3 Bibliography, Glossary, Boxes, Worksheets, Examples, and
References to "Consult a Statistician"

This document includes a bibliography which provides a point of departure
for the more sophisticated or interested user. There are references to primary textbooks,
pertinent journal articles, and related guidances.

The glossary (Appendix F) is included to provide short, practical definitions
of terminology used in this guidance. Words and phrases appearing in bold within the text
are listed in the glossary. The glossary does not use theoretical explanations or formulas
and, therefore, may not be as precise as the text or alternative sources of information.

Boxes are used throughout the document to separate and highlight equations
and example applications of the methods presented. For a quick reference, a listing of all
boxes and their page numbers is provided in the index.

A series of worksheets is included (Appendices B and C) to help order and
structure the calculations. References to the pertinent sections of the document are located
at the top of each worksheet. Example data and calculations are presented in the boxes and
the worksheets in Appendix B. The data and sites are hypothetical, but elements of the
examples correspond closely to several existing sites.

Finally, the document often directs the reader to "consult a statistician"
when more difficult and complicated situations are encountered. A directory of Agency
statisticians is available from the Environmental Statistics and Information Division (PM-
222) at EPA Headquarters (FTS 260-2680, 202-260-2680).

1.2 Use of this Guidance in Ground-Water Remediation Activities

Standards that apply to Superfund activities normally fall into the category
of risk-based standards which are developed using risk assessment methodologies.
Chemical-specific ARARs adopted from other programs often include at least a generalized
component of risk. However, risk standards may be specific to a site, developed using a
local endangerment evaluation.

Risk-based standards are expressed as a concentration value and, as applied
in the Superfund program, are not associated with a standard method of interpretation.
Although statistical methods are used to develop elements of risk-based standards, the
estimated uncertainties are not carried through the analysis or used to qualify the standards
for use in a field sampling program. Even though risk standards are not accompanied by
measures of uncertainty, decisions based on field data collected for the purpose of repre-
senting the entire site and validating cleanup will be subject to uncertainty. This document
allows decision-making regarding site cleanup by providing methods that statistically
compare risk standards with field data in a scientifically defensible manner that allows for
uncertainty.

Superfund activities where risk-based standards might apply are highly
varied. The following discussion provides suggestions for the use of procedures described
in this document when implementing or evaluating Superfund activities.

1.2.1 Pump-and-Treat Technology

Ground water is often treated by pumping contaminated ground water out of
the ground, treating the water, and discharging the water into local surface waters or
municipal treatment plants. The contaminated ground water is gradually replaced by
uncontaminated water from the surrounding aquifer or from surface recharge. Pump-and-treat
systems may use a few or many wells. The progress of the remediation depends on
where the wells are placed and the schedule for pumping. Pumping is often planned to
extend over many years.

              Statistical methods presented in this manual can be used for monitoring the
contaminants in both the effluent from the treatment system and the ground water in order
to monitor the progress of the remediation.

              Project managers must decide when to terminate treatment based on avail-
able data, advice from hydrogeologists, and the results of ground-water monitoring and
modeling. This manual provides guidance on statistical procedures to help decide when to
terminate  treatment.

              The remediation may temporarily alter ground water levels and flows,
which in turn will affect the contaminant concentration levels.  After termination of treat-
ment and after the transient effects of the remediation have dissipated, the statistical proce-
dures presented in this manual can be used to assess if the ground-water contaminant
concentrations remain  at levels  which will  attain and continue to  attain the cleanup standard.


1.2.2         Barrier Methods to Protect Ground Water

              If the contamination is relatively immobile and cannot effectively be
removed from the ground water using extraction, it is sometimes handled by containment.
In such cases, establishing barriers at the surface or around the contamination source may
reduce contaminant input to the aquifer, resulting in the reduction of ground-water concen-
trations to a level which attains the cleanup standard. The barriers include soil caps to
prevent surface infiltration, and slurry walls and other structures to force ground water to
flow away from contamination sources.

              The procedures in this manual can be used to establish whether the contam-
ination levels attain the relevant standards after the ground water has established its new
levels as a result of changes in ground-water flows.


1.2.3         Biological Treatment

              In many situations natural bacteria will adapt to the contamination in the soil
and ground water and consume the contaminants, releasing metabolic products. These
bacteria will be most effective in consuming the contaminant if the underground environment
can be controlled, including controlling the dissolved oxygen and nutrient levels.
Biological treatment of ground water usually involves pumping ground water from downgradient
locations and injecting enriched ground water at upgradient locations. The
changes in the water table levels produce an underground flow carrying the nutrients to and
throughout the contaminated soil and aquifer. Progress of the treatment can be monitored
by sampling the water being pumped from the ground and measuring contaminant and
nutrient concentrations. Biological treatment can also be accomplished above ground using
a bioreactor as a component of a pump-and-treat system.

Monitoring wells are placed in various patterns throughout, and possibly
beyond, the area of contamination. These wells can be used to sample ground water both
during treatment to monitor progress and after treatment to assess remediation success
using the statistical methods discussed in this document.

1.3 Organization of this Document

The topics covered in each chapter of this document are outlined below.

Chapter 2. Introduction to Statistical Concepts and Decisions: introduces
terminology and concepts useful for understanding statistical tests
presented in later chapters.

Chapter 3. Specification of Attainment Objectives: discusses specification
of the attainment objectives in a way which allows selection of the
statistical procedures to be used.

Chapter 4. Design of the Sampling and Analysis Plan: discusses common
sampling plan designs and approaches to the analysis.

Chapter 5. Descriptive Statistics: provides basic statistical procedures
which are useful in all stages of the remedial effort. The procedures
form a basis for the statistical procedures used for assessing
attainment.

Chapter 6. Deciding to Terminate Treatment Using Regression Analysis:
discusses statistical procedures which can aid the decision-makers
who must decide when to terminate treatment.

Chapter 7. Approaching a Steady State After Terminating Remediation:
discusses statistical and nonstatistical criteria for determining
whether the ground water system is at steady state and/or if
additional remediation might be required.

Chapter 8. Assessing Attainment Using Fixed Sample Size Tests:
discusses statistical procedures based on fixed sample sizes for
deciding whether the concentrations in the ground water attain the
relevant cleanup standards.

Chapter 9. Assessing Attainment Using Sequential Tests: discusses
sequential statistical procedures for deciding whether the
concentrations in ground water attain the relevant cleanup standards.

Worksheets: provided both for practical use at Superfund sites and as
examples of the procedures which are being recommended.

1.4 Summary

This document provides a foundation for decision-making regarding site
cleanup by providing methods that statistically compare risk standards with field data in a
scientifically defensible manner that allows for uncertainty. In particular, the document
provides statistical procedures for assessing whether the Superfund Cleanup Standards for
ground water have been attained. The document is written primarily for agency personnel,
responsible parties and contractors. Many areas of expertise must be involved in any
remedial action process. This document attempts to address only the statistical input
required for the attainment decision.

The statistical procedures presented in this document provide methods for
comparing risk based standards with field data in a manner that allows for assessing uncer-
tainty. The procedures allow flexibility to accommodate site-specific environmental
factors.

To aid the reader, statistical calculations and examples are provided in boxes
separated from the text, and appendices contain a glossary of commonly-used terms;
statistical tables and detailed statistical information; and worksheets for implementing
procedures and calculations explained in the text.

2. INTRODUCTION TO STATISTICAL CONCEPTS AND DECISIONS
              This document provides statistical procedures to help answer an important
question that will arise at Superfund sites  undergoing ground water remediation:

           "Do the contaminants  in the ground water in designated
                wells at the site attain the cleanup standards?"

The cleanup standard is attained if, as a result of the remedial effort, the previously unac-
ceptably high contaminant  concentrations are reduced to a level which is acceptable and can
 be expected to remain acceptable when judged relative to the cleanup standard.

              In order to answer the question above, the following more specific ques-
tions must be answered:

                    What contaminant(s) must attain the designated cleanup standards?
                    How is attainment of the cleanup standards to be defined?
                    What is the  designated cleanup standard for the contaminant(s) being
                    assessed? and
                    Where and when should samples of the ground water be  collected?

              This chapter discusses  each of these topics  briefly, followed by an intro-
duction to statistical procedures for  assessing the  attainment of cleanup standards in ground
water at Superfund sites. Also discussed are terminology and statistical concepts which are
useful for understanding the statistical tests presented in later chapters. Basic statistical
principles and topics which have  particular applicability to ground water  at Superfund sites
are also considered.

             Later chapters discuss in detail the specification of attainment objectives and
the implementation of statistical procedures required to determine if those objectives have
been met at the Superfund site.

2.1          A Note  on Terminology

              This guidance document assumes that the reader is familiar with statistical
procedures and terminology, particularly the concepts of random sampling and hypothesis
testing, and the calculation of descriptive statistics such as means, standard deviations, and
proportions. An introduction to these statistical procedures can be found in statistical
textbooks such as Sokal and Rohlf (1981), and Neter, Wasserman, and Whitmore (1982).
The glossary provides a description of the terms and procedures used  in this document.

              In this document we will use the word clean as shorthand for "attains the
cleanup standard" and contaminated for "does not attain the cleanup standard."

              The term sample can be used in two different ways.  One refers to  a
physical water  sample collected  for laboratory analysis while the other refers to a collection
of data called a statistical sample. To avoid confusion, the physical water sample will be
called a physical sample or water sample. Otherwise, the word sample will refer to
a statistical sample, i.e., a collection of randomly selected physical samples obtained for
assessing  attainment of the cleanup  standard.


2.2          Background for the Attainment  Decision

              In general, over time, a Superfund site will go through the following
phases:

                    Contamination;
                    Realization that a  problem exists;
                    Investigation to determine the extent of the problem;
                    Selection of a  remediation plan to alleviate the problem;
                    Cleanup (which may occur in  several steps);
                    Termination  of cleanup;
                    Final  determination that the cleanup has achieved the  required  goals;
                    and
                    Termination  of the remediation effort.


              This document focuses on the post-cleanup phase and particularly on the
sampling and statistical procedures for determining if the site has attained the required
cleanup standards.


2.2.1        A Generic Model of Ground-Water Cleanup Progress

              During the planning and execution of remedial action and the sampling and
analysis for assessing attainment, numerous activities must take place as indicated in the
following scenario and illustrated in Figure 2.1. This figure will be used throughout the
document to indicate to the reader at which step in the remedial process the procedures
being discussed in a chapter are applicable. A discussion of each step follows Figure 2.1.
Figure 2.1    Example scenario for contaminant measurements in one well during
              successful remediation action

[Plot not reproduced. Vertical axis: Measured Ground Water Concentration (0.2 to 1.2);
horizontal axis: Date. Events marked along the date axis, in order: Start Treatment,
End Treatment, Start Sampling, End Sampling, Declare Clean or Contaminated.]
(1)     Evaluate the site;      Although evaluation of the site and selection of the cleanup
       determine the         technology may require the use of several statistical
       remedial action to be  procedures, this document does not address this aspect of
       used                the remedial effort

(2)    Perform remedial cleanup.
       During a successful remedial cleanup, the concentrations of contaminants can be
       expected to have a decreasing trend. Due to seasonal changes, natural fluctuations,
       changes in pumping schedules, lab measurement error, etc., the measured
       concentrations will fluctuate around the trend. Some statistical procedures that
       could be used to analyze data during treatment are discussed in Chapter 5.

(3)    Decide when to terminate remedial treatment.
       Based on both expert knowledge of the ground-water system and data collected during
       treatment, it must be decided when to terminate treatment and prepare for the
       sampling and analysis for assessing attainment. Statistical procedures relevant to
       the termination decision are discussed in Chapter 6. Analysis of data collected
       during treatment may indicate that the cleanup standards will not be achieved by the
       chosen cleanup methods, in which case the cleanup technology and goals must be
       reassessed.

(4)    Assess when the ground water concentrations reach steady state.
       The ground-water system will be disturbed from its natural level and flow by the
       treatment process, including perhaps pumping or reinjection of ground water. After
       treatment is terminated, the transient effects will dissipate and the ground-water
       levels and flows will gradually reach their natural levels. In this process, the
       contaminant concentrations may change in unpredictable ways. Before the assessment
       is initiated, the ground water must be able to return to its natural level and flow
       pattern, called steady state, so that the data collected are relevant to assess
       conditions in the future. Sampling and analysis during the return to natural
       conditions are discussed in Chapter 7. The ground water at a particular site will be
       considered to have achieved steady state if the assumption of steady state is
       consistent with both statistical tests and the advice of a hydrogeologist familiar
       with the site. The attainment sampling can begin once it is determined that the site
       is at steady state.

(5)    Sample to assess attainment.
       After the water levels and flows have reached steady state, sampling to assess
       attainment of the cleanup standards can begin. Statistical procedures for assessing
       attainment are presented in Chapters 8 and 9. The statistical tests used may be
       either fixed sample size tests or sequential tests. At many sites sequential tests
       will probably be preferred. During the assessment phase, measured concentrations are
       expected to fluctuate around either a constant or a gradually decreasing
       concentration. If the measurements consistently increase, then either the
       ground-water system is not at steady state or there is reason to believe that the
       sources of contamination have not been adequately cleaned up. In this situation, a
       reassessment of the data is required to determine if more time must pass until the
       site is at steady state or if additional remedial activity is required.

(6)    Based on statistical tests, determine if the cleanup standard has been attained or not.
       If the cleanup standard has been attained, implementation of periodic sampling to
       monitor for unanticipated problems is recommended. The attainment decision is based
       on several assumptions. From a statistical perspective, the purpose of periodic
       monitoring after attainment is to check the validity of the assumptions. If the
       attainment objectives have not been met, the cleanup technology and goals must be
       reassessed.

              Different statistical procedures are needed at different steps in this process.
The statistical procedures which are helpful in determining whether to terminate treatment
are different from those used in the attainment decision. In all aspects of the site
investigation and remediation, statistical procedures may be required that are not addressed
in this document. In this case, consultation with a statistician familiar with ground-water
data is recommended.


              This  document takes the  approach that:


                     A decision that the ground water in the wells attains the cleanup
                     standard requires the assumption that the ground water can be
                     expected to continue to attain the cleanup standards beyond the
                     termination of sampling,  and

                     Data collected while the ground-water system is disturbed by treat-
                     ment cannot reliably predict  concentrations after steady state has
                     been achieved. Therefore, it is recommended that the ground-water
                     system return to steady state before the sampling for assessing
                     attainment commences.  The data gathered prior to reaching steady
                     state  can be used for guidance  in selecting the statistical procedure  to
                     employ for assessing attainment.
2.2.2        The Contaminants to be Tested


              In general, multiple contaminants will be identified  at the site prior to reme-
dial action.  The mixture of contaminants which are present at any one time or place will

depend on many  factors.


              The discussion in this document assumes that relevant regulatory agencies
have specified the contaminants which are to be used to assess attainment. Conclusions
based on the statistical procedures introduced in this document apply only to the compounds
actually sampled and the corresponding data analyzed in the statistical tests.
2.2.3 The Ground-Water System to be Tested

Contamination in ground water is measured from water samples collected
from wells at specified locations and times. The location of the wells, the times and
frequency of the sampling, and the assumptions behind the analyses will affect the interpre-
tation of the statistical results.

This document assumes that the attainment decision will be based on
samples from established wells. This document does not make recommendations on where
to locate wells for sampling. However, decisions must be made on which wells are to be
used for assessing attainment. Because wells are not randomly located throughout an
aquifer, the statistical conclusions strictly apply only to the water obtained from the selected
wells and not to the aquifer in general. Conclusions about the aquifer must be based on a
combination of statistical results for the sampled wells and expert knowledge or beliefs
about the ground-water system and not on statistical inference.

Because of the high cost of installing a new well and the possibility of using
information from previous investigation stages, this document assumes that the location of
wells has been specified by experts in ground-water hydrology and approved by regulatory
agencies who are familiar with the contamination data at the site.

Interpretation of the results of the statistical analysis will depend on a
judgment as to whether the wells are in the correct place. If it is necessary to test the
assumptions used to select wells, additional wells will have to be established and sampled.
In this case, consultation with a statistician is recommended.
2.2.4 The Cleanup Standard

The cleanup standard is the criterion set by EPA against which the measured
concentrations are compared to determine if the ground water at the Superfund site is
acceptable or not. If the ground water meets the cleanup standard, then the remediation
efforts are judged to be complete. The specification of the cleanup standard by EPA or
another regulatory agency may be different for different sites and for different chemicals or
mixtures of chemicals. With a mixture of contaminants, the cleanup standard may apply to
an aggregate measure, or, in complex mixtures, the ground water may be required to meet
the cleanup standard for every contaminant present. For more information, see Guidance
on Remedial Actions for  Contaminated Ground  Water at Superfund Sites (EPA,  1988).


2.2.5        The Definition  of Attainment

              In order to determine if the contaminant concentrations at the site attain the
cleanup standard one must carefully define what concentration is to be compared to the
cleanup standard and what criteria are to be used to make the comparison for assessing
attainment. This document assumes that either the average concentration or a selected
percentile of the concentrations is to be compared to the cleanup standard.  The examples in
the text usually use the  average concentration. The ground water in a well attains the
cleanup standard if,  based on statistical tests, it is unlikely that the  average concentration (or
the percentile) is greater than the cleanup standard.

              The statistical procedures for assessing the attainment of the cleanup standard
use a basic statistical technique called hypothesis testing. To show that the ground
water in the selected wells is actually below the cleanup standard (i.e., attains the cleanup
standard), we assume that the water in the wells does not attain the cleanup standard. This
assumption is called the null hypothesis. Then data are collected. If the data are sufficiently
inconsistent with the null hypothesis, the null hypothesis is rejected and we conclude
that the water in the wells attains the cleanup standard.

              The  steps  involved in  hypothesis testing are:

               (1)    Establish the null hypothesis, "The contaminant concentrations in
                      the selected wells do not attain the applicable cleanup standard";
               (2)    Collect data; and
               (3)    Based on the data, decide if the ground water attains the cleanup
                      standard:
                      (a)    If the data are inconsistent with the null hypothesis, conclude
                             that there is sufficient evidence to reject the null hypothesis.
                             Accept the alternate hypothesis that the contaminant
                             concentrations attain the applicable cleanup standard, i.e.,
                             conclude that the ground water is clean.
                      (b)    Otherwise, conclude that there is insufficient evidence to
                             reject the null hypothesis and that the contaminant
                             concentrations do not attain the cleanup standards, i.e.,
                             conclude that the ground water is contaminated.
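
When the parameter of interest is the population mean, the hypotheses in the steps above can be written compactly. The symbol Cs for the cleanup standard is notation assumed here for illustration; μ is the population mean concentration in the selected wells.

```latex
H_0:\ \mu \ge Cs \quad \text{(the ground water does not attain the cleanup standard)}
\qquad \text{versus} \qquad
H_1:\ \mu < Cs \quad \text{(the ground water attains the cleanup standard)}
```

Rejecting H_0 corresponds to step (3)(a); failing to reject it corresponds to step (3)(b).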

              To be technically correct, the results of the hypothesis test indicate whether
the null hypothesis can be rejected with  a  specified level of confidence. In practice, we
would conclude that the concentrations  do  or do not attain the cleanup standards and act as
if that conclusion were known as fact rather than subject to error. Therefore to avoid the
verbose but technically correct wording above, the results of the hypothesis tests will be
worded as concluding that the concentrations either attain or do not attain the cleanup
standard.

              When specifying simplified Superfund site cleanup objectives in consent
decrees, records of decision, or work  plans, it is extremely important to say that the site
shall be cleaned up until the sampling program indicates with reasonable  confidence that the
concentrations of the contaminants at the entire site are less than the cleanup standard.
However, attainment is often wrongly described by saying that concentrations at the site
shall not exceed the cleanup standard.
2.3           Introduction  to  Statistical Issues For Assessing Attainment

              This section provides a discussion of some basic statistical issues with an
emphasis on those with specific application to assessing attainment in ground water. This
discussion provides a general background for the specification of attainment objectives in
Chapter 3  and the  statistical procedures presented in Chapters 4 through 9.
2.3.1        Specification of the Parameter to be  Compared to the Cleanup
              Standard
              In order to define a statistical test to determine whether the ground water
attains the cleanup standard, the characteristics of the chemical concentrations to be com-
pared to the cleanup standard must be specified.  Such  characteristics  are called parameters.
The choice of the parameter to use when assessing attainment at  Superfund sites may
depend on site specific characteristics and decisions and has not, in general, been specified
by EPA.

The parameters discussed in this document are the mean or average concen-
tration and a selected percentile of the concentrations. For example, the rule for deciding if
the ground water attains the cleanup standard might be: the ground water is considered
clean (or remediated) if the mean concentration is below the cleanup standard based on a
statistical test. The following sections define parameters for distributions of data and the
statistical properties of these parameters. An understanding of these properties is necessary
for determining the appropriate parameter to test.

The Distribution of Data Values

This section discusses the characteristics of concentration distributions
which might be expected at Superfund sites and how the distribution of concentrations in
the ground water can be described using parameters. These topics are discussed in more
detail in Volume I (Sections 2.8 and 3.5).

Consider the set of concentration measurements which would be obtained if
all possible ground-water samples from a particular monitoring well over a specified period
of time could be collected and analyzed. This set of measurements is called the popula-
tion of ground-water sample measurements. The set of ground-water samples comprising
the population may cover a fixed period of time, such as one year, or an unlimited time,
such as all future measurements. The set of ground-water measurements can be described
mathematically and graphically by the "population distribution function" referred to as the
"distribution of the data". Figure 2.2 shows a plot of the population distribution for data
from three hypothetical distributions. The vertical axis shows the relative proportion of the
population measurements at each concentration value on the horizontal axis. In the plots,
the area under the curve between any two points on the concentration axis represents the
percentage of the ground-water measurements that have concentration values within the
specified range.
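
As a rough illustration of reading a proportion off a distribution curve, the sketch below computes the area between two concentrations for a normal and a lognormal population. All parameter values (a normal mean of 3 ppm with standard deviation 1 ppm, and a lognormal with log-scale mean 1.0 and log-scale standard deviation 0.5) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

# Hypothetical normal population: mean 3 ppm, standard deviation 1 ppm.
normal_population = NormalDist(mu=3.0, sigma=1.0)

# Hypothetical lognormal population: log(concentration) is normal
# with mean 1.0 and standard deviation 0.5 (assumed values).
log_scale = NormalDist(mu=1.0, sigma=0.5)


def normal_proportion(low, high):
    """Area under the normal curve between two concentrations."""
    return normal_population.cdf(high) - normal_population.cdf(low)


def lognormal_proportion(low, high):
    """Area under the lognormal curve between two positive concentrations."""
    return log_scale.cdf(math.log(high)) - log_scale.cdf(math.log(low))


p_normal = normal_proportion(2.0, 4.0)
p_lognormal = lognormal_proportion(2.0, 4.0)
print(f"normal:    {p_normal:.3f} of measurements between 2 and 4 ppm")
print(f"lognormal: {p_lognormal:.3f} of measurements between 2 and 4 ppm")
```

The same concentration range captures a different share of the population under each distribution, which is why the shape of the distribution matters when choosing a parameter.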

Two distributions, the normal and lognormal distributions, will be used as
examples in the following discussion. Both the normal and lognormal distributions are
useful in statistical work and can be used to approximate the concentration distributions
from wells at Superfund sites. Figure 2.2 shows an example of a normal and a lognormal
distribution.

Figure 2.2     Measures of location: Mean, median, 25th percentile, 75th percentile, and
              95th percentile for three hypothetical distributions
[Plot: three panels (a normal distribution, a lognormal distribution, and a third hypothetical
distribution), each showing the population distribution against concentration in ppm. The
legend marks the measures of location (25th percentile, median (50th percentile), mean, 75th
percentile, 95th percentile) and the measure of spread (± 1 standard deviation around the
mean).]

Summary measures describing characteristics of the population distribution
are referred to as parameters or population parameters. Three important characteristics
of the data are described by these parameters:

• The location of the data;
• The spread (or dispersion) of the data; and
• The general shape or "skewness" of the data distribution.

Measures of Location

Measures of location (or central tendency) are often used to describe where
most of the data lie along the concentration axis of the distribution plot. Examples of such
measures of location are:

• "The mean (or average) concentration of all ground-water samples is
  17.2 ppm" (i.e., 17.2 is the mean concentration);
• "Half the ground-water samples have concentrations greater than 13
  ppm and half less than 13 ppm" (13 is the median concentration); or
• "Concentrations of 5 ppm (rounded to the nearest unit) occur more
  often than any other concentration value" (the mode is 5 ppm).

Another measure of location is the percentile. The Qth percentile is the
concentration which separates the lower Q percent of the ground-water measurements from
the upper 100-Q percent of the ground-water measurements. The median is a special
percentile, the 50th percentile. The 25th percentile is the concentration which is greater
than the lowest 25 percent of the ground-water measurements and less than the remaining
75 percent of the ground-water measurements. Figure 2.2 shows the mean, median, 25th
percentile, 75th percentile, and 95th percentile for three distributions introduced previously.
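
The measures of location above are population parameters; in practice they are estimated from a sample. The sketch below computes sample estimates with Python's standard `statistics` module. The ten concentration values are hypothetical, and the "inclusive" interpolation rule is only one of several common conventions for sample percentiles.

```python
import statistics

# Hypothetical ground-water concentrations (ppm) from one well.
concentrations = [2.1, 3.4, 1.8, 5.0, 2.9, 4.2, 3.1, 7.5, 2.4, 3.8]

mean = statistics.mean(concentrations)
median = statistics.median(concentrations)  # the 50th percentile

# quantiles(n=100) returns 99 cut points; index Q-1 holds the Qth percentile.
cuts = statistics.quantiles(concentrations, n=100, method="inclusive")
p25, p75, p95 = cuts[24], cuts[74], cuts[94]

print(f"mean={mean:.2f}  median={median:.2f}")
print(f"25th={p25:.3f}  75th={p75:.3f}  95th={p95:.3f}")
```

For these right-skewed values the mean exceeds the median, as the text notes for skewed distributions.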

Throughout this document, the Greek letter μ (spelled "mu" and pronounced
"mew") will be used to denote the population mean. The median will be denoted
and the Qth percentile will be denoted by XQ.

Measures of Spread

Measures of spread provide information about the variability or dispersion
of a set of measurements. Examples of different measures of spread are:

• The standard deviation or the variance (the square of the
  standard deviation). The population standard deviation is denoted
  by the Greek letter σ (pronounced "sigma") throughout this document.
  If data are normally distributed, approximately two-thirds of the data
  are within one standard deviation of the mean;
• The coefficient of variation is the ratio of the standard deviation
  to the mean, σ/μ; and
• The interquartile range is the difference between the 75th and
  25th percentiles of the distribution.

For each distribution in Figure 2.2, the mean and the range of plus and
minus one standard deviation around the mean are shown on the plots.
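
A minimal sketch of the three measures of spread, computed as sample estimates from hypothetical data (the values and the "inclusive" quartile rule are assumptions for illustration):

```python
import statistics

# Hypothetical ground-water concentrations (ppm) from one well.
concentrations = [2.1, 3.4, 1.8, 5.0, 2.9, 4.2, 3.1, 7.5, 2.4, 3.8]

# Sample standard deviation and variance (n - 1 in the denominator).
sd = statistics.stdev(concentrations)
variance = statistics.variance(concentrations)

# Coefficient of variation: standard deviation divided by the mean.
cv = sd / statistics.mean(concentrations)

# Interquartile range: 75th percentile minus 25th percentile.
q1, _, q3 = statistics.quantiles(concentrations, n=4, method="inclusive")
iqr = q3 - q1

print(f"sd={sd:.3f}  variance={variance:.3f}  cv={cv:.3f}  iqr={iqr:.3f}")
```

The coefficient of variation is unitless, which makes it convenient for comparing variability between contaminants measured at very different concentration levels.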

Measure of Skewness

Skewness is a measure of the extent to which a distribution is symmetric or
asymmetric. A distribution is symmetric if its two halves are mirror images of
each other about a center line. One common symmetric distribution is the normal distribu-
tion, which is often described as having a "bell-shape." Many statistical tests assume that
the sample measurements are normally distributed (i.e., have a normal distribution).

The distribution of concentrations is not likely to be symmetric. It may be
skewed to the right. That is, the highest measurements (those to the right on the plot of the
distribution function) are farther from the mean concentration than are the lowest concen-
trations. Ground-water measurements often have a skewed distribution which can be
approximated by a lognormal distribution (see Gilbert 1987, for additional discussion of
the normal and lognormal distributions). Note that for right skewed distributions (e.g., the
lognormal distribution in Figure 2.2) the mean is greater than the median.
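
The relationship between the mean and the median of a lognormal distribution can be checked directly from its standard closed-form expressions. In the sketch below, the log-scale parameters are hypothetical values chosen for illustration.

```python
import math

# Assumed log-scale parameters: log(concentration) is normal with
# mean mu_log and standard deviation sigma_log.
mu_log = 1.0
sigma_log = 0.8

# Standard results for the lognormal distribution.
lognormal_median = math.exp(mu_log)
lognormal_mean = math.exp(mu_log + sigma_log ** 2 / 2)

print(f"median = {lognormal_median:.3f} ppm")
print(f"mean   = {lognormal_mean:.3f} ppm")
print(f"mean / median = {lognormal_mean / lognormal_median:.3f}")
```

The ratio of mean to median is exp(sigma_log² / 2), so the gap between the two widens as the distribution becomes more skewed.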

The three distributions shown in Figure 2.2 have the same mean and standard
deviation. Note, however, that the occurrence of particularly high or low concentrations
differs for the three distributions. In general, the more skewed the distribution, the
more likely are these extreme observations.


              Selecting the Parameter to Compare to the Cleanup Standard

              In order to determine if the contaminant concentrations attain the cleanup
 standard the measure of location which is to be compared to the cleanup standard must be
 specified. Even though the true distribution is unknown, the specified measure of location,
 or parameter of interest, can be  selected based  on:

                     • Information about the distribution from preliminary data;
                     • Information about the behavior of each parameter for different
                       distributions;
                     • The effects of various concentrations of the contaminant on human
                       health and the environment; and
                     • Relevant criteria for protecting human health and the environment.

              Chapter 3 discusses in more detail the selection of the mean or a percentile
 to be compared to the cleanup standard.
2.3.2        Short-term Versus Long-term Tests

              Due to fluctuating concentrations over time, the average contaminant
concentration over a short period of time may be very different from the average over a
long period of time. Figure 2.3 shows a hypothetical series of weekly ground-water
concentration measurements collected over a period of 70 weeks (about 16 months). The
figure shows the weekly concentration measurements,  the average concentration for weeks
21 through 46  (6 months),  and  the long-term average  concentration which  is obtained from
data collected over 50 years (only a portion of which is shown here). From the figure, it
can be seen that the short-term average concentration can be very different from the long-
term average.

Figure 2.3 Illustration of the difference between a short- and long-term mean
concentration
[Plot: weekly measured concentration versus weeks (0 to 70), with the short-term mean for
weeks 21 through 46 and the long-term mean marked.]

The short-term average is estimated using data collected during the period of
interest, in this example during weeks 21 through 46. Similarly the longer term average
can be estimated based on data collected over the longer period of interest, perhaps 50
years. Fortunately, by using information on the correlation of the measurements across
time, it is usually possible to estimate the long-term average concentration from data
collected over a limited period of time. In order to estimate the average concentration for a
period which is longer than the data collection period, assumptions must be made which
relate the unmeasured future concentrations to the concentrations which are actually
measured. These assumptions are stated in terms of a model for the data.
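
The gap between a short-term and a long-term average can be reproduced with a toy model. The sketch below assumes, purely for illustration, that weekly concentrations follow a fixed 52-week seasonal cycle around a long-term mean of 5 ppm; the model, its amplitude, and the function name `weekly_concentration` are hypothetical and are not taken from Figure 2.3.

```python
import math


def weekly_concentration(week):
    """Hypothetical model: long-term mean of 5 ppm plus an annual cycle."""
    return 5.0 + 2.0 * math.sin(2 * math.pi * week / 52)


# Short-term average: weeks 21 through 46 (about 6 months).
short_term = [weekly_concentration(w) for w in range(21, 47)]
short_term_mean = sum(short_term) / len(short_term)

# Long-term average: 50 full years of weekly values.
long_term = [weekly_concentration(w) for w in range(1, 52 * 50 + 1)]
long_term_mean = sum(long_term) / len(long_term)

print(f"short-term mean (weeks 21-46): {short_term_mean:.2f} ppm")
print(f"long-term mean (50 years):     {long_term_mean:.2f} ppm")
```

Because the six-month window falls mostly in the low part of the assumed cycle, its average sits well below the long-term mean. Estimating the long-term mean from such a window requires a model that accounts for the cycle.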

Statistical decisions and estimates that only apply to the sampling period are
referred to here as "short-term" estimates and are presented in Chapter 4. Decisions and
estimates that apply to the foreseeable future are called "long-term" estimates. The long-
term estimates are made based on the assumption that the ground-water concentrations will
behave in a predictable manner. The assumptions take into account the expected natural
fluctuations in ground-water flows and contaminant concentrations.

In this document the ground water is said to attain the cleanup standard only
if the concentrations attain the cleanup standard for the foreseeable (or at least predictable)
future. Thus, long-term estimates and procedures are used to assess attainment. Short-
term estimates can be used to make interim management decisions.
2.3.3 The Role of Statistical Sampling and Inference in Assessing
Attainment
When assessing attainment, it is desirable to compare the population mean
(or population percentile or other parameter) of the concentrations to the cleanup standard.
However, the data for assessing attainment are derived from a sample, a small proportion
of the population. Statistical inference is used to make conclusions about the population
parameter from the sample measurements. For illustration, the following discussion
assumes that the population mean must be less than the cleanup standard if we are to
conclude that the water in the well attains the cleanup standard.

The mean concentration calculated from the sample data provides an esti-
mate of the population mean. Estimates of concentration levels computed from a statistical
sample are subject to "error" in part because they are based on only a small subset of the
population. The use of the term "error" in this context in no way implies that there are
mistakes in the data. Rather, "error" is a shorthand way of saying that there is variability
in the sample estimates from different samples. There are two components to this error:
sampling error and lab, or measurement, error.

• Different samples will yield different estimates of the parameter of
  interest due to sampling error.
• Unknown factors in the handling and lab analysis procedures result
  in errors or variation in the lab measurements, i.e., two lab analyses
  of the same ground-water sample will usually give slightly different
  concentration values. This difference is attributed to lab error or
  measurement error.

Because the sample mean is subject to error, it cannot be directly compared
to the cleanup standard to decide if the population mean is less than the cleanup standard.
For example, just because the mean for a particular sample happens to be below the cleanup
standard does not mean that the standard has been attained. To make meaningful inferences,
it is necessary to obtain a measure of the error (or expressed another way, the precision)
associated with the sample mean. An estimate of the error in the sample mean can be
calculated from the sample and is referred to as the standard error of the mean. It is a
basic measure of the absolute variability of the calculated sample mean from one sample to
another.

1 The possible bias in the measurements is assumed to be zero. The quality assurance plan should address
the problems of possible bias.
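
For independent measurements, the standard error of the mean is commonly estimated as the sample standard deviation divided by the square root of the sample size. The sketch below applies that textbook formula to eight hypothetical concentrations; it assumes independence and does not handle correlated data.

```python
import math
import statistics

# Hypothetical concentrations (ppm) from eight independent samples.
sample = [0.82, 0.91, 0.75, 0.88, 0.79, 0.95, 0.84, 0.80]

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# Standard error of the mean, assuming independent measurements.
standard_error = sample_sd / math.sqrt(n)

print(f"n={n}  mean={sample_mean:.4f}  sd={sample_sd:.4f}  se={standard_error:.4f}")
```

The standard error is smaller than the standard deviation of the individual measurements and shrinks as more samples are collected.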

               The standard error of the mean can be used to construct confidence
 intervals around a sample mean using equation (2.1) in Box 2.1. Under general condi-
 tions, the interval constructed using equation (2.1) will include the population mean in
 approximately 95 percent of all samples collected and is called a "95 percent two-sided
 confidence interval."  This useful fact follows from the Central Limit Theorem which
 states that, under fairly general conditions, the distribution of the sample mean is "close" to
 a normal distribution even though we may not know the distribution of the original  data.
 Note also that the validity of the confidence interval given in Box 2.1 depends on the data
 being independent in a statistical sense.  Independent ground water measurements are
 obtained when the sample collection times are randomly selected within the sampling
period.

              When assessing attainment, a two-sided test would be used for pH because
 both high and low values represent pollution. For most other pollutants, use one-sided
 confidence intervals because only high values  indicate pollution. A 95 percent one-sided
 confidence interval can be obtained from equation (2.2) in Box 2.1. The interval from zero
 (the lowest possible measurement) to this upper endpoint will also include the  population
 mean in approximately  95 percent  of all samples collected.
                                      Box 2.1
          Construction of Confidence Intervals Under  Assumptions  of Normality
        To construct a 95 percent two-sided confidence interval around a sample
        mean:
              lower endpoint = sample mean - 1.96 * standard error and
              upper endpoint = sample mean + 1.96 * standard error.      (2.1)

        To construct a 95 percent one-sided confidence interval:
              upper endpoint = sample mean + 1.65 * standard error.      (2.2)
              Using confidence intervals, the following procedure can be used to make
 conclusions about the population mean based on a sample of data:

(1) Calculate the sample mean;
(2) Calculate the standard error of the sample mean;
(3) Calculate the upper endpoint of the one-sided confidence interval;

(4) If the upper endpoint of the confidence interval is below the cleanup
standard, then conclude that the ground water attains the cleanup
standard, otherwise conclude that the ground water does not attain
the cleanup standard.

A 95 percent confidence interval will not cover the population parameter in 5 percent of the
samples. When using the confidence interval to assess attainment, one will incorrectly
conclude that the ground water attains the cleanup standard in up to 5 percent of all
samples. Thus, this procedure is said to have a false positive rate of 5 percent. This false
positive rate is discussed in detail in the next section.
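
The four-step procedure above can be sketched as a small function. This is an illustration only: the function name, the data, and the cleanup standards are hypothetical, the standard error assumes independent measurements, and the multiplier 1.65 is the normal-theory value from equation (2.2). For small samples a Student's t critical value would be larger.

```python
import math
import statistics


def attains_cleanup_standard(sample, cleanup_standard, z=1.65):
    """Return (decision, upper_endpoint) for the one-sided interval test."""
    sample_mean = statistics.mean(sample)                               # step 1
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))  # step 2
    upper_endpoint = sample_mean + z * standard_error                   # step 3
    return upper_endpoint < cleanup_standard, upper_endpoint            # step 4


# Hypothetical concentrations (ppm) from eight independent samples.
sample = [0.82, 0.91, 0.75, 0.88, 0.79, 0.95, 0.84, 0.80]

for standard in (1.0, 0.85):
    clean, upper = attains_cleanup_standard(sample, standard)
    verdict = "attains" if clean else "does not attain"
    print(f"standard={standard}: upper endpoint={upper:.4f}, {verdict}")
```

With a cleanup standard of 0.85 ppm the sample mean (0.8425 ppm) is below the standard, yet the upper endpoint is not, so the procedure does not declare attainment. This is the situation described earlier in which a sample mean below the standard is not sufficient evidence on its own.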
2.3.4 Specification of Precision and Confidence Levels for
Protection Against Adverse Health and Environmental Risks

The validity of the decision that a site meets the cleanup standard depends
on how well the samples represent the ground water during the period of sampling, how
accurately the samples are analyzed, and the criteria used to define attainment. The true but
unknown condition is that the ground water is either clean or contaminated. Similarly, the
decisions made using the statistical procedures will result in an attainment or non-attainment
decision. The relationship between these two conditions is shown in Table 2.1.

Table 2.1      False positive and negative decisions

                             True condition in the well:
Decision based on a          Clean (attains the          Contaminated (does not attain
statistical sample           cleanup standard)           the cleanup standard)
---------------------------------------------------------------------------------------
Clean                        Correct decision            False positive decision
Contaminated                 False negative decision     Correct decision

              As a result of the sampling and measurement uncertainty, one may decide
that the site is clean when it is not. In the context of this document, this mistaken
conclusion is referred to as a false positive finding (statisticians refer to a false
positive as a "Type I error"). There are several points to make regarding false positives:

                     • Reducing the chance of a false positive decision helps to protect
                       human health and the environment;
                     • A low false positive rate does not come without cost. The additional
                       cost of lowering false positive rates comes from taking additional
                       samples and using more precise analysis methods; and
                     • The definition of a false positive in this document is exactly the
                       opposite of the more familiar definition of a false positive under
                       RCRA detection and compliance monitoring.

              In order to design a statistical test for assessing attainment, those specifying
the sampling and analysis objectives must  select the maximum acceptable false  positive rate
(the maximum probability of a false positive decision is denoted by the Greek letter alpha,
α). It is usually set at levels such as 0.10, 0.05, or 0.01 (that is, 10%, 5%, or 1%),
depending on the potential consequences of declaring that the ground water is clean when
in fact it is not. While different false positive rates can be used for each chemical, it is
recommended that the same rate be used for all chemicals being investigated. For a further
discussion of false positive rates, see Sokal and Rohlf (1981).

The converse of a false positive decision is a false negative decision (or
Type II error), the mistake of concluding the ground water requires additional treatment
when, in fact, it attains the cleanup standard This error results in the waste of resources in
unnecessary treatment. It would be desirable to minimize the probability of false negative
decisions as well as false positive decisions. The Greek letter beta (β) is used to represent
the probability of a false negative decision.

If both α and β can be reduced, the percentage of time that the correct decision
will be made will be increased. Unfortunately, simultaneous reduction usually can
only be achieved by increasing sample size (the number of samples collected and analyzed),
which may be expensive.

The probability of declaring the ground water to be clean will depend on the
true mean concentration of the ground water. If the population mean is above the cleanup
standard, the ground water will rarely be declared clean (this will only happen if the partic-
ular sample chosen has a large associated sampling and/or measurement error). If the
population mean is much smaller than the cleanup standard, the ground water will almost
always be judged to be clean. This relationship can be plotted for various values of the
population mean as in Figure 2.4. The plot shows the probability of declaring the ground
water to be clean as a function of a hypothetical population mean, and is referred to as a
power curve. For practical purposes, in this volume the probability of declaring the site
clean is the "power of the test." The following assumptions were made when plotting the
example power curve in Figure 2.4: the false positive rate is 5%, the false negative rate
when the true mean, μ1, is 0.6 is 20%, and the cleanup standard is 1.0.

If the population mean concentration is equal to or just above the cleanup
standard (i.e., does not attain the cleanup standard), the probability of declaring the ground
water to be clean is α; this is the maximum false positive rate.
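
The shape of such a power curve can be sketched numerically. The sketch below is illustrative only: it assumes a one-sided test on the mean in which the sample mean is normally distributed with a known standard error, which is not necessarily the test procedure recommended elsewhere in this document. The function names are hypothetical; the numerical settings are the ones stated above for Figure 2.4 (false positive rate 5%, false negative rate 20% at a mean of 0.6, cleanup standard 1.0).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def standard_error_for(cs, mu1, alpha, beta):
    """Standard error at which the assumed one-sided normal-theory test has
    false positive rate alpha at cs and false negative rate beta at mu1."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    return (cs - mu1) / (z_alpha + z_beta)


def power(true_mean, cs, se, alpha):
    """Probability of declaring the ground water clean when the population
    mean is true_mean (assumed rule: declare clean if the sample mean is
    below cs - z_alpha * se)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf((cs - true_mean) / se - z_alpha)


CS, MU1, ALPHA, BETA = 1.0, 0.6, 0.05, 0.20
SE = standard_error_for(CS, MU1, ALPHA, BETA)

for mean in (0.2, 0.4, 0.6, 0.8, 1.0):
    print(f"mean={mean:.1f}  P(declare clean)={power(mean, CS, SE, ALPHA):.3f}")
```

Under these assumptions the probability of declaring the ground water clean equals α when the population mean equals the cleanup standard, and equals 1 - β at the mean of 0.6.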

For the specification of the attainment objectives (discussed in Chapter 3),
the acceptable probabilities of a false positive and false negative decision must be specified.
Based on these values and the selected statistical procedures, the required sample size can
be calculated.
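
As a rough illustration of how the two error probabilities drive the sample size, the sketch below uses the familiar normal-theory formula n = σ²[(z(1-α) + z(1-β)) / (Cs - μ1)]², rounded up. It assumes independent measurements and a known standard deviation σ; the value of σ used here is hypothetical, and the function name is not taken from this document. The procedures in later chapters should be used for actual designs.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, cs, mu1, alpha, beta):
    """Normal-theory sample size for a one-sided test on the mean
    (assumes independent measurements and known sigma)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha) + z.inv_cdf(1 - beta)
    return math.ceil((sigma * z_sum / (cs - mu1)) ** 2)


# Hypothetical sigma of 0.5 ppm; other values as in the Figure 2.4 example.
n = required_sample_size(sigma=0.5, cs=1.0, mu1=0.6, alpha=0.05, beta=0.20)
print(n)
```

Tightening either error rate in this sketch increases the required number of samples, which is the cost trade-off described above.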

Figure 2.4    Hypothetical power curve
[Figure: power curve. Vertical axis: probability of deciding the site attains the cleanup standard. Horizontal axis: population mean concentration, ppm. Annotations: false negative rate of 20% at a mean of 0.6 ppm (power at μ1 is 80%); cleanup standard at 1 ppm.]

2.3.5 Attainment Decisions Based on Multiple Wells

              The ground water will be judged to attain the cleanup standard if the contaminant
concentrations in the selected wells are sufficiently low compared to the cleanup
standard. Below are two possible ways in which the attainment decision can be based on
water samples from multiple wells:


                     Assess each well individually: make a separate attainment decision
                     for each well; conclude that the ground water at the site attains the
                     cleanup standard if the ground water in each tested well attains the
                     cleanup standard.

                     Associate selected wells into groups: collect samples in all wells in
                     a group at the same time, combine the results from all wells in the
                     same group into one summary statistic for that time period; conclude
                     that the ground water represented by each group attains the cleanup
                     standard if the summary statistic attains the cleanup standard.
                     Conclude that the ground water at the site attains the cleanup stan-
                     dard if the summary statistics from all groups attain the standard.


              The choice of assessing wells individually or as a group has implications for
the interpretation of the statistical results and the false positive and false negative probabilities
for deciding that the site, as opposed to the well, attains the cleanup standard. These
issues are discussed in more detail in the following three sections.

Assessing Multiple Wells Individually

When assessing each well individually, slightly different criteria can be used
for each attainment decision. For example, different sample collection schedules can be
used for each well. Assessing each well individually may require substantially fewer
samples than assessing the wells as a group, depending on the concentrations in the wells.
The attainment decisions for each individual well must be combined to make
an attainment decision for the entire site. The only procedure discussed in this document
for combining the results from assessments on individual wells is to conclude that the
ground water at the site attains the cleanup standard only if the ground water in each well
attains the cleanup standard.

If many wells are tested, the site will not attain the cleanup standard if any
one of the wells does not attain the standard. Even if all wells actually attain the cleanup
standard, the more wells used to assess attainment, the greater the likelihood of a false
negative decision in one well, resulting in an overall non-attainment decision. On the other
hand, assessing all wells individually can result in significant protection for human health
and the environment because all concentrations must attain the cleanup standard in spite of
false negative decisions. Implicit in the above discussion is the conflict of protecting the
public health versus the cost of possible overcleaning or over-attainment.
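
The effect of the number of wells on the site-level decision can be illustrated with a small calculation. The sketch below assumes that every well truly attains the standard, that each well's test has the same false negative rate β, and that the tests are independent; independence is a simplifying assumption that will often not hold for neighboring wells. The function name is hypothetical.

```python
def prob_site_false_negative(beta_per_well, n_wells):
    """Probability that at least one of n_wells clean wells is wrongly judged
    not to attain the standard (assumes independent tests with equal beta)."""
    return 1.0 - (1.0 - beta_per_well) ** n_wells


for k in (1, 2, 5, 10):
    print(k, round(prob_site_false_negative(0.20, k), 3))
```

With β = 0.20 per well, the chance of an overall non-attainment decision at a clean site grows quickly as wells are added under these assumptions.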

Testing Multiple Wells as a Group

When multiple wells are tested as a group, samples must be collected in
each well at the same time and thus the same number of samples will be collected in all
wells within a group. At each sample time, the measurements from each well are combined
into a summary statistic. The ground water in the group of wells would be declared to
attain the cleanup standard if the summary statistic was significantly less than the cleanup
standard. Several methods can be used to combine the measurements from all tested wells
at each sample time into one summary statistic. Two methods are:

Average the measurements from all wells within a group; and
Take the maximum concentration across all wells within a group.

              If the average across all wells must be less than the cleanup standard, then
the site may be declared clean even if the concentrations in some wells are substantially greater
than the cleanup standard, as long as concentrations in other wells are much less than the
cleanup standard. These differences among wells in a group can sometimes be minimized
by grouping wells with similar concentration levels. On the other hand, requiring that the
maximum concentration across all wells attain the cleanup standard assures that each well
individually will attain the standard.
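
The difference between the two summary statistics can be seen in a short sketch. The concentrations below are hypothetical, and the sketch only computes the summary statistic at each sample time; it does not perform the statistical test that would then compare the series of summary statistics to the cleanup standard.

```python
from statistics import mean

CLEANUP_STANDARD = 1.0  # ppm, hypothetical

# Hypothetical concentrations (ppm): one row per sample time,
# one column per well in the group.
measurements = [
    [0.4, 0.5, 1.3],
    [0.3, 0.6, 1.2],
    [0.5, 0.4, 1.1],
]

# One summary statistic per sample time, by each of the two methods.
mean_by_time = [mean(row) for row in measurements]
max_by_time = [max(row) for row in measurements]

for t, (avg, top) in enumerate(zip(mean_by_time, max_by_time), start=1):
    print(f"time {t}: average={avg:.2f}  maximum={top:.2f}")
```

In this made-up group the third well is above the standard at every sample time, yet the group average stays below it; the maximum does not.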

              If the average concentration across all wells is to be compared to the cleanup
standard, a decrease in lab costs may be achieved by compositing the water samples across
wells (and possibly across time)  and analyzing the contaminant  concentrations in the
composite samples.  Since the recommended number of samples to be composited and the
length of the sample period will depend on the  serial correlation of the data and several cost
and variance estimates, consultation with a statistician is recommended if compositing is
considered.
              Multiple Statistical Tests

              When assessing attainment in multiple wells (or groups of wells) and when
assessing attainment for multiple chemicals, two probabilities are of interest: the probability
of deciding that one compound in one well (or group of wells) is clean and the probability
of deciding that all compounds in all wells (or groups of wells) are  clean. The following
discussion will be phrased in terms of testing individual wells. However, it also applies to
testing groups of  wells.

              For an individual statistical decision on one compound or well, the maximum
probability of a false positive decision is denoted by the Greek letter alpha, α. This
may also be called the comparison-wise alpha. When multiple chemicals or wells are
being assessed, the overall alpha or experiment-wise alpha is the maximum probability
of incorrectly declaring that all compounds in all ground water wells at the site attain the
cleanup standard.* In this document it is assumed that the site will be declared to have
attained the cleanup standard only if all contaminants tested attain their specified cleanup
standard.

*Note that the procedures discussed here for assessing the attainment of the site from the results of multiple
  statistical tests are different from the typical presentations on "multiple comparison tests" or "experiment-
  wise versus comparison-wise tests" presented in many introductory statistics textbooks which use a
  different null hypothesis. Here all tests, rather than any single test, must have a significant result.


              The probability of deciding that all compounds in all wells attain the cleanup
standard, i.e., the overall α, depends on the number of statistical tests performed. If wells
are assessed individually, more statistical tests will be performed than when assessing
wells as a group. Thus, the decision on whether to group wells is related to the selection of
the probabilities of a false positive or false negative decision.


              The overall probability of declaring that a site has attained the cleanup
standard depends on the:

                     Number of contaminants and wells being assessed;

                     Concentrations of the contaminants being assessed;

                     Statistical tests being used for the individual contaminants;

                     Correlation between the concentration measurements of different
                     contaminants in the same wells and contaminants in different wells;
                     and

                     Decision rules for combining the statistical results from each
                     contaminant and well to decide if the overall site attains the cleanup
                     standard.


Although the calculation of the overall probability of declaring the site to attain the cleanup
standard can be difficult, the following general conclusions can be stated when using the
rule that all contaminants (or wells) must attain the cleanup standard:

                     The probability of incorrectly deciding that the site attains the
                     cleanup standard, the  overall  alpha, is always less than or equal to
                     the maximum probability of mistakenly deciding that any one
                     contaminant (or well)  attains its cleanup standard (comparison-wise
                     alpha).

                     As the number of contaminants being assessed increases, the
                     probability of deciding that the site is clean decreases, regardless of
                     the true status of the site.
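
Both conclusions can be illustrated under one simplifying assumption, namely that the individual tests are statistically independent, so that the probability that every test declares attainment is the product of the per-test probabilities. The per-test probabilities below are hypothetical, and the function name is not taken from this document. The first conclusion does not actually require independence, since the probability that all tests pass can never exceed the probability that any single one passes.

```python
import math


def prob_site_declared_clean(per_test_probs):
    """Probability that every test declares attainment
    (assumes the tests are statistically independent)."""
    return math.prod(per_test_probs)


ALPHA = 0.05

# One contaminated well sitting exactly at the standard (declared clean with
# probability alpha) and two very clean wells (declared clean with prob. 0.99).
overall_alpha = prob_site_declared_clean([ALPHA, 0.99, 0.99])
print(overall_alpha <= ALPHA)

# A clean site where each test declares attainment with probability 0.90.
for n_tests in (1, 5, 10, 20):
    print(n_tests, round(prob_site_declared_clean([0.90] * n_tests), 3))
```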


              Choice of a strategy for combining the results from many statistical tests
involves both policy and statistical questions. As a result, no general recommendations can
be made in this document. When many contaminants or wells are being assessed, consultation
with a statistician is recommended.

2.3.6 Statistical Versus Predictive Modeling

A model is a mathematical description of the process or phenomenon from
which the data are collected. A model provides a framework for extrapolating from the
measurements obtained during the data collection period to other periods of time and for
describing the important characteristics of the data. Perhaps most importantly, a model
serves as a formal description of the assumptions which are being made about the data.
The choice of statistical method used to analyze the data depends on the nature of these
assumptions. (See Appendix D for a discussion on modeling the data.)

Mathematical (deterministic) models can be used to predict or simulate the
contaminant concentrations, the effect of treatment on the contaminants, the time required
for remediation, and the remaining concentrations after remedial action. These models are
referred to here as predictive models. To predict future concentrations these models typi-
cally use (1) mathematical formulae describing the flow of ground water and contaminants
through porous or fractured media, (2) boundary conditions to specify the conditions at the
start of the simulation (often based on assumptions), and (3) assumptions about the aquifer
conditions. Predictive models are powerful tools, providing predictions in a relatively
short time with minimal cost compared to the corresponding field sampling. They allow
comparison of the expected results of different treatment alternatives. However, it is
difficult to determine the probability of correctly or incorrectly deciding if the ground water
attains the cleanup standard using predictive models, in part, due to the many assumptions
on which the models are based.

On the other hand, the statistical models and procedures discussed in this
document are based on very few assumptions and can be used whether or not predictive
models have been applied at the site. The statistical procedures can also be used as a check
on the predictive models. Unlike the predictive models, the statistical models presented in
this document for assessing attainment only use measurements from the period after
remedial action has been terminated.

              While this document makes the assumption that the attainment  decision will
be based on statistical models and procedures,  predictive models  and  data collected prior to
the sampling for the attainment decision provide a guide as to which wells are to be used
for assessing attainment, when to initiate an evaluation, and what criteria are to be used to
define attainment of the cleanup standard. If predictive models are used in other ways for
the attainment decision, consultation with a statistician is recommended. Due to the
complexity of both site conditions and predictive modeling, other procedures which might
be used to combine the results of predictive and statistical models are beyond the scope of
this document.

2.3.7 Practical Problems with the Data Collection and Their Resolution

              With any collection of data there are possible problems which must be
addressed by the statistical procedures. The problems discussed below are: measurements
below the detection limit, missing data, and very unusual observations, often called
"outliers."
              Measurements  Below the Detection Limit

              The detection limit for a laboratory measurement procedure is the lowest
concentration level which can be determined to be different from a blank. Measurements
which are below the detection limit may be reported in one of several different ways
(Gilbert 1987). For example:

                    A concentration value, with the notation that the reported concentra-
                    tion is below the detection limit;
                    Less than a specified detection limit;  or
                    Coded as  "below  the detection limit" with no  concentration or  detec-
                    tion  limit specified.

              Special procedures are required to use the below-detection-limit measurements
in a statistical analysis. If, due to poor selection of the laboratory analysis method or
unanticipated problems with the analysis, the cleanup standard is below the detection limit,
the possible statistical procedures which might be used to compare the concentrations to the
cleanup standard are very limited and require many assumptions which are difficult to
justify. As a result, this document only addresses the situation where the cleanup standard
is greater than the detection limit.

               For all of the procedures described in this manual, the following procedures
  for handling below-detection-limit measurements are recommended:

               Whenever the measured concentration for a given water sample is reported
               by the laboratory, use this concentration in the analysis even though it is
               below the detection limit;
               When the concentration is reported as less than a specified detection limit,
               use the value at the detection limit as the measured concentration in the
               analysis; and
               When the laboratory reports that the chemical concentration is "below the
               detection limit" with no specified detection limit, contact the analytical
               laboratory to determine the minimum detectable value, and use this value in
               the analysis. Do not treat below-detection-level measurements as missing.
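
The three rules above can be sketched as a small helper. The function name and the way a laboratory result is represented here (an optional reported value plus an optional detection limit) are hypothetical choices made for illustration and are not a data format defined in this document.

```python
from typing import Optional


def value_for_analysis(reported: Optional[float],
                       detection_limit: Optional[float]) -> float:
    """Return the concentration to use in the analysis for one measurement."""
    if reported is not None:
        # Rule 1: use the reported concentration, even if it is below the
        # detection limit.
        return reported
    if detection_limit is not None:
        # Rules 2 and 3: substitute the detection limit (for rule 3, the value
        # obtained from the laboratory) for the unreported concentration.
        return detection_limit
    # No value and no limit: the limit must be obtained from the laboratory.
    # The measurement is not to be treated as missing.
    raise ValueError("detection limit unknown: contact the analytical laboratory")


print(value_for_analysis(0.02, 0.05))   # reported value is used
print(value_for_analysis(None, 0.05))   # detection limit is substituted
```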

               Using the detection limit for values below the detection limit is conservative;
  i.e., errs in favor of minimizing health and environmental risks. Other methods of
  handling below-detection-limit problems can be used, but are more difficult to implement
  and have the potential of erring in the opposite direction. Selection of a method can be
  dependent upon the proportion of non-detects. Alternative procedures should be investigated
  and assessed as to how data are affected. Some of these alternative procedures are
  discussed in the following references on detection limit problems: Bishop, 1985; Clayton et
  al., 1986; Gilbert, 1981; Gilliom and Helsel, 1986; Helsel and Gilliom, 1986; and Gleit,
  1985.
               Missing Values

               Missing  concentration values are different from below-detection measure-
 ments in that no information about the missing concentration (either above or below the
 detection level) is known. Missing values may be due to many factors, including (1)
 non-collection of the scheduled sample, (2) loss of the sample before it is analyzed due to
 shipping or lab problems, or (3) loss of the lab results due to improper recording of results
 or loss of the data records.

               In general, this problem can be minimized with appropriate planning and
 backup procedures and by using proper chain-of-custody procedures, careful
 packaging and handling, clear labeling, and keeping copies of important records.

               If the sample is lost shortly after collection, it is recommended that another
 sample be collected immediately to replace the lost sample as long as the time between the
 lost and replacement sample is less than half the time between successive samples specified
 in the sample design. Any deviations from the sampling design, including lost and replacement
 samples, should be reported with the data and analysis. The replacement or substitution
 of missing data by numerical values is never recommended.
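
The timing rule for replacement samples can be written as a one-line check. This is a minimal sketch; the function name and the use of days as the time unit are assumptions made for illustration.

```python
def may_collect_replacement(days_since_lost_sample: float,
                            sampling_interval_days: float) -> bool:
    """True if a replacement sample falls within the recommended window:
    less than half the interval between successive scheduled samples."""
    return days_since_lost_sample < sampling_interval_days / 2


# Hypothetical design with samples scheduled every 30 days.
print(may_collect_replacement(10, 30))
print(may_collect_replacement(20, 30))
```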


               Outliers

               In many statistical texts, measurements that are (1) very large or small
 relative to the rest of the  data,  or (2) suspected of being unrepresentative  of the true concen-
 tration at the sample location are often called "outliers." Observations which appear to be
unusual may correctly represent unusual concentrations in the field,  or may result from
 unrecognized handling problems, such as contamination, lab measurement, or data
 recording errors.  If a particular observation is suspected to be in error,  the error should be
 identified and corrected, and the corrected value used in the analysis.  If no such verifica-
 tion is possible, a  statistician  should be consulted  to provide modifications to  the statistical
 analysis that  account for the suspected "outlier." For more background on statistical
 methods to handle outliers, see Barnett and Lewis (1984).

               The handling of outliers is a controversial topic. In this document, all data
 not known to be in error are considered to be valid because:

                     The expected distribution  of concentration values may be skewed
                     (i.e., non-symmetric) so that large concentrations  which look  like
                     "outliers" to some analysts  may be legitimate;
                     The procedures recommended in this document are less sensitive to
                     extremely low  concentrations than to extremely  high  concentrations;
                     and
                     High concentrations are of particular concern for their potential
                     health  and  environmental impact.

2.4 Limitations and Assumptions of the Procedures Addressed in this Document

              Because a single document cannot adequately address the wide variety of
 situations found at all Superfund sites, this document will only discuss those statistical
 procedures that are applicable to most sites and  can be implemented without a detailed
 knowledge of statistical methods. Although the  procedures recommended here will be
 generally applicable, specific objectives or situations at some sites may require the use of
 other statistical procedures. Where possible problems are anticipated, the text will recom-
 mend consultation with a statistician.

              Due to the complex nature of conditions at Superfund sites, this document
 cannot address all statistical issues applicable either to Superfund sites or to assessing the
 attainment of cleanup standards. The discussion in this document is based on certain
 assumptions about what statistical tests will be required and what the situations at the site
 will be. For completeness, the major assumptions  are reviewed below.

                     The contaminants are known;
                     The ground water does not attain the cleanup standard until this
                     assumption (that is, the null hypothesis) is rejected using a statistical
                     test;
                     At the time of sampling for assessing attainment, there are no
                     reasons to believe the ground-water concentrations might increase
                     over time;
                     Location of the monitoring and pumping (or treatment) wells are
                     fixed and are not to be specified as part of the statistical methods.
                     As a result, the attainment decision strictly applies only to the water
                     in the wells, not to the ground water in general. To draw general
                     conclusions about the ground  water, additional assumptions must be
                     made or  additional wells must be established; and
                     The  cleanup standard is  greater than the detection limit for all chemi-
                     cals to be tested.
2.5           Summary

              This guidance considers the variety and complexity of ground water conditions
at Superfund sites and provides procedures which can be used at most sites and under
most conditions. This chapter outlines some of the conditions found at Superfund sites and
some of the assumptions which have been made as a guide to the selection of statistical
procedures presented in later chapters.


              Errors are possible in evaluating whether a site attains the cleanup standards,
resulting in false positive and false negative decisions. Statistical methods provide
approaches for balancing these two decision errors and allow extrapolation in a scientifically
valid fashion.


              This chapter reviews briefly the statistical concepts that form a basis for the
procedures described in this guidance. These include:

                     false positive decision - a site is thought to be clean when it is not;

                     false negative decision - a site is thought to be contaminated when it
                     is not;

                     mean - the value that corresponds to the "center" of the concentration
                     distribution;

                     Qth proportion or percentile - a value that separates the lower Q
                     percent of the measurements from the upper 100-Q percent of the
                     measurements;

                     confidence intervals - a sample-based estimate of a mean or
                     percentile which is expressed as a range or interval of values which
                     will include the true parameter value with a known probability or
                     confidence;

                     null hypothesis - the prior assumption that the contaminant concentrations
                     in the ground water at the site do not attain the cleanup
                     standard;

                     hypothesis tests - a statistical procedure for assessing attainment of
                     the ground water by accepting or rejecting the null hypothesis on the
                     basis of data; and

                     power curve - for a specified statistical test and sample size, the
                     probability of concluding that the ground water attains the cleanup
                     standard versus true concentration.


              Unlike statistical tests in other circumstances, assessment of ground water
requires consideration of the correlation between measurements across time and space. As
a result of correlation across time, estimating the short-term and long-term concentrations
requires different procedures. The ground water is defined as attaining the cleanup standard
if the statistical test indicates the long-term mean concentration or concentration
percentile at the site attains the cleanup standard.

              When many wells or contaminants are assessed, careful consideration must
be given to the decision procedures which are used to combine data from separate wells or
contaminants in order to determine if the site as a whole attains all relevant cleanup standards.
How the data from separate wells are combined affects the interpretation of the
results and the probability of concluding that the overall site attains the cleanup standard. A
complete discussion of how to assess attainment using multiple wells is beyond the scope
of this volume.

-------
3. SPECIFICATION OF ATTAINMENT OBJECTIVES

             This chapter discusses the  specification of the attainment objectives,  includ-
ing the specific procedures to be used to assess attainment. The sampling and analysis
plans, discussed in the next chapter, outline procedures to be used to assess attainment
consistent with the attainment objectives. The specification of objectives must be com-
pleted by personnel familiar with the following:

                    The characteristics of the ground water and contamination present at
                    the waste site;
                    The health and environmental risks  of the  chemicals involved; and
                    The costs of sampling, analysis and remediation.

             The flow  chart in Figure 3.1 summarizes the steps required to specify the
sampling and analysis objectives and shows where each step is discussed. In general,
specification of the attainment objectives for the site under investigation involves specifying
the  following items:

                    The wells to be sampled;
                    The sample collection and handling procedures;
                    The chemicals to be tested and the laboratory test methods to be
                    used;
                    The relevant cleanup standard for the chemicals under  investigation;
                    The parameter (e.g., the mean or a percentile)  of the chemical
                    concentration distribution which is to be compared to the cleanup
                    standard;
                    The "false positive rate" for the statistical test (the confidence level
                    for protection against  adverse health and  environmental  risk);
                    The precision to be achieved; and
                    Any  other secondary objectives for which the data are to be used
                    which may  affect the  choice of statistical procedure.

Figure 3.1     Steps in defining the attainment objectives
[Flow chart, steps in order:]
   1. Start
   2. Specify sample wells (Section 3.2)
   3. Specify the sample collection and handling procedures (Section 3.3)
   4. Specify the chemical to be tested (Section 3.4)
   5. Specify the parameter to compare to the cleanup standard (Section 3.5)
   6. Specify the probability of mistakenly declaring the sample area clean (Section 3.6)
   7. Specify the precision to be achieved (Section 3.7)
   8. Review all elements of the attainment objectives
   9. Decision: Are any changes in the attainment objectives required?

The items which make up the attainment objectives are discussed in detail in
the following sections.

3.1 Data Quality Objectives

The Quality Assurance Management staff within EPA has developed
requirements and procedures for the development of Data Quality Objectives (DQOs) when
environmental data are collected to support regulatory and programmatic decisions.
Although the DQOs are an important part of the attainment objectives, they are discussed in
detail elsewhere and will not be addressed here. For more information, readers should
refer to U.S. EPA (1987a) and U.S. EPA (1987b).

3.2 Specification of the Wells to be Sampled

Wells within the site will be monitored and evaluated with respect to the
applicable cleanup standards. Extending inferences from the sampled wells to the ground
water in general must be made on the basis of both available data and expert knowledge
about the ground-water system and not on the basis of statistical sampling theory. Careful
selection of the ground-water wells to be used for assessment is required to ensure that
attainment of the cleanup standard in the sampled wells implies to all parties concerned that
the ground-water quality has been adequately protected.

Sections 2.2.3 and 2.3.5 provide more discussion on the implications of the
decision on which wells must attain the cleanup standard.

3.3 Specification of Sample Collection and Handling Procedures

The results of any statistical analysis are only as good as the data on which
they are based. Therefore, an important objective for the sampling and analysis plan is to
carefully define all aspects of data collection and measurement procedures, including:

• How the ground-water sample is to be collected;
• What equipment and procedures are to be used;
• How the sample is to be handled between collection and measurement;
• How the laboratory measurements are to be made; and
• What precision is to be achieved.

One reference for guidance on these topics is The Handbook for Sampling
and Sample Preservation of Water and Wastewater (U.S. EPA, 1982).

3.4 Specification of the Chemicals to be Tested and Applicable
Cleanup Standards

The chemicals to be tested should be listed. When multiple chemicals are
tested, this document assumes that all chemicals must attain the relevant cleanup standard in
order for the ground water from the well(s) to be declared clean.

The term "cleanup standard" is a generic term for the value to which the
sample measurements must be compared. Throughout this document, the cleanup standard
will be denoted by Cs. The cleanup standard for each chemical of concern must be stated
at the outset of the study. Cleanup standards are determined by EPA in the process of
evaluating site-specific cleanup alternatives. Final selection of the cleanup standard
depends on many factors. These factors are discussed in Guidance on Remedial Actions
for Contaminated Ground Water at Superfund Sites [Interim Final] (U.S. EPA, 1988).
3.5 Specification of the Parameters to Test

In order to define a statistical test to determine if the contaminant
concentrations in ground-water well(s) attain the cleanup standard, the characteristic of
the concentrations which is to be compared to the cleanup standard must be specified. Such
characteristics are called parameters. The two parameters discussed in this document for
testing individual wells are the mean concentration and a specified percentile of the
concentrations, such as the median or the 90th percentile of the ground-water
concentrations. The following sections discuss the criteria for selecting the parameters to
test. These parameters have been defined previously in Section 2.3.1.
3.5.1        Selecting the Parameters to Investigate


Criteria for selecting the parameter to use in the statistical attainment
decision are:

• The criteria used to develop the risk-based standards, if known;
• Whether the effects of the contaminant being measured are acute or
  chronic;
• The relative sample sizes required;
• The likelihood of finding concentration measurements below the
  cleanup standard; and
• The relative spread of the data.

For example, if the cleanup standard is a risk-based standard developed for
the mean concentration over a specified period of time, it is logical that the cleanup standard
be compared to the mean concentration. Alternatively, if the cleanup standard is a
risk-based standard developed for extreme concentrations which should rarely be exceeded, it is
logical to test an upper percentile of the concentration distribution.

Many considerations may go into the selection of the parameter to test.
Table 3.1 presents criteria and conditions that support or contradict the use of each
parameter.

Some general rules for selecting the parameter to test are:

(1) If the chemical contaminant of concern has short-term or acute
    effects on human health or the environment, testing of upper
    percentiles is recommended, with higher percentiles being chosen
    for testing when the distribution of contamination has a higher
    coefficient of variation.

(2) If the chemical contaminant of concern has long-term or chronic
    effects on human health or the environment, Table 3.2 shows the
    recommended parameter based on the coefficient of variation of the
    data and the likelihood of measurements below the detection level.
Table 3.1      Points to consider when trying to choose among the mean, upper
               proportion/percentile, or median

Parameter      Points to consider

Mean
 1) Easy to calculate and estimate a confidence interval.

 2) Useful when the cleanup standard has been based on consideration
    of carcinogenic or chronic health effects or long-term average
    exposure.

 3) Useful when the data have little variation from sample to sample or
    season to season.

 4) If the data have a large coefficient of variation (greater than about
    1.5), testing the mean can require more samples than testing an
    upper percentile in order to provide the same protection to human
    health and the environment.

 5) Can have high false positive rates with small sample sizes and
    highly skewed data, i.e., when the contamination levels are generally
    low with only occasional short periods of high contamination.

 6) Not as powerful for testing attainment when there is a large
    proportion of less-than-detection-limit values.

 7) Is adversely affected by outliers or errors in a few data values.

Upper Proportion/Percentile
 1) Requiring that an upper percentile be less than the cleanup standard
    can limit the occurrence of samples with high concentrations,
    depending on the selected percentile.

 2) Unaffected by less-than-detection-limit values, as long as the
    detection limit is less than the cleanup standard.

 3) If the health effects of the contaminant are acute, extreme
    concentrations are of concern and are best tested by ensuring that a
    large proportion of the measurements are below a cleanup standard.

 4) The proportion of the samples that must be below the cleanup
    standard must be chosen.

 5) For highly variable or skewed data, can provide similar protection of
    human health and the environment with a smaller sample size than
    when testing the mean.

 6) Is relatively unaffected by a small number of outliers.

Median
 1) Has benefits over the mean because it is not as heavily influenced by
    outliers and highly variable data, and can be used with a large
    number of less-than-detection-limit values.

 2) Has many of the positive features of the mean, in particular its
    usefulness for evaluating cleanup standards based on carcinogenic
    or chronic health effects and long-term average exposure.

 3) For positively skewed data, the median is lower than the mean and
    therefore testing the median provides less protection for human
    health and the environment than testing the mean.

 4) Retains some negative features of the mean in that testing the median
    will not limit the occurrence of extreme values.

Table 3.2      Recommended parameters to test when comparing the cleanup standard to
               the concentration of a chemical with chronic effects¹

                                 Proportion of the data with concentrations
                                 below the detection limit:
                                 Low                          High
                                 (Perhaps < 30%)              (Perhaps > 30%)

Large Coefficient of Variation   Mean or Upper Percentile     Upper Percentile
(Perhaps cv > 1.5)               (Upper percentile requires
                                 fewer samples)

Intermediate Coefficient of      Mean or Upper Percentile     Upper Percentile
Variation
(Perhaps 1.5 > cv > .5)

Small Coefficient of Variation   Mean or Median               Median
(Perhaps cv < .5)

¹ Based on Westat simulations and analysis summarized in an internal Westat memo.
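
Rule (1) and Table 3.2 can be read as a simple lookup. The following Python sketch is illustrative only: the function name `recommend_parameter` and its argument names are hypothetical, and the cutoffs (cv of 0.5 and 1.5, 30% of data below the detection limit) are the tentative "perhaps" values from Table 3.2, not fixed requirements. The table does not say how to treat values falling exactly on a cutoff, so the boundary handling below is an assumption.

```python
def recommend_parameter(effect, cv, frac_below_detection):
    """Suggest parameter(s) to test, following rule (1) and Table 3.2.

    effect: "acute" or "chronic" (hypothetical labels for this sketch).
    cv: coefficient of variation of the concentration data.
    frac_below_detection: fraction of data below the detection limit (0 to 1).
    """
    if effect == "acute":
        # Rule (1): test an upper percentile for short-term or acute effects.
        return ["upper percentile"]
    if effect != "chronic":
        raise ValueError("effect must be 'acute' or 'chronic'")

    high_nondetects = frac_below_detection > 0.30  # "perhaps > 30%"
    if cv > 0.5:
        # Large (cv > 1.5) and intermediate (0.5 < cv <= 1.5) rows of
        # Table 3.2 list the same parameters.
        if high_nondetects:
            return ["upper percentile"]
        return ["mean", "upper percentile"]
    # Small coefficient of variation.
    return ["median"] if high_nondetects else ["mean", "median"]


print(recommend_parameter("chronic", cv=0.3, frac_below_detection=0.10))
```

The lookup returns candidate parameters only; the other points in Table 3.1, such as how the cleanup standard was derived, still need to be weighed before a parameter is chosen.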
3.5.2        Multiple  Attainment Criteria


In some situations two or more parameters might be chosen. For example,
both the mean and an upper percentile can be tested using the rule that the ground water
attains the cleanup standard if both parameters are below the cleanup standard.

Other more complicated criteria may be used to assess attainment of the
cleanup criteria. Examples of multiple criteria are:

• It is desirable that most of the ground-water samples have
  concentrations below the cleanup standard and that the concentrations
  which are above the cleanup standard are not too large. This may be
  accomplished by testing if the 75th percentile is below the cleanup
  standard and the mean of those concentrations which are above the
  cleanup standard is less than twice the cleanup standard. This
  combination of tests can be performed with modifications of the methods
  presented in this document.

• It is desirable that the mean concentration be less than the cleanup
  standard and that the standard deviation of the data be small. This
  may be accomplished by testing if the mean is below the cleanup
  standard and the standard deviation is below a specified value. This
  document does not address testing the standard deviation, variance,
  or coefficient of variation against a standard.

For testing of multiple criteria not discussed in this guidance document, consultation with a
statistician is recommended.
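
As a rough illustration of the first example criterion above, the Python sketch below compares sample statistics with the cleanup standard. It is a descriptive check only, not the modified statistical test the text refers to, and it does not control the false positive rate. The function names, the nearest-rank percentile definition, and the concentration values are all assumptions made for this sketch.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank sample percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def meets_combined_criteria(values, cs, pct=75, exceedance_factor=2.0):
    """Descriptive check of a two-part criterion.

    True when the sample pct-th percentile is below cs and the mean of the
    values above cs is below exceedance_factor * cs. Sample statistics are
    compared directly; no hypothesis test is performed.
    """
    percentile_ok = nearest_rank_percentile(values, pct) < cs
    above = [v for v in values if v > cs]
    exceedance_ok = (not above) or (sum(above) / len(above) < exceedance_factor * cs)
    return percentile_ok and exceedance_ok


# Hypothetical concentrations (ppb) with a cleanup standard of 10 ppb.
sample = [2.0, 3.5, 4.0, 5.0, 6.5, 7.0, 8.0, 9.0, 9.5, 12.0, 14.0, 16.0]
print(meets_combined_criteria(sample, cs=10.0))
```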
3.6           Specification of Confidence  Levels  for  Protection  Against
              Adverse Health and Environmental Risks
In order to design a statistical test for deciding if the ground water attains the
cleanup standard, those specifying the sampling and analysis objectives must select the
false positive rate. This rate is the maximum probability that the test results will show the
ground water to be clean when it is actually contaminated. It is usually set at levels such as
0.10, 0.05, or 0.01 (that is, 10%, 5%, or 1%), depending on the potential consequences of
deciding that the ground water is clean when, in fact, it is not clean. While different false
positive rates can be used for each chemical, it is recommended that the same rate be used
for all chemicals being investigated. For a further discussion of false positive rates see
Section 2.3.4 or Sokal and Rohlf (1981).

3.7 Specification of the Precision to be Achieved

Precision generally refers to the degree to which repeated measurements are
similar to one another. In this context it refers to the degree to which estimates from
different samples are similar to one another. Decisions based on precise estimates will
usually be the same from sample to sample. The desired precision of the statistical test is
specified by the desired confidence in the statistical decisions resulting from the test.

Specification of the precision to be achieved is required to completely define
the statistical test to use. The precision which is to be achieved can be defined by
specifying the parameter value for which the probability of a false negative decision is to be
controlled. For a definition of "false negative" see Section 2.3.4.

To completely define the precision when testing the mean, the following
items must be specified:

• α, the false positive rate;
• Cs, the cleanup standard;
• μ_1, the mean concentration at which the false negative rate is to be
  specified; and
• β, the false negative rate at μ_1.

To completely define the precision when testing percentiles, the following
items must be specified:

• α, the false positive rate;
• Cs, the cleanup standard;
• P_0, the largest acceptable proportion of ground-water samples with
  concentrations above the cleanup standard;
• P_1, the value of the proportion for which the false negative rate is to
  be specified (comparable to μ_1 when testing means); and
• β, the false negative rate at P_1.

¹ When testing multiple chemicals from the same ground-water samples, the overall false positive rate will
be approximately the same as that for individual chemical tests if the concentrations of different chemicals
are highly correlated. In situations when the concentrations are not highly correlated, the overall false
positive rate for the entire site will be smaller than that specified for the individual chemicals.

The specification of these items is discussed in detail in Chapter 2 of this
document and in Chapters 6 and 7 of Volume I. The reader should refer to Volume I for
detailed instructions on how these items are to be specified.
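
To show how the items specified for the mean interact, the sketch below uses the textbook normal-approximation sample size for a one-sided test of a mean, n = σ²[(z_(1-α) + z_(1-β)) / (Cs − μ_1)]². This formula is an assumption made for illustration. It needs a standard deviation σ, which is not among the items listed above, it treats the measurements as independent, and it is not a substitute for the procedures referenced in Chapter 2 and Volume I. The function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def approx_sample_size(alpha, beta, cs, mu1, sigma):
    """Approximate number of samples for a one-sided test of the mean.

    alpha: false positive rate; beta: false negative rate at mu1;
    cs: cleanup standard; mu1: mean at which beta is specified (mu1 < cs);
    sigma: assumed standard deviation of the measurements.
    """
    if not mu1 < cs:
        raise ValueError("mu1 must be below the cleanup standard cs")
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - beta)
    n = (sigma * (z_alpha + z_beta) / (cs - mu1)) ** 2
    return math.ceil(n)


# Hypothetical inputs: Cs = 10, mu_1 = 8, sigma = 3 (same units).
print(approx_sample_size(alpha=0.05, beta=0.20, cs=10.0, mu1=8.0, sigma=3.0))
```

Under these assumptions, tightening either error rate or moving μ_1 closer to Cs increases the number of samples needed, which is the cost side of the precision trade-off.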
3.8          Secondary Objectives

The sampling and analysis data may be used for purposes other than
assessing the attainment of the cleanup standards. For example, they may be used to determine
the relationship between concentrations of different contaminants, to determine the seasonal
patterns in the measurements, or to get measurements on a contaminant not being assessed.
These secondary objectives may determine what procedure is used to collect the samples or
how often the samples are collected.


3.9          Summary

              This chapter discussed the specification of the various items which make up
the attainment objectives. The objectives will be specified by EPA, regulatory agencies,
and others familiar with the site, the environmental and health risks, and the sampling and
remediation costs. As part of the objectives, careful consideration must be given to
defining the wells  to be tested, the ground-water sampling and analysis procedures, the
statistical parameter to be compared to the cleanup standard, and the precision and confi-
dence level desired. The attainment objectives provide the background for developing the
sampling and analysis  plans discussed  in Chapter 4.
4. DESIGN OF THE SAMPLING AND ANALYSIS PLAN
Once the attainment objectives are specified by program and subject matter
personnel, statisticians and hydrogeologists can be useful in designing important compo-
nents of sampling and analysis plans. The sampling plan specifies how the water samples
are to be collected, stored, and analyzed, and how many samples to collect. The analysis
plan specifies which of the statistical procedures presented in the following chapters are to
be used. The sampling and analysis plans are interrelated and must be prepared together.
The decision regarding attainment of the cleanup standard can be made only if the field and
laboratory procedures (in the sampling plan) provide data that are representative of the
ground water and can provide the parameter estimates (from the analysis plan) specified in
the attainment objectives.

The specification of the sampling and analysis plans will depend on the
characteristics of the waste site and the evidence needed to evaluate attainment. The statisti-
cal methods must be consistent with the sample design and attainment objectives. If there
appears to be any reason to use different sample designs or analysis plans than those
discussed in this guidance, or if there is any reason to change either the sample design or
the analysis plan after field data collection has started, it is recommended that a statistician
be consulted.
4.1 The Sample Design

The sample design, or sampling plan, outlines the procedure for
collecting the data, including the timing, location, and field procedures for obtaining each
physical water sample. The discussion here focuses on the timing of the sample collection
activities. Common types of sample design are random sampling and systematic sampling.
Either of these sample collection procedures can require a fixed number of samples or use
sequential sampling in which the number of samples to be collected is not specified before
the sampling period.

4.1.1        Random Sampling

In a random sample design, samples are collected at random times throughout
the sampling period. For example, using simple random sampling, 48 sample collection
times might be randomly selected within a four-year sampling period. Using simple
random sampling, some years may have more samples than other years. One alternative to
simple random sampling is a stratified random sample in which 12 samples are collected in
each of four years, with the sample times within each year being randomly selected. In
either case, the time interval between the collection of the water samples will vary. Some
samples may be collected within days of each other, while at other times there may be many
months between samples.
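
The difference between the two random designs can be seen by generating collection times. The Python sketch below is illustrative only: the function names are hypothetical, times are expressed in years from the start of the sampling period, and the fixed seed is used only to make the output repeatable.

```python
import random


def simple_random_times(n_samples, n_years, rng):
    """Draw n_samples collection times uniformly over the whole period."""
    return sorted(rng.uniform(0, n_years) for _ in range(n_samples))


def stratified_random_times(per_year, n_years, rng):
    """Draw per_year random collection times within each year (stratum)."""
    times = []
    for year in range(n_years):
        times.extend(year + rng.random() for _ in range(per_year))
    return sorted(times)


def count_per_year(times, n_years):
    """Count how many collection times fall in each year."""
    counts = [0] * n_years
    for t in times:
        counts[min(int(t), n_years - 1)] += 1
    return counts


rng = random.Random(1992)  # fixed seed so the sketch is repeatable
simple = simple_random_times(48, 4, rng)
stratified = stratified_random_times(12, 4, rng)
print("simple random, samples per year:    ", count_per_year(simple, 4))
print("stratified random, samples per year:", count_per_year(stratified, 4))
```

The stratified design always yields 12 samples in each year, while the simple random design usually does not; in both designs the gaps between consecutive samples are irregular.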

Although random sampling has some advantages when calculating the
statistical results for short-term tests (Chapter 5), systematic sampling is generally
recommended for assessing attainment.


4.1.2        Systematic  Sampling

Using a systematic sample with a random start, ground-water samples are
collected at regular time intervals (such as every week, month, three months, year, etc.)
starting from the first sample collection time, which is randomly determined. In this
document, the systematic sample with a random start will be referred to simply as a
systematic sample.

When sampling ground water, a systematic sample is usually preferred over
a simple random sample because:

• Extrapolating from the sample period to future periods is easier with
  a systematic sample than a simple random sample;
• Seasonal cycles can be easily identified and accounted for in the data
  analysis;
• A systematic sample will be easier to administer because of the fixed
  schedule for sampling times; and
• Most ground-water samples have been traditionally collected using a
  systematic sample.

The procedures described in the following chapters assume that either a
systematic or random sample is used when collecting data for a short term test and that a
systematic sample is collected when assessing attainment. If other sample designs are
considered, consultation with a statistician is recommended. It should be noted that when
implementing a systematic sample, care must be taken to capture any periodic seasonal
variations in the data. The seasonal patterns in the data will repeat themselves (after adjust-
ing for measurement errors) following a regular pattern. For example, if ground water
measurements at a site exhibit seasonal fluctuations, following the four seasons of the year,
collecting data every six months may miss some important aspects of the data, such as high
or low measurements, and could present a misleading picture of the status of the site.
Because many seasonal patterns will have a yearly cycle (due to yearly patterns in surface
water recharge) the text will often refer to the number of samples per year instead of the
number of samples per seasonal cycle.
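
The risk of sampling too infrequently can be illustrated with an idealized seasonal pattern. The sketch below assumes a purely sinusoidal concentration with a 12-month period and no measurement error; the function names, the mean of 10, and the amplitude of 5 are hypothetical values chosen for illustration.

```python
import math


def seasonal_concentration(month, mean=10.0, amplitude=5.0):
    """Hypothetical concentration with a one-year (12-month) seasonal cycle."""
    return mean + amplitude * math.sin(2 * math.pi * month / 12)


def observed_range(interval_months, start_month=0, n_years=2):
    """Range (max minus min) of the values seen at the scheduled months."""
    months = range(start_month, 12 * n_years, interval_months)
    values = [seasonal_concentration(m) for m in months]
    return max(values) - min(values)


print("range seen sampling every 6 months:", round(observed_range(6), 6))
print("range seen sampling every 3 months:", round(observed_range(3), 6))
```

With this starting month, the six-month schedule lands only on the mid-level points of the cycle and sees essentially no variation, while the three-month schedule captures both the seasonal high and low. A different starting month would change what the six-month schedule observes, which is why the result of sparse sampling can be misleading.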

One variation of the standard systematic sample uses a different random
start for each year's data. For example, if one water sample is collected each month, in the
first year samples might be collected on the 17th of each month and in the second year on
the 25th of each month, etc. This variation is preferred when there are large seasonal
fluctuations in the data.

Follow the steps below to specify the systematic sample design:

(1) Determine the period of any seasonal fluctuation (i.e., time period
between repeating patterns in the data). This period will usually be a
year. If no period is discernible from the data, the use of a one-year
period is recommended.
(2) Determine the number of ground water samples, n, to collect in each
year (seasonal cycle) and the corresponding sampling period
between samples. A minimum of four sample collections per year is
recommended.
(3) Specify the beginning of the attainment sampling period.
(4) Randomly select a sampling time during the first sampling period.
(5) Subsequent sampling should be at equal intervals of the sampling
period after the first sample is collected.

In practice, the samples need not be collected precisely at the time called for
by the sampling interval. However, the difference between the scheduled sampling time
and the actual time of sampling should be small compared to the time between successive
samples. The scheduled collection times of subsequent samples should not be changed if one
sample is collected earlier or later than scheduled. An example of the procedure is presented
in Box 4.1.

Box 4.1
Example of Procedure for Specifying a Systematic Sample Design

(1) The seasonal cycle in the measurements is assumed to have a period
of one year.

(2) Based on the methods in Chapter 8, it is decided to collect 6
samples per year, one every two months.

(3) The attainment sampling period is to start on April 1, 1992.

(4) The first sampling time during the first two-month sampling period
is randomly selected using successive flips of a coin. Each flip
divides the portion of the sampling period being considered into
two. Heads chooses the earlier half, tails the later half. After 5
flips, the chosen day for the first sample is April 15.

(5) Samples are scheduled to be collected the 15th of every other month.
If one sample is collected on the 20th of a month, the subsequent
sample should still be targeted for the 15th of the appropriate month.
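
The steps in Box 4.1 can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed procedure: the function names are hypothetical, the coin flips are passed in as a string so the result is repeatable, the particular sequence "HHTTT" is an assumed outcome, and taking the midpoint of the final interval as the chosen day is a choice made for this sketch.

```python
import calendar
from datetime import date, timedelta


def start_day_from_flips(period_start, period_days, flips):
    """Pick a start date by repeatedly halving the period, as in step (4).

    flips: string of "H"/"T"; heads keeps the earlier half, tails the later.
    The midpoint of the final interval is used (an assumption of this sketch).
    """
    lo, hi = 0.0, float(period_days)
    for flip in flips:
        mid = (lo + hi) / 2
        if flip == "H":
            hi = mid
        elif flip == "T":
            lo = mid
        else:
            raise ValueError("flips must contain only 'H' or 'T'")
    return period_start + timedelta(days=int((lo + hi) / 2))


def add_months(day, months):
    """Shift a date by whole months, clamping to the end of short months."""
    index = day.year * 12 + (day.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(day.day, last_day))


def systematic_schedule(first_sample, interval_months, n_samples):
    """Target dates at equal monthly intervals after the first sample."""
    return [add_months(first_sample, i * interval_months) for i in range(n_samples)]


# Two-month period starting April 1, 1992 (April + May = 61 days).
first = start_day_from_flips(date(1992, 4, 1), period_days=61, flips="HHTTT")
for target in systematic_schedule(first, interval_months=2, n_samples=6):
    print(target.isoformat())
```

Because every target date is computed from the first sample date rather than from the previous actual collection date, a sample collected a few days early or late does not shift the rest of the schedule, which matches step (5).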
4.1.3 Fixed versus Sequential Sampling

For most statistical tests or procedures, the statistical analysis is performed
after the entire set of water samples has been collected and the laboratory results are
complete. This procedure uses a fixed sample size test because the number of samples
to be collected is established and fixed before the sample collection begins. In sequential
testing, the water samples are analyzed in the lab and the statistical analysis is performed
as the sample collection proceeds. A statistical analysis of the data collected at any point in
time is used to determine whether another sample is to be collected or if the sampling
terminates. Sequential statistical tests for data collected using sequential sampling of ground
water are discussed in detail in Chapter 9.

4.2 The Analysis Plan

As with the sampling plan, planning an approach to analysis begins before the
first physical sample is collected. The first step is to define the attainment objectives,
discussed in Chapter 3. If the mean is to be compared to cleanup standards, the statistical
methods will be different than if a specified proportion of the samples must have
concentrations below the cleanup standard. Second, the analysis plan must be developed in
conjunction with the sampling plan discussed earlier in this chapter.

Third, determine the appropriate sample size (i.e. the number of physical
samples to be collected) for the selected sample and analysis plan. Whether using a fixed
sample size or sequential design, calculate the sample size for the fixed sample size test.
Use this sample size for comparing alternate plans. In some cases, the number of samples
is determined by economics and budget rather than an evaluation of the required accuracy.
Nevertheless, it is important to evaluate the accuracy associated with a prespecified number
of samples.

Fourth, the analysis plan will describe the statistical evaluation of the data.

In many cases, specification of the sampling and analysis plan will involve
consideration of several alternatives. It may also be an iterative process as the plans are
refined. In cases where the costs of meeting the attainment objectives are not acceptable, it
may be necessary to reconsider those objectives. When trying to balance cost and preci-
sion, decreasing the precision can decrease the sampling and lab costs while increasing the
costs of additional remediation due to incorrectly concluding that the ground water does not
attain the cleanup standard. In this situation, consultation with a statistician, and possibly
an economist, is recommended.

Chapters 8 and 9 offer various statistical methods, depending on attainment
objectives and the sampling plan. Table 4.1 presents the locations in this document where
various combinations of analysis and sampling plans are discussed.

Table 4.1     Locations in this document of discussions of sample designs and analysis
              for ground water sampling

                                            Sample Design
Type of Evaluation   Analysis Method        Fixed Sample Size       Sequential
Continuous Data      Test of the Mean       Sections 8.3 and 8.4    Sections 9.3 and 9.4
Discrete Data        Test of Proportions    Section 8.5             Section 9.5

4.3          Other Considerations for Ground Water Sampling and Analysis
             Plans
At a minimum, all ground water sampling and analysis plans should
specify:

• Sampling objective;
• Sampling preliminaries;
• Sample collection;
• In-situ field analysis;
• Sample preservation and analysis;
• Chain of custody control;
• Analytical procedures and quantitation limits;
• Field and laboratory QA/QC plans;
• Analysis procedures for any QC data;
• Statistical analysis procedures; and
• Interim and final statistics to be provided to project personnel.

             For more information on other considerations  in ground water sampling and
analysis, see RCRA  Ground Water Monitoring Technical Enforcement  Guidance Document
(EPA,  1986b).

4.4          Summary

Design of the sampling and analysis plan requires specification of attainment
objectives by program and subject matter personnel. The sampling and analysis objectives
can be refined with the assistance of statistical expertise. The sample design and analysis
plans go together; therefore, the methods of analysis must be consistent with the sample
design and both must be consistent with the characteristics of the data and the attainment
objectives.

              Types  of sample design include simple random sampling or systematic
sampling, and fixed sample size or sequential sampling. This guidance assumes the data
will be collected  using a systematic sample  when assessing attainment.

Steps required to plan an approach to analysis are:

• Specify the attainment objectives;
• Develop the analysis plan in conjunction with the sampling plan;
• Determine the appropriate sample size; and
• Describe how the resulting data will be evaluated.
5. DESCRIPTIVE STATISTICS AND HYPOTHESIS TESTING
This chapter introduces the reader to some basic statistical procedures that
can be used both to describe (or characterize) a set of data and to test hypotheses and make
inferences from the data. The procedures use the mean or a selected percentile from a
sample of ground water measurements along with its associated confidence interval. The
confidence interval indicates how well the population (or actual) mean or percentile can be
estimated from the sample mean or percentile. These parameter estimates and their
confidence intervals can be useful in communicating the current status of a cleanup effort.
Methods of assessing whether the concentrations meet target levels are useful for evaluating
progress of the remediation. The statistical procedures given in this chapter are called
"parametric" procedures. These methods usually assume that the underlying distribution of
the data is known. Fortunately, the procedures perform well even when these assumptions
are not strictly true; thus they are applicable in many different field conditions (see
Conover, 1980). The text notes situations in which the statistical procedures are sensitive
to violations of these assumptions. In these cases, consultation with a statistician is
recommended.

Calculations of means, proportions, percentiles, and their corresponding
standard errors and their associated confidence intervals (measures of how precise these
estimated means, proportions, or percentiles are) will be described. The statistics and
inferential procedures presented in this chapter are appropriate only for estimating short-
term characteristics of contaminant levels. By "short-term characteristics" we mean
characteristics such as the mean or percentile of contaminant concentrations during the fixed
period of time during which sampling occurs. For example, data collected over a one year
period can be used to characterize the mean contaminant concentrations during the year.
Procedures for estimating the long-term mean and for assessing attainment are discussed in
Chapters 8 and 9. The distinction between the methods of this chapter and those given in
Chapters 8 and 9 is that inferences based on short-term methods apply only to the specified
period of sampling and not to future points of time. The procedures discussed in this
chapter can be used in any phase of the remedial effort; however, they will be most useful
during treatment, as indicated in Figure 5.1. For a further discussion of short- versus
long-term tests, see Section 2.3.2.
Figure 5.1 Example scenario for contaminant measurements during successful remedial
action

Much of the material on means, percentiles, standard errors and confidence
intervals has been previously presented in Volume I of this series of guidance documents.
To avoid duplication, the discussion of these topics in this chapter is limited to the main
points. The reader should refer to Volume I (Sections 6.3 and 7.3) for additional details.

Some Notations and Definitions

Unless stated otherwise, the symbols x_1, x_2, ..., x_i, ..., x_N will be used in
this manual to denote the contaminant concentration measurements for N ground-water
samples taken at regular intervals during a specified period of time. The subscript on the
x's indicates the time order in which the sample was drawn; e.g., x_1 is the first (or oldest)
measurement while x_N is the Nth (or latest) measurement. Collectively, the set of x's is
referred to as a data set, and, in general, x_i will be used to denote the ith measurement in the
data set.

The data set has properties which can be summarized by individual
numerical quantities such as the sample mean, standard deviation or percentile
(including the median). In general, these numerical quantities are called sample
statistics. The sample mean or median provides a measure of the central tendency of the
data or the concentration around which the measurements cluster. The sample standard
deviation provides a measure of the spread or dispersion of the data, indicating whether the
sample data are relatively close in value or somewhat spread out about the mean. The
sample variance is the square of the standard deviation. The computational formulas for
these quantities are given in subsequent sections.

The observed sample of measurements, $x_1, x_2, \ldots, x_N$, is only one of many
possible sets of samples which could have been obtained from a ground water well, so its
mean, standard deviation, or median represents just one of the many possible values that
could have been obtained. Different samples will lead to different values of the sample
mean, standard deviation, or median. This sample-to-sample variability is referred to as
sampling error or sampling variability and is used to characterize the precision of
sample-based estimates.

The precision of a sample-based estimate is measured by a quantity known
as the standard error. For example, an estimate of the standard error of the mean will
provide information on the extent to which the sample mean can be expected to vary among
different sets of samples, each set collected during the same sample collection period. The
standard error can be used to construct confidence intervals. A confidence interval
provides a range of values within which we would expect the true parameter value to lie
with a specified level of confidence. Statistical applications requiring the use of standard
errors and confidence intervals are described in detail in the sections which follow. The
standard error differs from the standard deviation in that the standard deviation measures
the variability of the individual observations about their mean while the standard error
measures the variability of the sample mean among independent samples.

Throughout the remainder of this document, certain mathematical symbols
will be used. For reference, some of the frequently-used symbols are summarized in
Table 5.1.

Finally, note that the equations that follow assume that there are no missing
observations. If there are relatively few missing observations (i.e., five percent or less of
the data set have missing data for the chemical measurement under consideration), the
ground-water samples with missing data should be deleted from the data set. In this case,
all statistics should be calculated with the available data, where the "sample size" now
corresponds to the number of samples which have non-missing concentration values.
However, if more than five percent of the data are missing, a statistician should be
consulted. Additional comments regarding the treatment of the missing values will be
given in the sections where specific statistical procedures are being discussed.



Table 5.1    Summary of notation used in Chapters 5 through 9

Symbol              Definition

$x_i$               Contaminant measurement for the ith ground water sample. For
                    measurements reported as below detection, $x_i$ = the detection limit.

                    In the discussion of regression, the dependent variable, often the
                    sample collection time, sometimes the sample collection time after a
                    transformation.

m                   The number of years for which data were collected (usually the
                    analysis will be performed with data obtained over full year periods).

n                   The number of sample measurements per year (for monthly data, n =
                    12; for quarterly data, n = 4). This is also referred to as the number of
                    "seasons" per year.

N                   The total number of sample measurements (for data obtained over full
                    year periods with no missing values, N = nm).

$x_{jk}$            An alternative way of denoting a contaminant measurement, where k =
                    1, 2, ..., m denotes the year; and j = 1, 2, ..., n denotes the sampling
                    period (season) within the year. If there are no missing values, the
                    subscript for $x_{jk}$ is related to the subscript for $x_i$ in the following
                    manner: i = (k-1)n + j.

$\bar{x}$           The mean (or average) of the N ground water measurements.

$s^2$               The variance of the N ground water measurements.

s                   The standard deviation of the N ground water measurements.

$s_{\bar{x}}$       The standard error of the mean (this is calculated differently for long
                    and short term tests).

Df                  The degrees of freedom associated with the standard error of an
                    estimate.

Cs                  The cleanup standard relevant to the ground water and the contaminant
                    being tested.

P                   The "true" but unknown proportion of the ground water with
                    contaminant concentrations greater than the cleanup standard.

$P_0$               The criterion for defining whether the sample area is clean or
                    contaminated using proportions. According to the attainment
                    objectives, the ground water attains the cleanup standard if the
                    proportion of the ground water samples with contaminant
                    concentrations greater than the cleanup standard is less than $P_0$, i.e.,
                    the ground water is clean if P < $P_0$.

$P_1$

5.1 Calculating the Mean, Variance, and Standard Deviation of the Data

The basic equations presented in Box 5.1 for calculating the mean and
variance (or standard deviation) for a sample of data can be found in any introductory
statistics text (e.g., Sokal and Rohlf, 1981 or Neter, Wasserman, and Whitmore, 1982).
Box 5.1
Calculating Sample Mean, Variance and Standard Deviation

Designate the individual data values from a sample of N observations as
$x_1, x_2, \ldots, x_N$. The sample mean (or arithmetic average) of these observations,
indicated by $\bar{x}$, is given by

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i \qquad (5.1)$$

The equation for the sample variance, $s^2$, is

$$s^2 = \frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N-1} = \frac{\sum_{i=1}^{N} x_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} x_i\right)^2}{N-1} \qquad (5.2)$$

The corresponding equation for the standard deviation of the data is

$$s = \sqrt{s^2} \qquad (5.3)$$

Both the variance and standard deviation have N-1 degrees of freedom.

The mean and standard deviation are descriptive statistics that provide
information about certain properties of the data set. The mean is a measure of the
concentration around which the individual measurements cluster (the location or central
tendency). The standard deviation (or equivalently, the variance) provides a measure of the
extent to which sample data vary about their mean.

Note that samples with missing data should be excluded from these
calculations, in which case N equals the number of samples with non-missing
observations. If more than five percent of the data have missing values, consult a
statistician.
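
The calculations in Box 5.1, together with the rule for excluding missing values, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `summarize`, the use of `None` to mark a missing measurement, and the example concentrations are hypothetical and do not come from this guidance.

```python
import math

def summarize(measurements):
    """Mean, variance, and standard deviation per equations (5.1)-(5.3).

    Missing values (None) are dropped first, so N is the number of
    non-missing observations. Marking missing data with None is an
    assumption of this sketch.
    """
    x = [v for v in measurements if v is not None]
    n_obs = len(x)
    if n_obs < 2:
        raise ValueError("need at least two non-missing observations")
    mean = sum(x) / n_obs                                      # equation (5.1)
    variance = sum((v - mean) ** 2 for v in x) / (n_obs - 1)   # equation (5.2)
    return {
        "N": n_obs,
        "mean": mean,
        "variance": variance,
        "sd": math.sqrt(variance),                             # equation (5.3)
        "df": n_obs - 1,
        "missing_fraction": 1 - n_obs / len(measurements),
    }

# Hypothetical concentrations (ppm) with one missing value.
stats = summarize([2.0, 4.0, None, 4.0, 5.0, 5.0, 7.0, 9.0, 4.0])
```

In this made-up data set one of nine values is missing (about 11 percent), which exceeds the five percent guideline, so in practice a statistician would be consulted before relying on the result.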

The term "degrees of freedom," denoted by Df, can be thought of as a
measure of the amount of information used to estimate the variance (or standard deviation)
and thus reflects the precision of the estimate. For example, the variance and standard
deviation calculated from formulas (5.2) and (5.3), respectively, are based on "N-1 degrees
of freedom." For other estimates of variance (e.g., see Section 5.2.2 or 5.2.4), the
associated degrees of freedom may be different. The degrees of freedom are used in
calculating confidence intervals and performing hypothesis tests.

5.2 Calculating the Standard Error of the Mean

The standard error of the mean (denoted by $s_{\bar{x}}$) provides a measure of the
precision of the mean concentration obtained from ground-water samples that have been
collected over a period of time. The standard error of a statistic (e.g., a mean) reflects the
degree to which that statistic will vary from one randomly selected set of samples to another
(each of the same size). Small values of $s_{\bar{x}}$ indicate that the mean is relatively precise,
whereas large values indicate that the mean is relatively imprecise.

A number of different formulas are available for calculating the standard
error of the mean. The appropriate formula to use depends on the behavior of contaminant
measurements over time and the sampling design used for sample collection. Four
methods of calculating the standard error and the conditions under which they are
applicable are discussed below. Care should be taken in each case to ensure that an
appropriate estimation formula for the standard error is chosen. Appropriate formulas
should be decided on a site-by-site basis.

General rules for the selection of the formula for calculating the standard
error of the mean include:

• If the ground water samples are collected using a random sample,
  use the formulas in Section 5.2.1 and Box 5.2.

• If the ground water samples are collected using a systematic sample:

  - Use the formulas in Section 5.2.4 and Box 5.6 unless there
    are no obvious seasonal patterns or the serial correlations in
    the data are not significant.

  - Use the formulas in Section 5.2.2 and Box 5.3 if there are
    obviously no seasonal patterns in the data but the data
    might be correlated.

  - Use the formulas in Section 5.2.3 and Box 5.4 if there are
    seasonal patterns in the data and serial correlations in the
    residuals are not significant.

  - Use the formulas in Section 5.2.1 and Box 5.2 if there are
    obviously no seasonal patterns in the data and serial
    correlations in the data are not significant.

  - If there are trends in the data, consider using regression
    methods (Chapter 6). If regression methods are not used
    and the trends are small relative to the variation of the data,
    the methods using differences (Sections 5.2.2 and 5.2.4) are
    preferred over the other methods.

Sections 5.3 and 5.6 discuss procedures for estimating the serial
correlation and statistical tests for determining if it is significant.

5.2.1 Treating the Systematic Observations as a Random Sample

The simplest method of estimating the standard error is to treat the
systematic sample as a simple random sample (see Section 4.1). In this case, the standard
error of the mean (denoted by $s_{\bar{x}}$) is given by the equations in Box 5.2. Formula (5.4) will
provide a reasonably good estimate of the standard error if the contamination is distributed
randomly with respect to time. The formula may overstate the standard error if there are
trends in contamination over time, seasonal patterns, or if the data are serially correlated.
Box 5.2
Calculating the Standard Error Treating the Sample
as a Simple Random Sample

$$s_{\bar{x}} = \frac{s}{\sqrt{N}} \qquad (5.4)$$

where s is the standard deviation of the data as computed from equation
(5.3) and N is the number of non-missing observations. Equation (5.4) is
equivalent to

$$s_{\bar{x}} = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N(N-1)}} \qquad (5.5)$$

The degrees of freedom for this estimate of the standard error is N-1.

5.2.2 Estimates From Differences Between Adjacent Observations

Another method in common use is based on overlapping pairs of
consecutive observations. That is, observation 1 is paired with observation 2, 2 with 3, 3
with 4, and so on. This method often gives a more accurate estimate of the standard error
if the serial correlation between successive observations is high. The computational
formula for this estimate of the standard error is given in Box 5.3 (e.g., see Kish, 1965,
page 119 or Wolter, 1985, page 251).

If the data are independent, that is if the samples are collected using a
random sample or if the data have no seasonal patterns or serial correlations, the standard
error calculated using equation (5.6) will be less precise than that using equation (5.4).
Since most statistics textbooks assume that the data are independent, these textbooks
present only equation (5.4) for estimating the standard error of the mean. However, when
using a systematic sample, the data are rarely independent. When the data are not
independent, equation (5.4) may overestimate the standard error of the short-term mean.
In that case, equation (5.6) is preferred because it provides a less biased estimate of
the standard error of the short-term mean. Calculation of the standard error using the
differences between adjacent observations, equation (5.6), is not appropriate for estimating
the standard error of a long-term mean. Because systematic samples and short term means
(i.e., the mean of the limited population being sampled) are often of interest in survey
sampling, equation (5.6) is more commonly used in the analysis of sample surveys.
Box 5.3
Calculating the Standard Error Using Estimates Between Adjacent
Observations

$$s_{\bar{x}} = \sqrt{\frac{\sum_{i=2}^{N}(x_i - x_{i-1})^2}{2N(N-1)}} \qquad (5.6)$$

The number of degrees of freedom for the standard error given by (5.6) is
approximately $\frac{2N}{3}$, as suggested by DuMouchel, Govindarajulu and
Rothman (1973). When using this formula, round the approximate degrees
of freedom down to the next smallest integer.

We suggest that this method of successive differences using overlapping
pairs be used to estimate the standard error of the mean unless there are obvious seasonal
patterns in the data, or seasonal patterns are expected. If there are seasonal patterns or
trends in the data, equation (5.6) will tend to overestimate the standard error. If the sample
data reflect seasonal variation, the method for computing the standard error discussed in the
next section should be employed.
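
To make the contrast between the simple-random-sample estimate and the successive-difference estimate concrete, the following Python sketch computes both for the same series. It is a minimal illustration, not part of the guidance: the function names and the example concentrations are hypothetical, and the data are assumed to be in time order with no missing values.

```python
import math

def se_random_sample(x):
    """Standard error treating the data as a simple random sample, eq. (5.4)."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((v - mean) ** 2 for v in x) / (n - 1)
    return math.sqrt(s2 / n), n - 1          # (standard error, Df)

def se_successive_differences(x):
    """Standard error from overlapping adjacent pairs, eq. (5.6).

    Degrees of freedom are approximately 2N/3, rounded down.
    """
    n = len(x)
    ssd = sum((x[i] - x[i - 1]) ** 2 for i in range(1, n))
    return math.sqrt(ssd / (2 * n * (n - 1))), (2 * n) // 3

# Hypothetical series in which adjacent measurements are similar.
x = [1.0, 1.2, 1.5, 1.9, 2.0, 2.4, 2.5, 2.9, 3.0, 3.3]
se_srs, df_srs = se_random_sample(x)
se_diff, df_diff = se_successive_differences(x)
```

For this series the successive-difference estimate is smaller than the simple-random-sample estimate, which is consistent with the remark above that equation (5.4) may overestimate the standard error when the data are not independent.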

5.2.3 Calculating the Standard Error After Correcting for Seasonal

The formulas given in the preceding sections for calculating the standard
error are not appropriate for data exhibiting seasonal variability. Seasonal variability is
generally indicated by a regular pattern that is repeated every year. For example,
Figure 5.2 shows 16 chemical observations taken at quarterly intervals. Notice that
beginning with the first observation, there is a fairly obvious seasonal pattern in the data.
That is, within each year, the first quarter observation tends to have the largest value, while
the third quarter observation tends to have the smallest value. Over the year, the general
pattern is for the concentration to start at a high value, decrease in the second quarter,
decrease again in the third quarter, and then increase in the fourth quarter.

Figure 5.2 Example of data from a monitoring well exhibiting a seasonal pattern
[Figure not reproduced; horizontal axis: Time (Quarter)]

When the data exhibit regular seasonal patterns, the seasonal means should
be calculated separately and then used to "adjust" the sample data. Specifically, let $x_{jk}$
denote the observed concentration for the ground water sample taken from the jth time point
in year k. Let n be the number of "seasons" in a seasonal cycle. Note that if data are
collected every month, then we have n = 12 and j = 1, 2, ..., 12. However, if data are
collected quarterly, then we have n = 4 and j = 1, 2, 3, 4. In general, let j = 1, 2, ..., n;
and k = 1, 2, ..., $m_j$, where $m_j$ is the number of non-missing observations that are
available for season j. Note that $m_j$ will equal m (the number of years) for all j (i.e., for all
seasons) unless some data are missing. Even if the seasonal effects are relatively small, it
is recommended that the seasonal means be subtracted from the sample data. The presence
of "significant" seasonal patterns can be formally tested by means of analysis of variance
(ANOVA) techniques. A statistician should be consulted for more information about these
tests.

The equations for the jth seasonal average, the average of the $m_j$ (non-missing)
sample observations for season j, and the sample residual after correcting for the
seasonal means are given in Box 5.4. Additional discussion of methods for adjusting for
seasonality can be found in Statistical Analysis of Ground-Water Monitoring Data at RCRA
Facilities (EPA, 1989b).
Box 5.4
Calculating Seasonal Averages and Sample Residuals

The jth seasonal average is:

$$\bar{x}_j = \frac{1}{m_j}\sum_{k=1}^{m_j} x_{jk} \qquad (5.7)$$

where $m_j$ is the number of non-missing observations available for season j.
The sample residual after correcting for the seasonal means is defined by

$$e_{jk} = x_{jk} - \bar{x}_j \qquad (5.8)$$

By subtracting the estimated seasonal means from the measurements, the
resulting values, $e_{jk}$ (or residuals), will all have an expected mean of zero and the variation
of the $e_{jk}$ about the value zero reflects the general variation of the observations. Using the
residuals calculated from formula (5.8), the standard error of the mean can be calculated
from the equations in Box 5.5 (e.g., see Neter, Wasserman, and Kutner, 1985, pages 573
and 539). The term $s_e^2$ is referred to as the mean square error and is standard output in
many statistical computer packages (e.g., see Appendix E for details on using SAS to
calculate the relevant statistics).
Box 5.5
Calculating the Standard Error After Removing Seasonal Averages

The standard error based on the residuals resulting from removing the
seasonal averages is:

$$s_{\bar{x}} = \sqrt{\frac{\sum_{j=1}^{n}\sum_{k=1}^{m_j} e_{jk}^2}{N(N-n)}} \qquad (5.9)$$

where

$$N = \sum_{j=1}^{n} m_j \qquad (5.10)$$

The degrees of freedom associated with the standard error is Df = N-n.

Note that equation (5.9) can also be written as:

$$s_{\bar{x}} = \sqrt{\frac{s_e^2}{N}} \qquad (5.11)$$

where

$$s_e^2 = \frac{\sum_{j=1}^{n}\sum_{k=1}^{m_j} e_{jk}^2}{N-n} \qquad (5.12)$$

The estimate of $s_e^2$ above is the same as the mean square error when using
one-way analysis of variance.
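
The seasonal adjustment and the resulting standard error can be sketched as follows. This Python example is illustrative only: the function names, the use of `None` for a missing value, and the quarterly data are hypothetical. It assumes the series is in time order, begins with the first season of a cycle, and that every season has at least one non-missing observation.

```python
import math
from collections import defaultdict

def seasonal_residuals(x, n_seasons):
    """Residuals e_jk = x_jk - (mean of season j), eqs. (5.7) and (5.8).

    x is in time order; None marks a missing value (an assumption of
    this sketch). Returns the time-ordered residuals for the
    non-missing observations.
    """
    by_season = defaultdict(list)
    for i, v in enumerate(x):
        if v is not None:
            by_season[i % n_seasons].append(v)
    means = {j: sum(vals) / len(vals) for j, vals in by_season.items()}
    return [v - means[i % n_seasons] for i, v in enumerate(x) if v is not None]

def se_seasonal(x, n_seasons):
    """Standard error after removing seasonal averages, eqs. (5.9)-(5.12)."""
    e = seasonal_residuals(x, n_seasons)
    n_obs = len(e)
    df = n_obs - n_seasons                  # Df = N - n
    mse = sum(r ** 2 for r in e) / df       # mean square error, eq. (5.12)
    return math.sqrt(mse / n_obs), df       # eq. (5.11)

# Hypothetical quarterly data for three years: high in Q1, low in Q3.
x = [8.0, 5.0, 2.0, 5.0, 9.0, 6.0, 3.0, 6.0, 7.0, 4.0, 1.0, 4.0]
se, df = se_seasonal(x, n_seasons=4)
```

With these values the seasonal means are 8, 5, 2, and 5, every residual is 0, 1, or -1, and the mean square error is 8/8 = 1 on 8 degrees of freedom.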
5.2.4 Calculating the Standard Error After Correcting for Serial Correlation

If the serial correlation of the seasonally adjusted residuals is significant (see
Section 5.6), the formula in Box 5.6 should be used to compute the standard
error of the mean, $s_{\bar{x}}$.
Box 5.6
Calculating the Standard Error After Removing Seasonal Averages

The standard error based on the residuals resulting from removing the
seasonal averages is:

$$s_{\bar{x}} = \sqrt{\frac{\sum_{i=2}^{N}(e_i - e_{i-1})^2}{2N(N-1)}} \qquad (5.13)$$

The degrees of freedom associated with the standard error given by formula
(5.13) is approximately Df = $\frac{2(N-n)}{3}$. When using this formula, round the
approximate degrees of freedom down to the next smallest integer. This
equation results from applying equation (5.6) to the residuals from equation
(5.8).

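
As a brief sketch of this calculation, the Python function below applies successive differences to a list of seasonally adjusted residuals. The function name and the residual values are hypothetical; the residuals are assumed to be in time order and to have been computed already by subtracting seasonal means.

```python
import math

def se_residual_differences(e, n_seasons):
    """Standard error from successive differences of residuals, eq. (5.13).

    e holds the seasonally adjusted residuals in time order.
    Degrees of freedom are approximately 2(N - n)/3, rounded down.
    """
    n_obs = len(e)
    ssd = sum((e[i] - e[i - 1]) ** 2 for i in range(1, n_obs))
    se = math.sqrt(ssd / (2 * n_obs * (n_obs - 1)))
    df = (2 * (n_obs - n_seasons)) // 3
    return se, df

# Hypothetical residuals from 12 quarterly samples (seasonal means removed).
e = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
se, df = se_residual_differences(e, n_seasons=4)
```

Here only two adjacent pairs differ (by 1 and by 2), so the sum of squared differences is 5, and the approximate degrees of freedom are 2(12 - 4)/3 = 5.33, rounded down to 5.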
5.3 Calculating Lag 1 Serial Correlation

The serial correlation (or autocorrelation) measures the correlation of observations
separated in time. Consider the situation where the ground water concentrations are
distributed around an average concentration, with no long-term trend or seasonal patterns.
The ground water measurements will fluctuate around the mean due to historic fluctuations
in the contamination events and the ground water flows and levels. Even though the
measurements fluctuate around the mean in what may appear to be a random pattern, the
measurements in ground water samples taken close in time (such as on successive days)
will typically be more similar than measurements taken far apart in time (such as a year
apart). Therefore, measurements taken close together in time are more highly correlated
than measurements taken far apart in time. The extent to which successive measurements
are correlated is measured by the serial correlation. The presence of significant serial
correlation affects the standard error of the mean.

If serial correlation is present in the data, statistical methods must be
selected which will provide correct results when applied to correlated data. Some of the
statistical procedures described in Chapters 5 through 9 require the calculation of the serial
correlation. In general, serial correlations need not be based on observations which
immediately follow one another in time sequence ("lag 1" serial correlations). Serial
correlations may be defined that are 2 time periods, 3 time periods, etc., apart. These are
referred to as "lag 2", "lag 3", or in general, "lag k" serial correlations. Serial correlations
are discussed more fully in Gilbert (1987), page 38 or Box and Jenkins (1976), page 26.
Only "lag 1" serial correlations will be considered in this document.

To calculate the serial correlation, first compute the seasonally adjusted
residuals, $e_{jk}$, using the procedure described in Section 5.2.3. Order the $e_{jk}$'s
chronologically and denote the ith time-ordered residual by $e_i$. The serial correlation
between the residuals can then be computed as shown in Box 5.7 (see Neter, Wasserman,
and Kutner, 1985, page 456).
Box 5.7
Calculating the Correlation from the Residuals After Removing
Seasonal Averages

The sample estimate of the serial correlation of the residuals is:

$$\hat{\phi}_{obs} = \frac{\sum_{i=2}^{N} e_i e_{i-1}}{\sum_{i=1}^{N} e_i^2} \qquad (5.14)$$

where $e_i$, i = 1, 2, ..., N are the residuals after removing seasonal averages,
in the time order in which the samples were collected.

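
The lag 1 calculation can be sketched in Python as below. The estimator used here, the sum of products of adjacent residuals divided by the sum of squared residuals, is a common form for residuals with mean zero; treat that specific form, the function name, and the example residuals as assumptions of this sketch rather than a definitive statement of the boxed formula.

```python
def lag1_serial_correlation(e):
    """Lag 1 serial correlation of time-ordered residuals.

    Computes sum(e_i * e_{i-1}) / sum(e_i ** 2). This form is assumed
    here; it presumes the residuals already have mean zero (seasonal
    means removed).
    """
    numerator = sum(e[i] * e[i - 1] for i in range(1, len(e)))
    denominator = sum(v ** 2 for v in e)
    if denominator == 0:
        raise ValueError("all residuals are zero; correlation is undefined")
    return numerator / denominator

# Hypothetical residuals from 12 quarterly samples, in time order.
e = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
phi_obs = lag1_serial_correlation(e)
```

For these residuals the numerator is 5 and the denominator is 8, giving an observed lag 1 correlation of 0.625 between quarterly measurements.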
The serial correlation between successive observations, computed from
formula (5.14), depends on the time interval between collection of ground-water samples.
For example, for quarterly data, $\hat{\phi}_{obs}$ represents the correlation between measurements that
are taken three months apart, while, for monthly data, $\hat{\phi}_{obs}$ represents the correlation
between measurements that are taken one month apart. Correlations between observations
taken at different intervals will generally be different. For estimating sample sizes (Section
5.10) it will be convenient to work with the monthly serial correlation, i.e., the correlation
between observations that are one month apart. If the data are not collected at monthly
intervals, the formula in Box 5.8 can be used to convert $\hat{\phi}_{obs}$ to a monthly serial correlation
$\hat{\phi}$ (see Box and Jenkins, 1970, for more details). Equation (5.15) estimates the monthly
correlation from a correlation based on observations separated by t months. For example,
for a sample correlation calculated from quarterly data, t = 3. Equation (5.15) is based on
assumptions about the factors which affect the correlations in the measurements. These
assumptions become more important as the frequency at which the observations are
collected differs from monthly (see Box and Jenkins, 1970, page 57 and Appendix D).
Box 5.8
Estimating the Serial Correlation Between Monthly Observations

The estimated serial correlation between monthly observations based on a
sample estimate of the serial correlation between observations separated by t
months is:

$$\hat{\phi} = \left(\hat{\phi}_{obs}\right)^{1/t} \qquad (5.15)$$

With data from multiple wells, the estimates of serial correlations can be
combined across wells to provide a better estimate when the following conditions are met:
• The contaminant concentration levels in the wells are similar;
• The wells are sampled at the same frequency;
• The wells are sampled for roughly the same period of time; and
• The wells are geographically close.
Under these conditions, the combined estimate of serial correlation is calculated by
averaging the estimates calculated for each well.
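
The conversion to a monthly correlation and the averaging across wells can be sketched as follows. This Python example assumes the first-order relation in which the correlation between observations t months apart equals the monthly correlation raised to the power t, so that the monthly value is the t-th root of the observed value. That relation, the function names, and the numbers are assumptions made for illustration.

```python
def monthly_serial_correlation(phi_obs, t_months):
    """Convert a correlation between observations t months apart to a
    monthly correlation, assuming phi_monthly ** t == phi_obs.

    A negative phi_obs has no real-valued root for general t, so it is
    rejected here rather than converted.
    """
    if t_months <= 0:
        raise ValueError("t_months must be positive")
    if phi_obs < 0:
        raise ValueError("conversion is not defined for negative correlations")
    return phi_obs ** (1.0 / t_months)

def combine_wells(correlations):
    """Average the per-well estimates, as described for similar wells."""
    return sum(correlations) / len(correlations)

# Hypothetical quarterly correlation (t = 3) converted to a monthly value.
phi_monthly = monthly_serial_correlation(0.512, 3)

# Hypothetical monthly estimates from three nearby, similar wells.
phi_combined = combine_wells([0.2, 0.3, 0.4])
```

A quarterly correlation of 0.512 corresponds to a monthly correlation of about 0.8 under this relation, since 0.8 cubed is 0.512.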
5.4 Statistical Inferences: What can be Concluded from Sample Data

The first two sections of this chapter dealt with the computation of several
types of measures that can be used to characterize the sample data: means, standard errors,
and serial correlation coefficients. In addition to characterizing or describing one's data
with summary statistics, it is often desirable to draw conclusions from the data, such as an
answer to the question: Is the mean concentration less than the cleanup standard?

A general approach to drawing conclusions from the data, also referred to as
making inferences from the data, uses a standard structure and process for making such
decisions referred to as "hypothesis testing" in statistical literature. It can be outlined as
follows:

1. Make an assumption about the concentrations which you would like to
disprove (e.g., the average population measure of a contaminant is greater
than the cleanup standard of 2.0 ppm). This assumption represents
your initial or null hypothesis about the current situation.
2. Collect a set of data, representing a random sample from the population of
interest.
3. Construct a statistic from the sample data. Assuming that the null
hypothesis is true, calculate the expected distribution of the statistic.
4. If the value of the statistic is consistent with the null hypothesis, conclude
that the null hypothesis provides an acceptable description of the present
situation.
5. If the value of the statistic is highly unlikely given the assumed null
hypothesis, conclude that the null hypothesis is incorrect.

Of course, sample data may occasionally provide an estimate that is
somewhat different from the true value of the population parameters being estimated. For
example, the average value of the sample data could be, by chance, much higher than that
of the full population. If the sample you happened to collect was substantially different
from the population, you might draw the wrong conclusion. Specifically, you might
conclude that the value assumed in the null hypothesis had changed when it really had not.
This false conclusion would have been arrived at simply by chance, by the luck of
randomly selecting a particular set of observations or data values. The probability of
incorrectly rejecting the null hypothesis by chance can be controlled in the hypothesis test.

If the chance of obtaining a value of a test statistic beyond a specified limit
is, say, 5% if the null hypothesis is true, then if the sample value is beyond this limit you
have substantial evidence that the null hypothesis is not true. Of course, 5% of the time
when the null hypothesis is true a test statistic value will be beyond that specified limit.
This probability of incorrectly rejecting the null hypothesis is generally denoted by the
symbol $\alpha$ (alpha) in statistical literature. The person(s) making the decision specify the risk
of making this type of error (often referred to as a Type I error in statistical literature) prior
to analyzing the data. If one wishes to be conservative, one might choose $\alpha$ = 0.01, allowing
up to a one percent chance of incorrectly rejecting the null hypothesis. With less concern
about this type of error, one might choose $\alpha$ = 0.10. A common choice is $\alpha$ = 0.05.

Many of the test procedures presented below use confidence intervals. A
confidence interval shows the range of values for the parameter of interest for which the
test statistics discussed above would not result in the rejection of the null hypothesis.
5.5 The Construction and Interpretation of Confidence Intervals about Means

A confidence interval is a range of values which will include the population
parameter, such as the population mean, with a known probability or confidence. The
confidence interval indicates how closely the mean of a sample drawn from a population
approximates the true mean of the population. Any level of confidence can be specified for
a confidence interval. For example, a 95 percent confidence interval constructed from
sample data will cover the true mean 95 percent of the time. In general, a 100(1-$\alpha$) percent
confidence interval will cover the true mean 100(1-$\alpha$) percent of the time. As indicated
above, the value of $\alpha$, the probability of a Type I error, must be decided upon and is usually
chosen to be small; e.g., 0.10, 0.05, or 0.01. The general form of a confidence interval
for the mean is shown in Box 5.9.
Box 5.9
General Construction of Two-sided Confidence Intervals

A two-sided confidence interval for a mean is generally of the form:

$$\bar{x} - t\,s_{\bar{x}} \quad \text{to} \quad \bar{x} + t\,s_{\bar{x}} \qquad (5.16)$$

In equation (5.16) the product $t\,s_{\bar{x}}$ represents the distance (in terms of
sample standard errors) on either side of the sample average that is likely to include the true
population mean. One determines t from a table of the t-distribution giving the probability
that the ratio of (a) the difference between the true mean and the sample mean to (b) the
sample standard error of the mean exceeds a certain value. To determine t, you actually
need to determine two parameters: $\alpha$, the probability of a Type I error, and Df, the number
of degrees of freedom associated with the standard error. Thus, t is usually expressed as
$t_{1-\alpha,Df}$, and the appropriate value of $t_{1-\alpha,Df}$ can be found from a table of the critical values
of the t distribution using the row and column associated with the values of 1-$\alpha$ and Df (see
Appendix A).

Given below are the formulas for one- and two-sided confidence limits for a
population mean (Boxes 5.10 and 5.11). Here, the population (or "true") mean is the
conceptual average contamination over all possible ground-water samples taken during the
specified time period. The one-sided confidence interval (establishing an acceptable limit
on the range of possible values for the population mean on only one side of the sample
mean) can be used to test whether the ground water in the well for the (short-term) period
of sampling is significantly less than the cleanup standard. The two-sided version of the
confidence interval can be used to characterize the ground-water contamination levels
during the period of sampling.
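Box 5.10
General Construction of One-sided Confidence Intervals

The upper one-sided confidence limit for the mean is given by:

$$\mu_{U\alpha} = \bar{x} + t_{1-\alpha,Df}\,s_{\bar{x}} \qquad (5.17)$$

Box 5.11
Construction of Two-sided Confidence Intervals

The corresponding two-sided confidence limits are given by:

$$\mu_{U\alpha/2} = \bar{x} + t_{1-\alpha/2,Df}\,s_{\bar{x}} \qquad (5.18)$$

and

$$\mu_{L\alpha/2} = \bar{x} - t_{1-\alpha/2,Df}\,s_{\bar{x}} \qquad (5.19)$$
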
In equations (5.17) to (5.19), 1-$\alpha$ is the confidence level associated with the
interval, $\bar{x}$ is the computed mean level of contamination; $s_{\bar{x}}$ is the corresponding standard
error computed from the appropriate formula in Section 5.2, and Df is the number of
degrees of freedom associated with $s_{\bar{x}}$. The degrees of freedom (Df) associated with the
standard error depend on the particular formula used. Table 5.2 summarizes the various
standard error formulas, their corresponding degrees of freedom, and the conditions under
which they should be used. The appropriate value of $t_{1-\alpha,Df}$ can be obtained from
Appendix Table A.1. Note that for two-sided intervals, the t-value used is $t_{1-\alpha/2,Df}$ rather
than $t_{1-\alpha,Df}$.
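
Table 5.2    Standard error formulas, degrees of freedom, and conditions for use

Formula for standard error                                                      Df         Conditions for use

(5.4)   $s_{\bar{x}} = s/\sqrt{N}$                                              N-1        Data exhibit no seasonal patterns and no
                                                                                           serial correlation (Section 5.2.1)

(5.6)   $s_{\bar{x}} = \sqrt{\sum_{i=2}^{N}(x_i - x_{i-1})^2 / (2N(N-1))}$      2N/3       Data exhibit no seasonal patterns, but may
                                                                                           be serially correlated (Section 5.2.2)

(5.9)   $s_{\bar{x}} = \sqrt{\sum_{j}\sum_{k} e_{jk}^2 / (N(N-n))}$             N-n        Data exhibit a seasonal pattern, but no
                                                                                           serial correlation (Section 5.2.3)

(5.13)  $s_{\bar{x}} = \sqrt{\sum_{i=2}^{N}(e_i - e_{i-1})^2 / (2N(N-1))}$      2(N-n)/3   Seasonally-adjusted residuals exhibit
                                                                                           serial correlation (Section 5.2.4)
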
The upper one-sided confidence limit $\mu_{U\alpha}$, defined in equation (5.17), can be
used to test whether the average contaminant level for ground-water samples collected
over a specified period of time is less than the cleanup standard, Cs (see Box 5.12).
Although the rules indicated below can be used to monitor cleanup progress, they should
not be used to assess attainment of the cleanup standard. Procedures for assessing
attainment are given in Chapters 8 and 9.
Box 5.12
Comparing the Short Term Mean to the Cleanup Standard Using

For short-term means, the decision rule to be used to decide whether or not
the ground water is less than the cleanup standard is the following:
If $\mu_{U\alpha} < Cs$, conclude that the short-term mean ground-water contaminant
concentration is less than the cleanup standard (i.e., $\mu < Cs$).

If $\mu_{U\alpha} \ge Cs$, conclude that the short-term mean ground-water contaminant
concentration exceeds the cleanup standard (i.e., $\mu \ge Cs$).

5.6 Procedures for Testing for Significant Serial Correlation

Different statistical methods may be required if the data have significant
serial correlations. The serial correlation can be estimated using the procedures in Box 5.7.
The Durbin-Watson test and the approximate large sample test in Sections 5.6.1 and 5.6.2
can be used to test if the observed serial correlation, $\hat{\phi}_{obs}$, is significantly different
from zero.

5.6.1 Durbin-Watson Test

The discussion here on determining the existence of serial correlation in the
data assumes the knowledge of confidence intervals and hypothesis testing. Sections 5.4
and 2.3.4 provide a discussion of these concepts, if the reader would like to review them.

If there is no serial correlation between observations, the expected value of
$\hat{\phi}_{obs}$ will be close to zero. However, the calculated value of $\hat{\phi}_{obs}$ is unlikely to be
zero even if the actual serial correlation is zero. The Durbin-Watson statistic can be used
to test whether the observed value of $\hat{\phi}_{obs}$ is significantly different from zero. To perform
the test (e.g., see Neter, Wasserman, and Kutner, 1985, page 450), compute the statistic D
shown in Box 5.14.
Box 5.13
Example: Calculation of Confidence Intervals

Suppose that 47 monthly ground-water samples were collected over a
period of slightly less than 4 years. The measurements for three of the
samples were below the detection limit and were replaced in the analysis by
the detection limit. Based on these data, the overall mean is .33 ppm. Since the
data did not exhibit any seasonal patterns but were thought to be serially
correlated, equation (5.6) was used to compute the standard error of the
mean; i.e., $s_{\bar{x}}$ = .1025. The degrees of freedom associated with the
standard error is 2N/3 = 2(47)/3, or approximately 31. Hence, for a two-sided
99 percent confidence interval, $\alpha$ = 0.01 and $t_{.995,31}$ = 2.75 from Appendix
Table A.1.

The required confidence interval for the mean goes from $\bar{x} - t_{1-\alpha/2,Df} s_{\bar{x}}$ to
$\bar{x} + t_{1-\alpha/2,Df} s_{\bar{x}}$, i.e., from [.33 - 2.75(.1025)] to [.33 + 2.75(.1025)], or
from .048 to .612 ppm.

For a one-sided 99 percent confidence interval, $\alpha$ = 0.01 and $t_{.99,31}$ = 2.457
from Appendix Table A.1. The corresponding one-sided confidence
interval goes from zero to

$\bar{x} + t_{1-\alpha,Df} s_{\bar{x}}$ = .33 + 2.457(.1025) = .58 ppm.

Since the cleanup standard is Cs = 0.5 ppm, it is concluded that for the
period of observation, there is insufficient evidence to conclude with
confidence that the true mean ground-water concentration is less than the
cleanup standard. This is the case even though the sample mean happens to
be less than the cleanup standard. There is enough variability in the data
that a true mean greater than 0.5 ppm cannot be ruled out.
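
The arithmetic of Box 5.13 can be checked with a short script. This is only a sketch: the function name and its arguments are illustrative and not part of this guidance, and the t-values are typed in from Appendix Table A.1 rather than computed.

```python
def mean_confidence_limits(mean, std_err, t_value):
    """Return (lower, upper) limits: mean -/+ t_value * std_err."""
    half_width = t_value * std_err
    return mean - half_width, mean + half_width

# Values quoted in Box 5.13 (47 samples, Df = 31).
mean, std_err, cs = 0.33, 0.1025, 0.5

# Two-sided 99 percent interval uses t(1 - alpha/2, Df) = 2.75.
lower, upper = mean_confidence_limits(mean, std_err, 2.75)

# One-sided 99 percent upper limit uses t(1 - alpha, Df) = 2.457.
_, upper_one_sided = mean_confidence_limits(mean, std_err, 2.457)

print(f"two-sided interval: {lower:.3f} to {upper:.3f} ppm")
print(f"one-sided upper limit: {upper_one_sided:.2f} ppm")
if upper_one_sided < cs:
    print("short-term mean is less than Cs")
else:
    print("cannot conclude that the short-term mean is less than Cs")
```

Keeping the t-value as an explicit argument makes it obvious which table entry was used, which matters because the one-sided and two-sided intervals use different entries.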
Box 5.14
Calculation of the Durbin-Watson Statistic

$D = \frac{\sum_{i=2}^{N} (e_i - e_{i-1})^2}{\sum_{i=1}^{N} e_i^2}$    (5.20)


If $D < d_U$, where $d_U$ is the upper "critical" value for the test given in
Appendix Table A-6 of the book by Neter, Wasserman, and Kutner, 1985 (pages 1086-1087),
conclude that there is a significant serial correlation. If $D \ge d_U$, conclude that there
is no serial correlation.¹ The Durbin-Watson statistic D is standard output in many
regression packages.
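
A minimal sketch of the Durbin-Watson calculation and the decision rule described above, assuming the usual definition of D (the sum of squared successive differences of the residuals divided by the sum of squared residuals). The residuals and the value used for d_U below are hypothetical; in practice d_U must be read from the published table for the actual N and p-1.

```python
def durbin_watson(residuals):
    """D = sum_{i=2..N} (e_i - e_{i-1})^2 / sum_{i=1..N} e_i^2."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def serial_correlation_is_significant(residuals, d_upper):
    """Decision rule of this section: significant if D < d_U."""
    return durbin_watson(residuals) < d_upper

# Hypothetical residuals (observations minus their mean), in time order.
residuals = [0.5, 0.4, 0.1, -0.2, -0.4, -0.3, 0.0, 0.2, 0.3, -0.6]

# Placeholder critical value for illustration only; look up the real one.
d_upper = 1.36

d = durbin_watson(residuals)
significant = serial_correlation_is_significant(residuals, d_upper)
print(f"D = {d:.3f}, significant serial correlation: {significant}")
```

Values of D near 2 suggest little serial correlation, while values well below 2 suggest positive serial correlation.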

5.6.2 An Approximate Large-Sample Test

If N > 50, the following approximate test can be used in place of the
Durbin-Watson test (e.g., see Abraham and Ledolter, 1983, page 63).
Box 5.15
Large Sample Confidence Interval for the Serial Correlation
Compute the lower and upper limits, $\phi_L$ and $\phi_U$, defined by

(5.21)
and
(5.22)
If the interval from $\phi_L$ to $\phi_U$ does not contain the value 0, conclude that the serial
correlation is significant. Otherwise, conclude that the serial correlation is not significant.

5.7 Procedures for Testing the Assumption of Normality

Many of the procedures discussed in this manual assume that the sampling
and measurement error follow a normal distribution. In particular, the assumption of
normality is critical for the method of tolerance intervals described later in Section 5.8.
Thus, it will be important to ascertain whether the assumption of normality holds. Some
methods for checking the normality assumption are discussed below.

¹ The decision rule used here is somewhat different from the usual Durbin-Watson test described in most
textbooks. For the applications given in this manual, the recommended decision rule results in deciding
that autocorrelation exists unless there is strong evidence to the contrary. Also, the particular value of $d_U$
to use depends on N and "p-1", where p is the number of parameters in the fitted model. See section
for an example.

5.7.1 Formal Tests for Normality

The statistical tests used for evaluating whether or not the data follow a
specified distribution are called "goodness-of-fit tests."² The computational procedures
necessary for performing the goodness-of-fit tests that work best with the normal
distribution are beyond the scope of this guidance document. Instead, the user of this
document should use one of the statistical packages that implements a goodness-of-fit test.
SAS (the Statistical Analysis System) is one such statistical package. A good reference for
these tests is the book on nonparametric statistics by Conover (1980), Chapter 6. There are
many different tests for evaluating normality (e.g., D'Agostino, 1970; Filliben, 1975;
Mage, 1982; and Shapiro and Wilk, 1985). If a choice is available, the Shapiro-Wilk or
the Kolmogorov-Smirnov test with the Lilliefors critical values is recommended.

5.7.2 Normal Probability Plots

A relatively simple way of checking the normality of the data or residuals
(such as those obtained from Box 5.4) is to plot the data or residuals ordered by size
against their expected values under normality. The ith expected value will be called $EV_i$.
Such a plot is referred to as a "normal probability plot."

If there are no seasonal effects, the residual, $e_i$, is simply defined to be the
difference between the observed value and the sample mean, i.e.,

$e_i = x_i - \bar{x}$.    (5.23)

If seasonal variability is present, the residuals should be calculated from formula (5.8). In
either case, the ith ordered residual, $e_{(i)}$, for i = 1, 2, ..., N, is defined to be the ith smallest
value of the $e_i$'s (that is, $e_{(1)} \le e_{(2)} \le ... \le e_{(i)} \le ... \le e_{(N)}$), and its expected value
is given approximately by (SAS 1985):

(5.24)

where $s_e$ is the standard deviation of the residuals and z(.) is given by formula
(5.25) below. If formula (5.23) applies (i.e., no seasonal effects are in evidence) and is
used to compute the residuals, then $s_e = s$, where s is given by formula (5.3). If formula
(5.8) applies (requiring an adjustment for seasonal effects) and is used to compute the
residuals, then $s_e$ is given by formula (5.12). The function $z(\alpha)$ is defined
to be the upper $100\alpha$ percentage point of the standard normal distribution and is
approximated by (Joiner and Rosenblatt 1975):

(5.25)

Under normality, the plot of the ordered residuals, $e_{(i)}$, against $EV_i$ should
fall approximately along a straight line. An example of the use of normal probability plots
is given in Section 6.X. For more rigorous statistical procedures for testing normality, use
the "goodness-of-fit" tests mentioned in Figure 6.17.

² These should not be confused with tests for assessing the fit of a regression model which are discussed
later in Chapter 5.

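
A minimal sketch of the quantities plotted in a normal probability plot when there are no seasonal effects. Two assumptions are made here that are not taken from this document: the plotting position (i - 3/8)/(N + 1/4), which is one commonly used choice, and the exact inverse normal distribution function from the Python standard library, used in place of the approximation referred to as formula (5.25). The data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def normal_plot_points(data):
    """Return (expected value, ordered residual) pairs for a normal probability plot."""
    n = len(data)
    x_bar = mean(data)
    # Ordered residuals e_(1) <= ... <= e_(N), with e_i = x_i - x_bar.
    residuals = sorted(x - x_bar for x in data)
    # Standard deviation of the residuals (usual N-1 sample formula, assumed).
    s_e = stdev(data)
    std_normal = NormalDist()
    points = []
    for i, e in enumerate(residuals, start=1):
        position = (i - 0.375) / (n + 0.25)  # assumed plotting position
        points.append((s_e * std_normal.inv_cdf(position), e))
    return points

# Hypothetical concentration measurements.
data = [3.1, 2.4, 4.0, 3.6, 2.9, 3.3, 5.2, 2.7]
points = normal_plot_points(data)
for expected, observed in points:
    print(f"{expected:8.3f} {observed:8.3f}")
```

If the printed pairs fall roughly along a straight line when plotted, the normality assumption is plausible; marked curvature suggests trying a transformation.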
5.8 Procedures for Testing Percentiles Using Tolerance Intervals

This section describes a statistical technique for estimating and evaluating
percentiles of a concentration distribution. The technique is based on tolerance intervals
and is not recommended if there are seasonal or other systematic patterns in the data.
Moreover, this procedure is relatively sensitive to the assumption that the data (or
transformed data) follow a normal distribution. If it is suspected that a normal distribution
does not adequately approximate the distribution of the data (even after transformation),
tolerance intervals should not be used. Instead, the procedure described later in Section 5.9
should be used.
5.8.1 Calculating a Tolerance Interval

The Qth percentile of a distribution of concentration measurements is that
concentration value, say $x_Q$, for which Q percent of the concentration measurements are
less than $x_Q$ and (100-Q) percent of the measurements are greater than $x_Q$. For example,
if the value 3.2 represents the 25th percentile for a given population of data, 25% of the data
fall below the value 3.2 and 75% are above it. Since the data represent a sample (rather
than the population) of concentration values, it is not possible to determine the exact value
of $x_Q$ from the sample data. However, with normally distributed data, a $100(1-\alpha)$ percent
confidence interval around the desired percentile can be easily computed.

Let $x_1, x_2, ..., x_N$ denote N concentration measurements collected during a
specified period of time. As explained in Section 2.3.7, values that are recorded as below
the detection limit should be assigned the minimum detectable value (DL). The sample
mean, $\bar{x}$, and the sample standard deviation, s, should initially be computed using the basic
formulas given in Section 5.1.

Given Q and $\alpha$, the upper one-sided $100(1-\alpha)$ percent confidence limit for the true
percentile, $x_Q$, is given by:

$\hat{x}_Q = \bar{x} + ks$    (5.26)

where k is a constant that depends on N, $\alpha$, and $P_0 = (100-Q)/100$. The appropriate values
of k can be obtained from Appendix Table A.3. For values not shown in the table, see
Guttman (1970).
5.8.2 Inference: Deciding if the True Percentile is Less than the
Cleanup Standard

The upper confidence limit as computed from equation (5.26) can be
used to test whether the true (unknown) Qth percentile, $x_Q$, for a specified sampling period
is less than a value, Cs. The decision rule to be used to test whether the true percentile is
below Cs is:

If $\hat{x}_Q < Cs$, conclude that the Qth percentile of ground-water contaminant
concentrations is less than Cs (i.e., $x_Q < Cs$).
If $\hat{x}_Q \ge Cs$, conclude that the Qth percentile of ground-water contaminant
concentrations is not less than Cs and may be much greater than Cs.
Box 5.16
Tolerance Intervals: Testing for the 95th Percentile with Lognormal Data

Data for 20 ground-water samples were obtained to determine if the 95th
percentile of the contaminant concentrations observed for a two-year period
was below the cleanup standard of 100 ppm. A false positive rate of one
percent ($\alpha$ = 0.01) was specified for the test. The data appeared to follow a
lognormal distribution. Therefore, the logarithms of the data (the
transformed data) were assumed to have a normal distribution and were
analyzed. In the following discussion, x refers to the original data and y
refers to the transformed data. Because the log of the data was used, the
upper confidence limit on the 95th percentile of the data was compared to
the log of the cleanup standard [ln(100) = 4.605].

For the transformed data, the sample mean (the average of the logarithms)
is:

$\bar{y} = \frac{72.372}{20} = 3.619$

The standard deviation of the transformed observations, s, as calculated
from equation (5.3) is 0.715.

For N = 20, $\alpha$ = .01, and $P_0$ = 5%, k = 2.808 (from Appendix Table A.3).
Finally, $\hat{y}_{95}$ can be calculated using equation (5.26):

$\hat{y}_{95} = 3.619 + 2.808(.715) = 5.627$

Since 5.627 is greater than 4.605 (the cleanup standard in log units), it is
concluded that the true 95th percentile may be greater than Cs.
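
The calculation in Box 5.16 can be sketched as follows. The function names are illustrative, the value of k is typed in from Appendix Table A.3 rather than computed, and the helper for raw data assumes that the standard deviation is the usual N-1 sample formula.

```python
import math
from statistics import mean, stdev

def upper_percentile_limit(y_bar, s, k):
    """Upper confidence limit on a percentile: mean + k * s."""
    return y_bar + k * s

def log_scale_summary(concentrations):
    """Mean and standard deviation of the natural logs of raw concentrations."""
    logs = [math.log(x) for x in concentrations]
    return mean(logs), stdev(logs)

# Summary statistics quoted in Box 5.16 (log scale), with k for N = 20.
y_95 = upper_percentile_limit(3.619, 0.715, 2.808)
log_cs = math.log(100.0)  # cleanup standard of 100 ppm in log units

below_standard = y_95 < log_cs
print(f"upper limit (log units) = {y_95:.3f}, ln(Cs) = {log_cs:.3f}")
print("95th percentile shown to be below Cs" if below_standard
      else "95th percentile may be greater than Cs")
```

Because the comparison is made in log units, the same decision is reached by comparing exp(y_95) with Cs on the original scale.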
5.9 Procedures for Testing Proportions

An alternative statistical procedure for testing percentiles is based on the
proportion of water samples that have contaminant levels exceeding a specified value. As
was the case for the method of Section 5.8, this method is not recommended if there are
seasonal patterns in the data. If seasonal variability is present, consult a statistician. The
equations presented in this section apply if the acceptable proportion of contaminated
samples is less than 0.5 and large sample sizes are used.

To apply this test, each sample ground-water measurement should be coded
as either equal to or above the cleanup standard, Cs (coded as "1"), or below Cs (coded as
"0"). The statistical analysis is based on the resulting coded data set of 0's and 1's. This
test can be applied to any concentration distribution (unlike the method of tolerance
intervals, which applies only to normally distributed data) and requires only that the cleanup
standard be greater than the detection limit.

Let $x_1, x_2, ..., x_N$ denote N concentration measurements collected during a
specified period of time. Corresponding to each measurement $x_i$, define a coded value
$y_i = 1$ if $x_i$ is greater than or equal to the cleanup standard and $y_i = 0$ otherwise. The
proportion of samples, p, at or above the cleanup standard can be calculated using the
following equations:

$r = \sum_{i=1}^{N} y_i$    (5.27)

$p = \frac{r}{N}$    (5.28)

Assuming that the observations are independent, the standard error of the
proportion, $s_p$, is given by:

$s_p = \sqrt{\frac{p(1-p)}{N}}$    (5.29)

Formula (5.29) will tend to overestimate the variance if the data have a significant serial
correlation. If the data have significant serial correlations, we can use formula (5.6) with
the x's replaced by the y's. Note that formulas (5.29) and (5.6) should only be used if N
is large; i.e., if $N \ge 10/p$ and $N \ge 10/(1-p)$.

5.9.1 Calculating Confidence Intervals for Proportions
For sufficiently large sample sizes (i.e., $N \ge 10/p$ and $N \ge 10/(1-p)$; that is, at
least 10 samples with measurements above the cleanup standard and 10 with measurements
below the cleanup standard), an approximate confidence interval may be constructed using
the normal approximation. If there is concern about the sample size N being too small
relative to p, a statistician should be consulted.

For large sample sizes, the one-sided $100(1-\alpha)$ percent upper confidence
limit is given by:

$P_{U\alpha} = p + z_{1-\alpha} s_p$    (5.30)

where p is the proportion of ground-water samples that have concentrations exceeding Cs,
and $z_{1-\alpha}$ is the appropriate critical value obtained from the normal distribution (see
Appendix Table A.2).

The corresponding two-sided $100(1-\alpha)$ percent confidence limits are given
by:

$P_{U\alpha/2} = p + z_{1-\alpha/2} s_p$    (5.31)
and
$P_{L\alpha/2} = p - z_{1-\alpha/2} s_p$    (5.32)

where $z_{1-\alpha/2}$ is the appropriate critical value obtained from the normal distribution (see
Table A.2). The range of values from $P_{L\alpha/2}$ to $P_{U\alpha/2}$ represents a $100(1-\alpha)$ percent
confidence interval for the corresponding population proportion.
5.9.2 Inference: Deciding Whether the Observed Proportion Meets
the Cleanup Standard

The upper confidence limit as computed from equation (5.30) can be used to
test whether the true (unknown) proportion, P, is less than a specified standard, $P_0$. The
decision rule to be used to test whether the true proportion is below $P_0$ is:

If $P_{U\alpha} < P_0$, conclude that the proportion of ground-water samples with
contaminant concentrations exceeding Cs is less than $P_0$.
If $P_{U\alpha} \ge P_0$, conclude that the proportion of ground-water samples with
contaminant concentrations exceeding Cs may be greater than or
equal to $P_0$.
Box 5.17
Calculation of Confidence Intervals
For 184 ground-water samples collected during an 8-year period, 11
samples had concentrations greater than or equal to the cleanup standard.
The proportion of contaminated samples is (equation 5.27):

$p = \frac{11}{184} = .0598$

A one-sided confidence interval has an upper limit of (from equation 5.30):

$P_{U\alpha} = p + z_{1-\alpha} s_p$

Assuming $\alpha$ = 0.05 (i.e., 95 percent confidence), $z_{1-\alpha}$ = 1.645. The
standard error of p determined from formula (5.29) is $s_p$ = 0.0175.
The confidence interval is thus .0000 to .0598 + .0288 or .0000 to .0886.
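
A short sketch of the Box 5.17 calculation, assuming the standard error of a proportion under independence, sqrt(p(1-p)/N). The function name is illustrative, and the critical value 1.645 is typed in from Appendix Table A.2. Carrying full precision gives an upper limit of about .0885, which differs from the .0886 in the Box only because the Box rounds intermediate values.

```python
import math

def proportion_upper_limit(exceedances, n, z):
    """Return (p, s_p, upper limit) for the one-sided normal-approximation limit."""
    p = exceedances / n
    s_p = math.sqrt(p * (1 - p) / n)
    return p, s_p, p + z * s_p

n_samples, n_exceed = 184, 11
p, s_p, upper = proportion_upper_limit(n_exceed, n_samples, 1.645)

# Large-sample condition: N >= 10/p and N >= 10/(1-p).
large_enough = n_samples >= 10 / p and n_samples >= 10 / (1 - p)

print(f"p = {p:.4f}, s_p = {s_p:.4f}, upper limit = {upper:.4f}")
print(f"large-sample condition met: {large_enough}")
```

The large-sample check is worth running first, since the normal approximation is not recommended when it fails.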
5.9.3 Nonparametric Confidence Intervals Around a Median

An alternate approach to testing proportions is to test percentiles. For
example, the following two approaches are equivalent: (a) testing to see if less than 50% of
the samples have contamination greater than the cleanup standard and (b) testing to see if
the median concentration is less than the cleanup standard. The method presented in this
section for testing the median can be extended to testing other percentiles; however, the
calculations can be cumbersome. If you wish to test percentiles rather than proportions, or
to test the median using confidence intervals other than those presented here, consultation
with a statistician is recommended.

If the data do not adequately follow the normal distribution even after
transformation, a nonparametric confidence interval around the median can be constructed.
The median concentration equals the mean if the distribution is symmetric (see Section
2.5). The nonparametric confidence interval for the median is generally wider and requires
more data than the corresponding confidence interval for the mean based on the normal
distribution. Therefore, the normal or log-normal distribution interval should be used
whenever it is appropriate.

The nonparametric confidence interval for the median requires a minimum
of seven (7) observations in order to construct a 98 percent two-sided confidence interval,
or a 99 percent one-sided confidence interval. Consequently, it is applicable only for the
pooled concentration of compliance wells at a single point in time or for sampling to
produce a minimum of seven observations at a single well during the sampling period.

The procedures below for construction of a nonparametric confidence
interval for the median concentration follow U.S. EPA (1989b). An example is presented
in Box 5.19.

(1) Within each well or group of wells, order the N data from least to
greatest, denoting the ordered data by $x_1, x_2, ..., x_N$, where $x_i$ is the
ith value in the ordered data. Ties do not affect the procedure. If
there are ties, order the observations as before, including all of the
tied values as separate observations. That is, each of the
observations with a common value is included in the ordered list
(e.g., 1, 2, 2, 2, 3, 4, etc.). For ties, use the average of the tied
ranks.

(2) Determine the critical values of the order statistics as follows. If the
minimum seven observations is used, the critical values are 1 and 7.
Otherwise, find the smallest integer, M, such that the cumulative
binomial distribution with parameters N (the sample size) and p =
0.5 is at least 0.99. Table 5.3 gives the values of M and N+1-M
together with the exact confidence coefficient for sample sizes from
4 to 11. For larger samples, use the equation in Box 5.18.

(3) Once M has been determined, find N+1-M and take as the
confidence limits the order statistics $x_M$ and $x_{N+1-M}$. (With the
minimum seven observations, use $x_1$ and $x_7$.)

(4) Inference: Deciding whether the site meets the cleanup standards.

After calculating the upper one-sided nonparametric confidence limit
$x_M$ from (3), use the following rule to decide whether the ground
water attains the cleanup standard:

If $x_M < Cs$, conclude the median ground-water concentration in the
wells during the sampling period is less than the cleanup standard.

If $x_M \ge Cs$, conclude the median ground-water concentration in the
wells during the sampling period is not less than the cleanup
standard.

Table 5.3   Values of M and N+1-M and confidence coefficients for small samples

                           Two-sided
  N      M     N+1-M      confidence
  4      4       1          87.5%
  5      5       1          93.8%
  6      6       1          96.9%
  7      7       1          98.4%
  8      8       1          99.2%
  9      9       1          99.6%
 10      9       2          97.9%
 11     10       2          98.8%

Box 5.18
Calculation of M

$M = \frac{N}{2} + 1 + z_{0.99} \sqrt{\frac{N}{4}}$    (5.33)

where $z_{0.99}$ is the 99th percentile from the normal distribution and equals
2.33. (From Table A.2 in Appendix A)
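
The two-sided confidence coefficients listed in Table 5.3 can be reproduced from the binomial distribution with p = 0.5. The sketch below is illustrative only (the function name is not from this guidance); it uses the fact that the median falls outside the interval from the (N+1-M)th to the Mth ordered observation only when N-M or fewer observations fall on one side of it.

```python
from math import comb

def two_sided_confidence(n, m):
    """Exact confidence that the median lies between x_(N+1-M) and x_(M)."""
    # One-tail probability: P(X <= N - M) for X ~ Binomial(N, 0.5).
    tail = sum(comb(n, j) for j in range(0, n - m + 1)) / 2 ** n
    return 1 - 2 * tail

# (N, M) pairs as listed in Table 5.3.
table_5_3 = [(4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9), (10, 9), (11, 10)]
for n, m in table_5_3:
    print(f"N = {n:2d}  M = {m:2d}  N+1-M = {n + 1 - m}  "
          f"confidence = {100 * two_sided_confidence(n, m):.1f}%")
```

The same function can be used to check the confidence actually achieved when M is obtained from the large-sample approximation.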

Table 5.4   Example contamination data used in Box 5.19 to generate nonparametric
            confidence interval

                     Well 1                      Well 2
Sampling      Concentration               Concentration
Date              (ppm)        Rank           (ppm)        Rank
Jan. 1             3.17         (2)            3.52         (6)
                   2.32         (1)           12.32        (15)
                   7.37        (11)            2.28         (4)
                   4.44         (6)            5.30         (7)
April 1            9.50        (13)            8.12        (11)
                  21.36        (16)            3.36         (5)
                   5.15         (7)           11.02        (14)
                  15.70        (15)           35.05        (16)
July 1             5.58         (8)            2.20         (3)
                   3.39         (3)            0.00       (1.5)
                   8.44        (12)            9.30        (12)
                  10.25        (14)           10.30        (13)
Oct. 1             3.65         (4)            5.93         (8)
                   6.15         (9)            6.39         (9)
                   6.94        (10)            0.00       (1.5)
                   3.74         (5)            6.53        (10)

5.10 Determining Sample Size for Short-Term Analysis and Other
Data Collection Issues
The discussion in Chapter 4 assumes that the number of ground-water
samples to be analyzed has been previously specified. In general, determination of the
number of samples to be collected for analysis must be done before collection of the
samples. The appropriate sample size for a particular application will depend upon the
desired level of precision, as well as on assumptions about the underlying distribution of
the measurements. Given below are some guidelines for determining sample size for
estimating means, percentiles, and proportions for short-term analyses. When assessing
whether remediation has indeed been successful, use the procedures discussed in Chapters
8 and 9 to determine the required sample size. Some discussion of various data collection
issues is also offered here.
Box 5.19
Example of Constructing Nonparametric Confidence Intervals

Table 5.4 contains concentrations of a contaminant in parts per million from
two hypothetical wells. The data are assumed to consist of 4 samples taken
each quarter for a year, so that 16 observations are available from each
well. The data are not normally distributed, neither as raw data nor when
log-transformed. Thus, the nonparametric confidence interval is used. The
Cs is 25 ppm.

(1) The 16 measurements are ordered from the least to greatest within
each well separately. The numbers in parentheses beside each
concentration in Table 5.4 are the ranks or order of the observation.
For example, in Well 1, the smallest observation is 2.32, which has
rank 1. The second smallest is 3.17, which has rank 2, and so
forth, with the largest observation of 21.36 having rank 16.

(2) The sample size is large enough so that the approximation (equation
5.33) is used to find M:

$M = \frac{16}{2} + 1 + 2.33 \sqrt{\frac{16}{4}} = 13.7 \approx 14$

(3) The approximate 95 percent confidence limits are given by the
(N + 1 - M)th ordered observation (16 + 1 - 14 = 3rd) and the Mth ordered
observation (14th). For Well 1, the 3rd observation is 3.39 and the Mth
observation is 10.25. Thus the confidence limits for Well 1 are
(3.39, 10.25). Similarly for Well 2, the 3rd observation and the
14th observation are found to give the confidence interval (2.20,
11.02). Note that for Well 2 there were two values below the
detection limit. These were assigned a value equal to the detection limit
and received the two smallest ranks. Had there been three or more
values below the detection limit, the lower limit of the confidence interval
would have been the limit of detection because these values would
have been the smallest values and so would have included the third
order statistic.

(4) The upper limit of neither confidence interval exceeds the
cleanup standard of 25 ppm. Therefore, the short-term median
ground-water concentrations are less than the cleanup standard.
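
The steps of Box 5.19 can be sketched as follows, using the concentrations of Table 5.4. The function names are illustrative. The approximation for M is the one applied in step (2) of the Box; rounding M up to the next integer is an assumption made here (the Box shows only that 13.7 becomes 14).

```python
import math

def approximate_m(n, z=2.33):
    """Large-sample approximation: M = N/2 + 1 + z * sqrt(N/4)."""
    return n / 2 + 1 + z * math.sqrt(n / 4)

def median_confidence_limits(values, m):
    """Return the (N+1-M)th and Mth ordered observations."""
    ordered = sorted(values)
    n = len(ordered)
    # 1-based order statistics N+1-M and M map to 0-based indexes N-M and M-1.
    return ordered[n - m], ordered[m - 1]

well_1 = [3.17, 2.32, 7.37, 4.44, 9.50, 21.36, 5.15, 15.70,
          5.58, 3.39, 8.44, 10.25, 3.65, 6.15, 6.94, 3.74]
well_2 = [3.52, 12.32, 2.28, 5.30, 8.12, 3.36, 11.02, 35.05,
          2.20, 0.00, 9.30, 10.30, 5.93, 6.39, 0.00, 6.53]
cs = 25.0  # cleanup standard, ppm

m = math.ceil(approximate_m(len(well_1)))  # rounding up is an assumption
limits_1 = median_confidence_limits(well_1, m)
limits_2 = median_confidence_limits(well_2, m)

for name, (low, high) in (("Well 1", limits_1), ("Well 2", limits_2)):
    verdict = "less than Cs" if high < cs else "not less than Cs"
    print(f"{name}: M = {m}, limits = ({low}, {high}), median {verdict}")
```

Sorting the raw values makes explicit ranks unnecessary for the limits themselves; tied values simply occupy adjacent positions in the ordered list.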
5.10.1 Sample Sizes for Estimating a Mean

In order to determine the sample size for estimating a mean, some
information about the standard deviation, $\sigma$ (or equivalently, the variance $\sigma^2$), of the
measurements of each contaminant is required. This parameter represents the underlying
variability of the conceptual population of contaminant measurements. The symbol "^" is
used to denote that $\hat{\sigma}$ is an estimate of $\sigma$. In practice, $\hat{\sigma}$ is obtained either from prior
data or by conducting a small preliminary investigation. Cochran (1977), pages 78-81,
discusses various approaches to determining a preliminary value for $\hat{\sigma}$. Some procedures
that are useful in ground-water studies are outlined below.
Use of Data from a Comparable Period

The value $\hat{\sigma}$ may be calculated from existing data that are comparable to the
data expected from the sampling effort. Comparable data will have a similar level of
contamination and be collected under similar conditions. For calculating the sample size
required for assessing attainment, one may be able to use data on contamination levels for
the wells under investigation from ground-water samples collected during the period in
which steady state is being established. Using the comparable data, the value $\hat{\sigma}$ may be
calculated using formula (5.3).

Use of Data Collected Prior to Remedial Action

If data from samples collected prior to remediation are available, the
variability of these sample measurements can be used to obtain a rough estimate of $\sigma$ using
the coefficient of variation. The coefficient of variation is defined to be the standard
deviation divided by the mean. Remediation will usually result in a lowering of both the
mean and the standard deviation of contamination levels. If so, it might be
reasonable to expect the coefficient of variation to remain approximately constant, and
estimates of the coefficient of variation from the available data can be used to obtain $\hat{\sigma}$
as follows.

Let $\bar{x}$ and s represent the sample mean and sample standard
deviation for data collected prior to remedial action, perhaps from a previous study.
Calculate $\bar{x}$ and s using the equations in Section 5.1. An estimate $\hat{\sigma}$ of the standard
deviation when cleanup standards are attained can be computed using the cleanup standard,
Cs, where

$\hat{\sigma} = \frac{Cs \cdot s}{\bar{x}}$    (5.35)

Conducting a Preliminary Study After Remedial Action

The following approach can be used if there are no existing data on
contamination levels from which to estimate $\sigma$ and if there is time to collect preliminary data
before sampling begins.

(1) After achieving steady state conditions (see Chapter 7), collect a
preliminary sample of at least $n_1$ = 8 ground-water samples over a
minimum period of 2 years. Determine the contamination levels for
these samples. The larger the sample size and the longer the period
of time over which the samples are collected, the more reliable the
estimate of $\sigma$. A minimum of four samples per year is recommended
so that seasonal variation will be reflected in the estimate.

(2) From this preliminary sample, compute the estimated standard
deviation, s, of the contaminant levels. Use this standard deviation
as an estimate of $\sigma$.
Box 5.20
Estimating $\sigma$ from Data Collected Prior to Remedial Action

Suppose that the number of ground-water samples to be taken from a
monitoring well prior to remedial action was limited to 10. The
concentrations of total PAHs from the samples are:

0.24, 2.93, 3.09, 0.14, 0.60, 4.20, 3.81, 2.31, 1.11, and 0.07

Using equations (5.1) and (5.3), the mean concentration is $\bar{x}$ = 1.85 ppm
and the standard deviation of the measurements is s = 1.60 ppm.

With a cleanup standard of .5 ppm, the value of $\hat{\sigma}$ to use for determining
sample size can be obtained from:

$\hat{\sigma} = \frac{Cs \cdot s}{\bar{x}} = \frac{.5 \times 1.60}{1.85} = .43$
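
The Box 5.20 calculation can be reproduced with the Python standard library. The function name is illustrative, and the sketch assumes that the standard deviation referred to in the Box is the usual N-1 sample formula, which reproduces the quoted value of 1.60.

```python
from statistics import mean, stdev

def sigma_hat_from_cv(pre_remediation, cleanup_standard):
    """Scale the pre-remediation coefficient of variation to the cleanup standard."""
    cv = stdev(pre_remediation) / mean(pre_remediation)
    return cleanup_standard * cv

# Total PAH concentrations (ppm) listed in Box 5.20.
pah = [0.24, 2.93, 3.09, 0.14, 0.60, 4.20, 3.81, 2.31, 1.11, 0.07]

x_bar = mean(pah)
s = stdev(pah)
sigma_hat = sigma_hat_from_cv(pah, cleanup_standard=0.5)

print(f"mean = {x_bar:.2f} ppm, s = {s:.2f} ppm, sigma-hat = {sigma_hat:.2f} ppm")
```

The estimate is only as good as the assumption that the coefficient of variation stays roughly constant after remediation.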

A Rough Approximation of the Standard Deviation

If there are no existing data to estimate $\sigma$ and a preliminary study is not
feasible, a very rough approximation for $\hat{\sigma}$ can still be obtained. The approximation is
rough because it is based on speculation and judgments concerning the range within which
the ground-water measurements are likely to fall. Because the approximation is based on
very little data, it is possible that the sample sizes computed from these approximations will
be too small to achieve the specified level of precision. Consequently, this method should
only be used if no other alternative is available.

The approximation is based on the fact that the range of possible ground-water
measurements (i.e., the largest such value minus the smallest such value) provides a
measure of the underlying variability of the data. Moreover, if the frequency distribution of
the ground-water measurements of interest is approximately bell-shaped, then virtually all
of the measurements can be expected to lie within three standard deviations of the mean. In
this case, if R represents the expected range of the data, an estimate of $\sigma$ is given by

$\hat{\sigma} = \frac{R}{6}$.    (5.36)

If the data are not bell-shaped, the alternative (conservative) estimate $\hat{\sigma} = R/5$ should be
used.

Formula for Determining Sample Size for Estimating a Mean

The equations for determining sample size require the specification of the
following quantities: Cs, $\mu_1$, $\alpha$, $\beta$, and $\hat{\sigma}$. Given these quantities, the required sample
size can be computed from the following formula (e.g., see Neter, Wasserman, and Whitmore,
1982, page 264 and Appendix F):

$n = \hat{\sigma}^2 \left( \frac{z_{1-\alpha} + z_{1-\beta}}{Cs - \mu_1} \right)^2 + 2$    (5.37)

where $z_{1-\alpha}$ and $z_{1-\beta}$ are the critical values for the normal distribution with probabilities of
$1-\alpha$ and $1-\beta$ (Table A.2) and the factor of 2 is empirically derived in Appendix F.

Strictly speaking, formula (5.37) applies to simple random sampling.
However, the standard error of a mean based on a systematic sample will usually be less
than or equal to the standard error of a mean based on a simple random sample of the same
size. Therefore, using the sample size formula given above may provide greater precision
than is required.
Box 5.21
Example of Sample Size Calculations
Following the example in Box 5.20, suppose that it is desired to be
able to detect a difference of .2 ppm from the cleanup standard of .5 ppm
(Cs = .5, $\mu_1$ = .3) with a power of .80 (i.e., $\beta$ = .20). Also suppose that $\hat{\sigma}$
= .43 and $\alpha$ = .01.
From tables of the cumulative normal distribution (Appendix Table
A.2), we find that $z_{1-\alpha}$ = 2.326 and $z_{1-\beta}$ = 0.842. Then using
formula (5.37),

$n = \frac{(.43)^2 (2.326 + .842)^2}{(.5 - .3)^2} + 2 = 48.4$

Rounding up, the sample size is 49.
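
A minimal sketch of the sample size calculation for a mean, in the form in which it is applied in Box 5.21 (including the additive 2). The function name is illustrative, and the critical values are passed in explicitly as read from Appendix Table A.2. The result is sensitive to the critical value used: with $z_{1-\alpha}$ = 2.326 the formula evaluates to about 48.4, whereas a value of 2.236 would give about 45.8.

```python
import math

def sample_size_for_mean(sigma_hat, cs, mu_1, z_alpha, z_beta):
    """n = sigma_hat^2 * ((z_{1-alpha} + z_{1-beta}) / (Cs - mu_1))^2 + 2."""
    return sigma_hat ** 2 * ((z_alpha + z_beta) / (cs - mu_1)) ** 2 + 2

# Inputs of Box 5.21: sigma-hat = .43, Cs = .5, mu_1 = .3, alpha = .01, beta = .20.
n_exact = sample_size_for_mean(0.43, 0.5, 0.3, z_alpha=2.326, z_beta=0.842)
n_required = math.ceil(n_exact)  # always round up to a whole sample

print(f"n = {n_exact:.1f}, rounded up to {n_required}")
```

Rounding up rather than to the nearest integer ensures that the specified false positive and false negative rates are not exceeded.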
5.10.2 Sample Sizes for Estimating a Percentile Using Tolerance
Intervals
To determine the required sample size for tests based on the procedure
described in Section 5.8, the following terms need to be defined: $P_0$, $P_1$, $\alpha$, and $\beta$ (e.g.,
see Volume 1, Section 7.6). Once these terms have been established, the following quantities
should be obtained from Appendix Table A.2:

$z_{1-\beta}$, the upper $\beta$-percentage point of a normal distribution;
$z_{1-\alpha}$, the upper $\alpha$-percentage point of a normal distribution;
$z_{1-P_0}$, the upper $P_0$-percentage point of a normal distribution; and
$z_{1-P_1}$, the upper $P_1$-percentage point of a normal distribution.

The sample size necessary to meet the stated objectives is then (see
Guttman, 1970):

$n = \left( \frac{z_{1-\beta} + z_{1-\alpha}}{z_{1-P_0} - z_{1-P_1}} \right)^2$    (5.38)
Box 5.22
Calculating Sample Size for Tolerance Intervals

PCBs have contaminated the ground water near a former industrial
site. The site managers have decided to use the procedures of Section 5.8 to
help decide if the treatment can be terminated. Specifically, after discussion
with ground-water experts, they decide to conclude that the treatment can be
terminated if the 99th percentile of the PCB concentrations is less than Cs.
That is, in the notation of Section 5.8, $P_0$ = 1 - .99 = .01. They have also
decided to set the false positive rate of the test to $\alpha$ = .05. Moreover, they
have required the false negative rate to be no more than 20 percent ($\beta$ =
0.20) when the actual proportion of contaminated samples is 0.5 percent
($P_1$ = .005).
From Appendix Table A.2, $z_{1-P_0} = z_{.99}$ = 2.326; $z_{1-P_1} = z_{.995}$ = 2.576;
$z_{1-\alpha} = z_{.95}$ = 1.645; and $z_{1-\beta} = z_{.80}$ = 0.842. Using formula (5.38), the required
sample size for each well is:

$n = \left( \frac{.842 + 1.645}{2.326 - 2.576} \right)^2 = \left( \frac{2.487}{-.250} \right)^2 = 98.96 \approx 99$

where $z_{1-\beta}$ and $z_{1-\alpha}$ are critical values from the normal distribution
associated with probabilities of $1-\beta$ and $1-\alpha$ (Appendix Table A.2).
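
The arithmetic of Box 5.22 can be sketched as follows. The function name is illustrative, and the four critical values are passed in exactly as read from Appendix Table A.2; computing them to more decimal places would change the result slightly.

```python
import math

def sample_size_for_tolerance(z_beta, z_alpha, z_p0, z_p1):
    """n = ((z_{1-beta} + z_{1-alpha}) / (z_{1-P0} - z_{1-P1}))^2."""
    return ((z_beta + z_alpha) / (z_p0 - z_p1)) ** 2

# Box 5.22: beta = .20, alpha = .05, P0 = .01, P1 = .005.
n_exact = sample_size_for_tolerance(z_beta=0.842, z_alpha=1.645,
                                    z_p0=2.326, z_p1=2.576)
n_required = math.ceil(n_exact)

print(f"n = {n_exact:.2f}, rounded up to {n_required}")
```

Because the denominator is the small difference between two critical values, the required sample size grows quickly as $P_1$ approaches $P_0$.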
5.10.3 Sample Sizes for Estimating Proportions

The sample size required for estimating a proportion using the procedures of
Section 5.9 depends on the following quantities: $P_0$, $P_1$, $\alpha$, and $\beta$. Given these quantities,
the sample size can be computed from the following formula (e.g., see Neter, Wasserman,
and Whitmore, 1982, page 304):

$$n = \left[\frac{z_{1-\beta}\sqrt{P_1(1-P_1)} + z_{1-\alpha}\sqrt{P_0(1-P_0)}}{P_0 - P_1}\right]^2 \tag{5.39}$$
Box 5.23
Sample Size Determination for Estimating Proportions

At a site with corrosive residues in the top soil, much of the contaminated
top soil has been removed. However, it is known that the contaminants
have leached into the ground water. Wanting to minimize the possibility of
future health effects, the site manager would like to know if, in the short
term, she can be 95 percent confident ($\alpha = .05$) that less than 10 percent
($P_0 = .10$) of the ground-water samples have concentrations exceeding the
cleanup standard. The expected proportion of contaminated ground-water
samples is very low, less than 5 percent. The manager wants to be 80
percent confident ($\beta = 1 - .80 = .20$) that the ground water will be declared
clean if the proportion of contaminated ground-water samples is less than 5
percent ($P_1 = .05$).

Using formula (5.39),

$$n = \left[\frac{.842\sqrt{.05(.95)} + 1.645\sqrt{.10(.90)}}{.10 - .05}\right]^2 = 183.3$$

Rounding up gives a final sample size of 184.
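
The calculation in Box 5.23 can be reproduced as follows. This is an illustrative Python sketch rather than a prescribed procedure; the function name `proportion_sample_size` is hypothetical, the formula is the one applied in the box (z for the false negative rate weighted by the standard deviation under $P_1$, z for the false positive rate weighted by the standard deviation under $P_0$), and the z values are taken as rounded table values.

```python
import math


def proportion_sample_size(p0, p1, z_alpha, z_beta):
    """Sample size for testing a proportion, as applied in Box 5.23.

    p0      : proportion under the null hypothesis (P0)
    p1      : proportion at which the false negative rate is specified (P1)
    z_alpha : z_{1-alpha} from the normal table
    z_beta  : z_{1-beta} from the normal table
    """
    numerator = (z_beta * math.sqrt(p1 * (1.0 - p1))
                 + z_alpha * math.sqrt(p0 * (1.0 - p0)))
    return (numerator / (p0 - p1)) ** 2


# Box 5.23: alpha = .05, beta = .20, P0 = .10, P1 = .05
n_raw = proportion_sample_size(p0=0.10, p1=0.05, z_alpha=1.645, z_beta=0.842)
n = math.ceil(n_raw)  # always round up

print(f"unrounded n = {n_raw:.1f}, final sample size = {n}")
```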
5.10.4 Collecting the Data

After the sample size and sampling frequency have been specified,
collection of the ground-water samples can begin. In collecting the samples, it is important
to maintain strict quality control standards and to fully document the sampling procedures.
Occasionally, a sample will be lost in the field or the lab. If this happens, it is best to try to
collect another sample to replace the missing observation before reaching the next sampling
period. Any changes in the sampling protocol should be fully documented.

Data resulting from a sampling program can only be evaluated and
interpreted with confidence when adequate quality assurance methods and procedures have
been incorporated into the program design. An adequate quality assurance program
requires awareness of the sources of error or variation associated with each step of the
sampling effort.

If a timely and representative sample of proper size and content is not
delivered to the analytical lab, the analysis cannot be expected to give meaningful results.
Failing to build in a quality assurance program often results in considerable money spent on
sampling and analysis only to find that the samples were not collected in a manner that
allows valid conclusions to be drawn from the resulting data. Seen in its broadest sense,
the QA program should address the sample design selected, the quality of the ground-water
samples, and the care and skill spent on the preparation and testing of the samples.

The samples should reflect what is actually present in the ground water.
Improper or careless collection of the samples is likely to increase the magnitude of the
sample collection error. Sample preparation also introduces quality control issues.

While a full discussion of these topics is beyond the scope of this
document, the implementation of an adequate QA program is important.

5.10.5 Making Adjustments for Values Below the Detection Limit

Sometimes the reported concentration for a ground-water sample will be
below the detection limit (DL) for the sampling and analytical procedure used. The rules
outlined in Section 2.3.7 should be used to handle such measurements in the statistical
analysis.

5.11 Summary

This chapter introduces the reader to some basic statistical procedures that
can be used both to describe (or characterize) a set of data and to test hypotheses and make
inferences from the data. The chapter discusses the calculation of means and proportions.
Hypothesis tests and confidence intervals are discussed for making inferences from the
data. The statistics and inferential procedures presented in this chapter are appropriate
for estimating short-term characteristics of contaminant levels. By "short-term
characteristics" we mean characteristics such as the mean or a percentile of contaminant
concentrations during the fixed period of time during which sampling occurs. Procedures
for estimating the long-term mean and for assessing attainment are discussed in Chapters 8
and 9. The procedures discussed in this chapter can be used in any phase of the remedial
effort; however, they will be most useful during treatment.

This chapter also provides procedures for estimating the sample sizes required
for assessing the status of the cleanup effort prior to a final assessment of whether the
remediation effort has been successful. It also briefly discusses issues involved in data
collection.
6. DECIDING TO TERMINATE TREATMENT USING
REGRESSION ANALYSIS
The decision to stop treatment is based on many sources of information
including (1) expert knowledge of the ground water system at the site; (2) mathematical
modeling of how treatment affects ground water flows and contamination levels; and (3)
statistical results from the monitoring wells from which levels of contamination can be
modeled and extrapolated. This chapter is concerned with the third source of information.
In particular, it describes how one statistical technique, known as regression analysis,
can be used in conjunction with other sources of information to decide when to terminate
treatment. The methods given here are applicable to analyzing data from the treatment
period indicated by the unshaded portion of Figure 6.1. Methods other than regression
analysis, such as time series analysis (Box and Jenkins, 1970), can also be used.
However, these methods are usually computer intensive and require the assistance of a
statistician familiar with these methods.
Figure 6.1 Example Scenario for Contaminant Measurements During Successful
Remedial Action
[Figure: measured ground water concentration plotted against date, with the following
events marked along the time axis: Start Treatment, End Treatment, Start Sampling,
End Sampling, and Declare Clean or Contaminated.]
Section 6.1 provides a brief overview of regression analysis and serves as a
review of the basic concepts for those readers who have had some previous exposure to the
subject. Section 6.2, the major focus of the chapter, provides a discussion of the steps
required to implement a regression analysis of ground water remediation data. Section 6.3
briefly outlines important considerations in combining statistical and nonstatistical
information.

6.1 Introduction to Regression Analysis

Regression analysis is a statistical technique for fitting a theoretical curve to
a set of sample data. For example, as a result of site clean-up, it is expected that
contamination levels will decrease over time. Regression analysis provides a method for modeling
(i.e., describing) the rate of this decrease. In ground-water monitoring studies, regression
techniques can be used to (1) detect trends in contaminant concentration levels over time,
(2) determine variables that influence concentration levels, and (3) predict chemical
concentrations at future points in time. An example of a situation where a regression analysis
might be useful is given in Figure 6.2, which shows a plot of chemical concentrations for
15 monthly samples taken from a hypothetical monitoring well during the period of
treatment. As seen from the plot, there is a distinct downward trend in the observed chemical
concentrations as a function of time. Moreover, aside from some "random" fluctuation, it
appears that the functional relationship between contaminant levels and time can be
reasonably approximated by a straight line for the time interval shown. This mathematical
relationship is referred to as the regression "curve" or regression model. The goal of a
regression analysis is to estimate the underlying functional relationship (i.e., the model), assess
the fit of the model, and, if appropriate, use the model to make predictions about future
observations.

In general, the underlying regression model need not be linear. However,
to fix ideas, it is useful to introduce regression methods in the context of the simple
linear regression model of which the linear relationship in Figure 6.2 is an example.
Underlying assumptions, required notation, and the basic framework for simple linear
regression analysis are provided in Section 6.1.1. Section 6.1.2 gives the formulas
required to fit the regression model. Section 6.1.3 discusses how to evaluate the fit of the
regression model using the residuals. Section 6.1.4 discusses how some important
regression statistics can be used for inferential purposes (i.e., forming statistically
defensible conclusions from the data).
Figure 6.2 Example of a Linear Relationship Between Chemical Concentration
Measurements and Time
[Figure: scatter plot of chemical concentration against time (months).]
6.1.1 Definitions, Notation, and Assumptions

Assume that a total of $N$ ground water samples have been taken from a
monitoring well over a period of time for chemical measurement. Denote the sample
collection time for the $i$th sample as $t_i$ and the chemical concentration measurement in the $i$th
sample as $c_i$, where $i = 1, 2, \ldots, N$. Let $y_i$ denote some function of the $i$th observed
concentration, for example, the identity function, $y_i = c_i$; the square root, $y_i = \sqrt{c_i}$; or the
log transformation, $y_i = \ln(c_i)$. Let $x_i$ denote time or a function of time. For example, if
the "time" variable is the original collection time, $x_i = t_i$; if the time variable is the reciprocal
of the collection time, then $x_i = 1/t_i$; etc. If the samples are collected at regular time
intervals, then the time index, $i$, can be used to measure time in place of the actual collection
time, i.e., $x_i = i$ or $x_i = 1/i$ in the examples above. Note that the notation used in this
section is different from that introduced in Chapter 5.

The simple linear regression model relating the concentration
measurements to time is defined by equation (6.1) in Box 6.1.

Box 6.1
Simple Linear Regression Model

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, 2, \ldots, N \tag{6.1}$$

In equation (6.1), $\beta_0$ and $\beta_1$ are constants referred to as the regression
coefficients, or alternatively as the parameters of the model, and $\varepsilon_i$ is a random
error. The term $y_i$ is often referred to as the dependent, response, or outcome variable.
In this document, the outcome variables of interest are contamination levels or related
measures. The term $x_i$ is also referred to as an independent or explanatory variable. The
independent variable (for example, the collection time) is generally under the control of the
experimenter. The term $N$ represents the number of observations or measurements on
which the regression model is based.

The regression coefficients are unknown but can be estimated from the
observed data under the assumption that the underlying model is correct. The nonrandom
part of the regression model is the formula for a straight line with y-intercept equal to $\beta_0$
and slope equal to $\beta_1$. In most regression applications, primary interest centers on the
slope parameter. For example, if $x_i = i$ and the slope is negative, then the model states that
the chemical concentrations decrease linearly with time, and the value of $\beta_1$ gives the rate at
which the chemical concentrations decrease.

The random error, $\varepsilon_i$, represents "random" fluctuations of the observed
chemical measurements around the hypothesized regression line, $y_i = \beta_0 + \beta_1 x_i$. It reflects
the sources of variability not accounted for by the model, e.g., sources of variability due to
unassignable or immeasurable causes. Regression analysis imposes the following
assumptions on the errors:

(i) The $\varepsilon_i$'s are independent;
(ii) The $\varepsilon_i$'s have mean 0 for all values of $x_i$;
(iii) The $\varepsilon_i$'s have constant variance, $\sigma^2$, for all values of $x_i$; and
(iv) The $\varepsilon_i$'s are normally distributed.
These assumptions are critical for the validity of the statistical tests used in a
regression analysis. If they do not hold, steps must be taken to accommodate any
departures from the underlying assumptions. Section 6.2.3 describes some simple graphical
procedures which can be used to study the aptness of the underlying assumptions and also
indicates some corrective measures when the above assumptions do not hold.

Interested readers should refer to Draper and Smith (1966) or Neter,
Wasserman, and Kutner (1985) for more details on the theoretical aspects of regression
analysis.

6.1.2 Computational Formulas for Simple Linear Regression

The computational formulas for most of the important quantities needed in a
simple linear regression analysis are summarized below. These formulas are given
primarily for completeness, but have been written in sufficient detail so that they can be used by
persons wishing to carry out a simple regression analysis without the aid of a computer,
spreadsheet, or scientific calculator. Readers who do not need to know the computational
details in a regression analysis should skip this section and go directly to Sections 6.1.3
and 6.1.4, where specific procedures for assessing the fit of the model and making
inferences based on the regression model are discussed.

Estimates of the slope, $\beta_1$, and intercept, $\beta_0$, of the regression line are given
by the values $b_1$ and $b_0$ in equations (6.2) and (6.3) in Box 6.2. The statistics $b_1$ and $b_0$ are
referred to as least squares estimates. If the four critical assumptions given in Section
6.1.1 hold for the simple linear regression model in Box 6.1, $b_1$ and $b_0$ will be unbiased
estimates of $\beta_1$ and $\beta_0$, and the precision of the estimates can be determined.

The estimated regression line (or, more generally, the fitted curve)
under the model is represented by equation (6.4) in Box 6.3.
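Box 6.2
Calculating Least Squares Estimates

$$b_1 = \frac{\sum_{i=1}^{N} x_i y_i - \dfrac{\left(\sum_{i=1}^{N} x_i\right)\left(\sum_{i=1}^{N} y_i\right)}{N}}{\sum_{i=1}^{N} x_i^2 - \dfrac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}} = \frac{\sum_{i=1}^{N} x_i y_i - N\bar{x}\bar{y}}{\sum_{i=1}^{N} x_i^2 - N\bar{x}^2} \tag{6.2}$$

$$b_0 = \frac{\sum_{i=1}^{N} y_i}{N} - b_1\frac{\sum_{i=1}^{N} x_i}{N} = \bar{y} - b_1\bar{x} \tag{6.3}$$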
Box 6.3
Estimated Regression Line

$$\hat{y}_i = b_0 + b_1 x_i \tag{6.4}$$

The calculated value of $\hat{y}_i$ is called the predicted value under the model corresponding to
the value of the independent variable, $x_i$. The difference between the predicted value, $\hat{y}_i$,
and the observed value, $y_i$, is called the residual. The equation for calculating the residuals
is shown in Box 6.4. If the model provides a good prediction of the data, we would expect
the predicted values, $\hat{y}_i$, to be close to the observed values, $y_i$. Thus, the sum of the
squared differences $(y_i - \hat{y}_i)^2$ provides a measure of how well the model fits the data and is
a basic quantity necessary for assessing the model.
Box 6.4

$$e_i = y_i - \hat{y}_i \tag{6.5}$$

Formally, we define the sum of squares due to error (SSE) and the
corresponding mean square error (MSE) by formulas (6.6) and (6.7), respectively, in
Box 6.5.

Box 6.5
Sum of Squares Due to Error and the Mean Square Error

$$\mathrm{SSE} = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \tag{6.6}$$

$$\mathrm{MSE} = \frac{\mathrm{SSE}}{N-2} \tag{6.7}$$

As seen in the formulas in Box 6.2, the analysis of a simple linear regres-
sion model requires the computation of certain sums and sums of cross products of the
observed data values. Therefore, it is convenient to define the five basic regression quanti-
ties in Box 6.6.

The estimated model parameters and SSE can be computed from these terms
using the formulas in Box 6.7.
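Box 6.6
Five Basic Quantities for Use in Simple Linear Regression Analysis

$$S_x = \sum_{i=1}^{N} x_i \tag{6.8}$$

$$S_y = \sum_{i=1}^{N} y_i \tag{6.9}$$

$$S_{xx} = \sum_{i=1}^{N} x_i^2 - \frac{S_x^2}{N} \tag{6.10}$$

$$S_{yy} = \sum_{i=1}^{N} y_i^2 - \frac{S_y^2}{N} \tag{6.11}$$

$$S_{yx} = \sum_{i=1}^{N} x_i y_i - \frac{S_x S_y}{N} \tag{6.12}$$

Box 6.7
Calculation of the Estimated Model Parameters and SSE

$$b_1 = \frac{S_{yx}}{S_{xx}} \tag{6.13}$$

$$b_0 = \frac{S_y}{N} - b_1\frac{S_x}{N} \tag{6.14}$$

$$\mathrm{SSE} = S_{yy} - \frac{S_{yx}^2}{S_{xx}} \tag{6.15}$$
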
An example of these basic regression calculations is presented in Box 6.8.
Box 6.8
Example of Basic Calculations for Linear Regression

Table 6.1 gives hypothetical water contamination levels for each of 15
consecutive months. A plot of the data is shown in Figure 6.3. Using the
formulas in Box 6.6, the following quantities were calculated:

$S_x = 120$, $\bar{x} = 8$
$S_y = 137.4$, $\bar{y} = 9.16$
$S_{xx} = 280$
$S_{yx} = -51.05$
$S_{yy} = 11.801$

The estimated regression coefficients are then calculated as:

$b_1 = -0.1823$ and $b_0 = 10.62$

Therefore the fitted model is

$$\hat{y}_i = b_0 + b_1 x_i = 10.62 - .1823\,x_i$$

and the corresponding mean square error is

$$\mathrm{MSE} = \frac{\mathrm{SSE}}{N-2} = .1918.$$

The straight line in Figure 6.4 is a plot of the fitted model.
Table 6.1 Hypothetical Data for the Regression Example in Figure 6.3

Time (Month)    Contamination (PPM)
     1               10.6
     2               10.4
     3                9.5
     4                9.6
     5               10.0
     6                9.5
     7                8.9
     8                9.5
     9                9.6
    10                9.4
    11                8.75
    12                7.8
    13                7.6
    14                8.25
    15                8.0
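
The hand calculations summarized in Box 6.8 can be verified with a few lines of code. The sketch below is offered only as an illustration in Python; the function name `basic_quantities` and the variable names are hypothetical, and the formulas are the five basic quantities and the estimates of $b_1$, $b_0$, SSE, and MSE as described in Section 6.1.2, applied to the data in Table 6.1.

```python
# Data from Table 6.1: month index and contamination level (ppm)
months = list(range(1, 16))
ppm = [10.6, 10.4, 9.5, 9.6, 10.0, 9.5, 8.9, 9.5,
       9.6, 9.4, 8.75, 7.8, 7.6, 8.25, 8.0]


def basic_quantities(x, y):
    """Return N and the five basic regression quantities."""
    n = len(x)
    s_x = sum(x)
    s_y = sum(y)
    s_xx = sum(v * v for v in x) - s_x ** 2 / n
    s_yy = sum(v * v for v in y) - s_y ** 2 / n
    s_yx = sum(a * b for a, b in zip(x, y)) - s_x * s_y / n
    return n, s_x, s_y, s_xx, s_yy, s_yx


n, s_x, s_y, s_xx, s_yy, s_yx = basic_quantities(months, ppm)

b1 = s_yx / s_xx                 # estimated slope
b0 = s_y / n - b1 * s_x / n      # estimated intercept
sse = s_yy - s_yx ** 2 / s_xx    # sum of squares due to error
mse = sse / (n - 2)              # mean square error

print(f"Sx={s_x}, Sy={s_y:.1f}, Sxx={s_xx:.0f}, Syx={s_yx:.2f}, Syy={s_yy:.3f}")
print(f"b1={b1:.4f}, b0={b0:.2f}, MSE={mse:.4f}")
```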
Figure 6.3 Plot of data from Table 6.1
[Figure: contamination level plotted against month.]
Figure 6.4 Plot of data and predicted values from Table 6.1
[Figure: contamination level plotted against month, with the fitted regression line.]

6.1.3 Assessing the Fit of the Model

It is important to note that the computational procedures given in Section
6.1.2 can always be applied to a set of data, regardless of whether the assumed model is
true. That is, it is always possible to fit a line (or curve) to a set of data. Whether the fitted
model provides an adequate description of the observed pattern of data is a question that
must be answered through examination of the "residuals." The residuals are the differences
between the observed and predicted values for the dependent variable (see Box 6.4). If the
model does not provide an adequate description of the data, examination of the residuals
can provide clues on how to modify the model.

In a regression analysis, a residual is the difference between the observed
concentration measurement, $y_i$, and the corresponding fitted (predicted) value, $\hat{y}_i$ (Box 6.3).
Recall that $\hat{y}_i = b_0 + b_1 x_i$, where $b_0$ and $b_1$ are the least squares estimates given by
equations (6.3) and (6.2), respectively.

Since the residuals, $e_i$, estimate the underlying errors, $\varepsilon_i$, the patterns
exhibited by the residuals should be consistent with the assumptions given in Section 6.1.1 if the
fitted model is correct. This means that the residuals should be randomly and
approximately normally distributed around zero, independent, and have constant variance. Some
graphical checks of these assumptions are indicated below. An example of an analysis of
residuals is presented in Box 6.17.

1. To check for model fit, plot the residuals against the time index or
the time variable, $x_i$. The appearance of cyclical or curvilinear
patterns (see Figure 6.5, plots b and c) indicates lack of fit or
inadequacy of the model (see Section 6.2.1 for a discussion of corrective
measures).

2. To check for constancy of variance, examine the plot of the residuals
against $x_i$ and the plot of the residuals against the predicted value,
$\hat{y}_i$. For both plots, the residuals should be confined within a
horizontal band such as illustrated in Figure 6.5a. If the variability
in the residuals increases such as in Figure 6.5d, the assumption of
constant variance is violated (see Section 6.2.4 for a discussion of
corrective measures in the presence of nonconstant variances).
Figure 6.5 Examples of Residual Plots (source: adapted from figures in Draper and
Smith, 1966, page 89)
[Figure: four panels of residuals plotted against time (or transformed time variable):
(a) residuals consistent with a good fit to the data; (b) and (c) model does not adequately
describe the pattern in the data; (d) variance is not constant.]

3. To check for normality of the residuals, plot the ordered residuals
(from smallest to largest) against their expected values under
normality, $EV_i$, using the procedures of Section 5.7.2. Note that in
this case, the formula for computing $EV_i$ is given by equation (5.24)
with the standard deviation $s$ replaced by $\sqrt{\mathrm{MSE}}$.

4. To test for independence of the error terms, compute the serial
correlation of the residuals and perform the Durbin-Watson test (or
the approximate large-sample test) described in Section 5.6.

It may happen that one or more of the underlying assumptions for linear
regression is violated. Corrective measures are discussed in Section 6.2. Figure 6.6
shows the residuals for the analysis discussed in Box 6.8. These residuals can be
compared to the examples in Figure 6.5.
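
As an illustration of how the residuals used in these checks might be obtained, the following Python sketch fits the line to the Table 6.1 data and computes the residuals $e_i = y_i - \hat{y}_i$. It is a sketch under stated assumptions, not a prescribed procedure: the variable names are hypothetical, and the Durbin-Watson statistic is computed from its usual textbook definition (sum of squared successive differences of the residuals divided by the sum of squared residuals), which is assumed here to match the statistic described in Section 5.6.

```python
# Data from Table 6.1
months = list(range(1, 16))
ppm = [10.6, 10.4, 9.5, 9.6, 10.0, 9.5, 8.9, 9.5,
       9.6, 9.4, 8.75, 7.8, 7.6, 8.25, 8.0]

n = len(months)
x_bar = sum(months) / n
y_bar = sum(ppm) / n

# Least squares estimates (centered form, equivalent to Syx / Sxx)
s_xx = sum((x - x_bar) ** 2 for x in months)
s_yx = sum((x - x_bar) * (y - y_bar) for x, y in zip(months, ppm))
b1 = s_yx / s_xx
b0 = y_bar - b1 * x_bar

# Residuals: observed value minus fitted value
residuals = [y - (b0 + b1 * x) for x, y in zip(months, ppm)]
sse = sum(e * e for e in residuals)

# Durbin-Watson statistic (usual definition; values near 2 suggest
# little first-order serial correlation in the residuals)
durbin_watson = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, n)) / sse

for x, e in zip(months, residuals):
    print(f"month {x:2d}: residual {e:+.3f}")
print(f"SSE = {sse:.4f}, Durbin-Watson = {durbin_watson:.2f}")
```

Plotting `residuals` against `months` gives the kind of display that can be compared with the panels of Figure 6.5.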
Figure 6.6 Plot of residuals for the data from Table 6.1
[Figure: residuals plotted against month; the vertical axis runs from -0.8 to 0.8.]
6.1.4 Inference in Regression

As mentioned earlier, two important goals of a regression analysis on
ground water remediation are the determination of significant trends in the concentration
measurements and the prediction of future concentration levels. Assuming that the
hypothesized model is correct, the mean square error (MSE) defined by equation (6.7) plays an
important role in making inferences from regression models. The MSE is an estimate of
that portion of the variance of the concentration measurements that is not explained by the
model. It provides information about the precision of the estimated regression coefficients
and predicted values, as well as the overall fit of the model.

6.1.4.1 Calculating the Coefficient of Determination

The coefficient of determination, denoted by $R^2$, is a descriptive
statistic that provides a measure of the overall fit of the model and is defined in Box 6.9.

Box 6.9
Coefficient of Determination

$$R^2 = 1 - \frac{\mathrm{SSE}}{S_{yy}} \tag{6.16}$$

where SSE is given by equation (6.6) and $S_{yy}$ is given by equation (6.11).

$R^2$ is always a number between 0 and 1 and can be interpreted as the
proportion of the total variance in the $y_i$'s that is accounted for by the regression model. If
$R^2$ is close to 1, then the regression model provides a much better prediction of individual
observations than does the mean of the observations. If $R^2$ is close to 0, then using the
regression equation to predict future observations is not much better than using the mean of
the $y_i$'s to predict future observations. A perfect fit (i.e., when all of the observed data
points fall on the fitted regression line) would be indicated by an $R^2$ equal to 1. In practice,
a value of $R^2$ of 0.6 or greater is usually considered to be high and thus an indicator that the
model can be reasonably used for predicting future observations; however, it is not a
guarantee. A plot of the predicted values from the model and the corresponding observed
values should be examined to assess the usefulness of the model.
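
For example, the coefficient of determination for the Table 6.1 data can be computed directly from the summary quantities reported in Box 6.8. The Python sketch below is illustrative only; the variable names are hypothetical, and SSE is obtained from $S_{yy} - S_{yx}^2/S_{xx}$ as in Section 6.1.2.

```python
# Summary quantities reported in Box 6.8 for the Table 6.1 data
s_xx = 280.0
s_yx = -51.05
s_yy = 11.801

# Sum of squares due to error, then the coefficient of determination
sse = s_yy - s_yx ** 2 / s_xx
r_squared = 1.0 - sse / s_yy

print(f"SSE = {sse:.2f}, R-squared = {r_squared:.2f}")
```

The result, about 0.79, is above the 0.6 rule of thumb mentioned above, but as the text notes this alone does not guarantee that the model is useful for prediction.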

Figure 6.7 shows the $R^2$ values for several hypothetical data sets. Notice
that the data in the middle of the chart (represented by the symbol "x") exhibit a pronounced
downward linear trend, and this is reflected in a high $R^2$ of .93. On the other hand, the set
of data at the top of the chart (represented by "diamonds") exhibits no trend in
concentrations, and this is reflected in a low $R^2$ of .02. Finally, we note that the $R^2$ for the set of
data at the bottom of the chart is fairly low (about 0.5), even though there appears to be a
fairly strong (nonlinear) trend. This is because $R^2$ measures the linear trend over time
(months). For these data, the trend in the concentrations is not linear; thus the
corresponding $R^2$ is fairly low. If the time axis were transformed to the reciprocal of time, the
resulting $R^2$ for the third data set would be close to 0.90.
Figure 6.7 Examples of R-Square for Selected Data Sets
[Figure: concentration plotted against time (months) for three hypothetical data sets;
plotted labels include R-Square = .93 and R-Square = .02.]
While $R^2$ is a useful indicator of the fit of a model and the usefulness of the
model for predicting individual observations, it is not definitive. If the model is used to
predict the mean concentration rather than an individual observation, or if the trend in the
concentrations is of interest, other measures of the model fit are more useful. These are
addressed in the following sections.

6.1.4.2 Calculating the Standard Error of the Estimated Slope

In a simple linear regression, the slope of the fitted regression line gives the
magnitude and direction of the underlying trend (if any). Because different sets of samples
would provide different estimates of the slope, the estimated slope given by equation (6.2)
is subject to sampling variability. Even if the form of the assumed model (6.1) were
known to be true, it would still not be possible to determine the slope of the true
relationship exactly. However, it is possible to estimate, with a specified degree of confidence, a
range within which the true slope is expected to fall.

The standard error of $b_1$ provides a measure of the variability of the
estimated slope. It is denoted by $s(b_1)$ and is defined in Box 6.10.

Box 6.10
Calculating the Standard Error of the Estimated Slope

$$s(b_1) = \sqrt{\frac{\mathrm{MSE}}{S_{xx}}} \tag{6.17}$$

The standard error can be used to construct a confidence interval for the
true slope of the regression line. The formula for a $100(1-\alpha)$ percent confidence interval is
given by equation (6.18) in Box 6.11.

Box 6.11
Calculating a Confidence Interval Around the Slope

$$b_1 \pm t_{1-\alpha/2;\,N-2}\; s(b_1) \tag{6.18}$$

where $t_{1-\alpha/2;\,N-2}$ is the upper $1-\alpha/2$ percentage point of a t distribution with
$N-2$ degrees of freedom (see Appendix Table A.1).
The confidence interval provides a measure of reliability for the estimated
value $b_1$. The narrower the interval, the greater is the precision of the estimate $b_1$. Because
the confidence interval provides a range of likely values of $\beta_1$ when the model holds, it can
be used to test hypotheses concerning the significance of the observed trend.

6.1.4.3 Decision Rule for Identifying Significant Trends

If the confidence interval given by equation (6.18) contains the value zero,
there is insufficient evidence (at the $\alpha$ significance level) to conclude that there is a trend.

On the other hand, if the confidence interval includes only negative (or only
positive) values, we would conclude that there is a significant negative (or positive) trend.

An example in which the above decision rule is used to identify a significant
trend is given in Box 6.12.

6.1.4.4 Predicting Future Observations

If the fitted model is appropriate, then an unbiased prediction of the
concentration level at time h is $\hat{y}_h = b_0 + b_1 x_h$, where $x_h$ is the value of the time variable at time h.
The standard error of the prediction is given by equation (6.19), and the corresponding
$100(1-\alpha)$ percent confidence limits around the predicted value at time h are given by formula
(6.20) in Box 6.13.

Note that if the fitted regression model is based on data collected during the
cleanup period, the confidence limits given by formula (6.20) may not strictly apply after
treatment is terminated. Consequently, confidence limits based on data from the treatment
period which are used to draw inferences about the post-treatment period should be inter-
preted with caution. Further discussion of the use of predicted values in ground water
monitoring studies is given in Section 6.2.
Box 6.12
Using the Confidence Interval for the Slope to Identify a Significant Trend

For the data in Table 6.1, the estimated regression line was
determined to be $\hat{y}_i = b_0 + b_1 x_i = 10.62 - .1823\,x_i$.

The coefficient of determination for the fitted model is
$R^2 = 1 - \mathrm{SSE}/S_{yy} = 1 - (2.49/11.8) = .79$. That is, 79 percent of the variability in the
contamination measurements is explained by the regression model, provided
that the model is correct.

Using equation (6.17), the standard error of the estimated slope is
$s(b_1) = \sqrt{\mathrm{MSE}/S_{xx}} = \sqrt{.1918/280} = .02617$; and the corresponding 95 percent
confidence limits for $\beta_1$ are given by $-.1823 \pm (2.160)(.02617)$, or $-.2388$ to
$-.1258$. (Note that $\alpha = .05$, $1 - \alpha/2 = .975$, $N = 15$, and $N-2 = 13$; thus,
$t_{1-\alpha/2;\,N-2} = t_{.975;\,13} = 2.160$ from Appendix Table A.1.)

Since the interval $(-.2388, -.1258)$ does not include zero, we can
conclude that the observed downward trend is significant at the $\alpha = .05$
level. That is, we have high confidence that the observed downward trend
is real and not just due to sampling variability.
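
The decision rule of Section 6.1.4.3 can be expressed compactly in code. The following Python sketch is illustrative only: the function names `slope_confidence_interval` and `has_significant_trend` are hypothetical, and the t value is entered by hand as the tabulated upper .975 point for 13 degrees of freedom (2.160), which is an assumption the reader should confirm against Appendix Table A.1.

```python
import math


def slope_confidence_interval(b1, mse, s_xx, t_crit):
    """Return (standard error, lower limit, upper limit) for the slope."""
    se_b1 = math.sqrt(mse / s_xx)
    half_width = t_crit * se_b1
    return se_b1, b1 - half_width, b1 + half_width


def has_significant_trend(lower, upper):
    """A trend is declared significant when the interval excludes zero."""
    return lower > 0.0 or upper < 0.0


# Quantities for the Table 6.1 data (N = 15, so N - 2 = 13)
T_975_13 = 2.160  # assumed tabulated t value for 1 - alpha/2 = .975, 13 d.f.
se_b1, lower, upper = slope_confidence_interval(
    b1=-0.1823, mse=0.1918, s_xx=280.0, t_crit=T_975_13)

print(f"s(b1) = {se_b1:.5f}")
print(f"95% confidence interval for the slope: ({lower:.4f}, {upper:.4f})")
print("significant trend:", has_significant_trend(lower, upper))
```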
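Box 6.13
Calculating the Standard Error and Confidence Intervals for Predicted
Values

$$s(\hat{y}_h) = \sqrt{\mathrm{MSE}\left(1 + \frac{1}{N} + \frac{(x_h - \bar{x})^2}{S_{xx}}\right)} \tag{6.19}$$

$$\hat{y}_h \pm t_{1-\alpha/2;\,N-2}\; s(\hat{y}_h) \tag{6.20}$$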

An example in which the regression model is used to predict future values is
presented in Box 6.14.
Box 6.14
Using the Simple Regression Model to Predict Future Values

Continuing the example in Box 6.12, suppose that the site manager
is interested in predicting the contaminant concentration for month 16*. The
predicted concentration level for month 16, assuming that the model holds,
is

$$\hat{y}_{16} = b_0 + b_1 x_{16} = 10.62 - .1823(16) = 7.703.$$

The standard error of the predicted value is

$$s(\hat{y}_{16}) = \sqrt{.1918\left(1 + \frac{1}{15} + \frac{(16-8)^2}{280}\right)} = .4984.$$

Therefore, if the model holds, 99 percent confidence limits around
the predicted value [see formula (6.20)] are given by $7.703 \pm 3.012\,(.4984)$,
or from 6.202 to 9.204, where $t_{.995;\,13} = 3.012$ from Appendix Table A.1.

* Again, it should be emphasized that whenever a regression model is used to make
predictions about concentrations outside the range of the sampling period, extreme
caution should be used in interpreting the results. In particular, the regression results
should not be used alone, but should be combined with other sources of information
(see discussion in Section 6.3).
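
The prediction for month 16 can be reproduced as follows. This Python sketch is illustrative and not part of the original guidance; the function name `prediction_interval` is hypothetical, the standard error uses the form $\mathrm{MSE}\,(1 + 1/N + (x_h - \bar{x})^2/S_{xx})$ applied in the example above, and the t value 3.012 is entered by hand as the assumed tabulated upper .995 point for 13 degrees of freedom.

```python
import math


def prediction_interval(b0, b1, x_h, mse, n, x_bar, s_xx, t_crit):
    """Predicted value at x_h with its standard error and confidence limits."""
    y_hat = b0 + b1 * x_h
    se = math.sqrt(mse * (1.0 + 1.0 / n + (x_h - x_bar) ** 2 / s_xx))
    return y_hat, se, y_hat - t_crit * se, y_hat + t_crit * se


T_995_13 = 3.012  # assumed tabulated t value for 99 percent limits, 13 d.f.
y_hat, se_pred, lower, upper = prediction_interval(
    b0=10.62, b1=-0.1823, x_h=16, mse=0.1918,
    n=15, x_bar=8.0, s_xx=280.0, t_crit=T_995_13)

print(f"predicted concentration at month 16: {y_hat:.3f}")
print(f"standard error of prediction: {se_pred:.4f}")
print(f"99% limits: {lower:.3f} to {upper:.3f}")
```

The interval is wide relative to the month-to-month decline, which is one reason extrapolation beyond the sampling period calls for caution.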
6.1.4.5 Predicting Future Mean Concentrations

If the fitted model is appropriate, then an unbiased prediction of the mean
concentration level at time h is $\hat{y}_h = b_0 + b_1 x_h$, where $x_h$ is the value of the time variable at
time h. Although the predicted mean and the predicted value for an individual observation
are the same, the prediction error of the predicted mean is less than that for an individual
predicted value. The standard error of the predicted mean is given by equation (6.21), and
the corresponding $100(1-\alpha)$ percent confidence limits around the predicted mean at time h
are given by formula (6.22) in Box 6.15.

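Box 6.15
Calculating the Standard Error and Confidence Interval for a Predicted Mean

$$s_{\text{mean}}(\hat{y}_h) = \sqrt{\mathrm{MSE}\left(\frac{1}{N} + \frac{(x_h - \bar{x})^2}{S_{xx}}\right)} \tag{6.21}$$

$$\hat{y}_h \pm t_{1-\alpha/2;\,N-2}\; s_{\text{mean}}(\hat{y}_h) \tag{6.22}$$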
Note that if the fitted regression model is based on data collected during the
cleanup period, the confidence limits given by formula (6.22) may not strictly apply after
treatment is terminated. Consequently, confidence limits based on data from the treatment
period which are used to draw inferences about the post-treatment period should be
interpreted with caution. Further discussion of the use of predicted values in ground water
monitoring studies is given in Section 6.2.
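
To illustrate the difference between predicting a mean and predicting an individual observation, the sketch below compares the two standard errors at month 16 for the Table 6.1 example. It is a minimal Python sketch under stated assumptions: the function names are hypothetical, the standard error of the predicted mean is taken to have the usual textbook form $\sqrt{\mathrm{MSE}\,(1/N + (x_h - \bar{x})^2/S_{xx})}$, and the t value 2.160 is the assumed tabulated upper .975 point for 13 degrees of freedom.

```python
import math


def se_predicted_mean(mse, n, x_h, x_bar, s_xx):
    """Standard error of the predicted mean at x_h (usual textbook form)."""
    return math.sqrt(mse * (1.0 / n + (x_h - x_bar) ** 2 / s_xx))


def se_predicted_value(mse, n, x_h, x_bar, s_xx):
    """Standard error of an individual predicted value at x_h."""
    return math.sqrt(mse * (1.0 + 1.0 / n + (x_h - x_bar) ** 2 / s_xx))


# Table 6.1 example: N = 15, x-bar = 8, Sxx = 280, MSE = .1918
mse, n, x_bar, s_xx = 0.1918, 15, 8.0, 280.0
x_h = 16
y_hat = 10.62 - 0.1823 * x_h

se_mean = se_predicted_mean(mse, n, x_h, x_bar, s_xx)
se_value = se_predicted_value(mse, n, x_h, x_bar, s_xx)

T_975_13 = 2.160  # assumed tabulated t value for 95 percent limits, 13 d.f.
lower = y_hat - T_975_13 * se_mean
upper = y_hat + T_975_13 * se_mean

print(f"standard error, predicted mean:  {se_mean:.4f}")
print(f"standard error, predicted value: {se_value:.4f}")
print(f"95% limits for the mean at month {x_h}: {lower:.3f} to {upper:.3f}")
```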

6.1.4.6 Example of a "Nonlinear" Regression

Applying regression analysis is not always as straightforward as the
examples in Boxes 6.8 and 6.12 indicate. To show some of the possible complexities and
to help fix some of the ideas presented, we will do a regression analysis on the data in
Table 6.2. As shown in Figure 6.8, these data are not linear with respect to time, and hence
a transformation of the independent variable was employed. (More information about the
use of transformations is given later in Section 6.2.3.) The analysis is summarized in Box
6.16 and the fitted model is plotted in Figure 6.9.

Table 6.2 Hypothetical concentration measurements for mercury (Hg) in ppm for 20
ground water samples taken at monthly intervals

Month        Year    Coded        Concentration    Reciprocal
                     month (i)    in ppm (y)       of month (x)
January      1986     1           0.401            1.0000
February     1986     2           0.380            0.5000
March        1986     3           0.352            0.3333
April        1986     4           0.343            0.2500
May          1986     5           0.354            0.2000
June         1986     6           0.350            0.1667
July         1986     7           0.343            0.1429
August       1986     8           0.333            0.1250
September    1986     9           0.325            0.1111
October      1986    10           0.325            0.1000
November     1986    11           0.327            0.0909
December     1986    12           0.329            0.0833
January      1987    13           0.324            0.0769
February     1987    14           0.325            0.0714
March        1987    15           0.319            0.0667
April        1987    16           0.323            0.0625
May          1987    17           0.316            0.0588
June         1987    18           0.318            0.0556
July         1987    19           0.321            0.0526
August       1987    20           0.331            0.0500
Figure 6.8 Plot of Mercury Measurements as a Function of Time (See Box 6.16)
[Plot: mercury concentration versus time (month).]
Box 6.16
Example of Basic Regression Calculations

Table 6.2 shows mercury concentrations for 20 ground water samples taken
from January 1986 to August 1987. A plot of the concentration measurements
as a function of time is shown in Figure 6.8. Because the data
exhibited a nonlinear trend, it was decided to consider the model
$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, where $x_i = 1/i$. The values of the reciprocals of time are
shown in the last column of the table.

For these data, the following quantities were calculated: $S_x = 3.598$;
$S_y = 6.739$; $S_{xx} = .949$; $S_{yy} = .00909$; $S_{yx} = .0866$; $\bar{y} = .337$;
$\bar{x} = .180$.

The estimated regression coefficients were then calculated as:
$b_1 = .0866/.949 = .0913$; and $b_0 = .337 - (.0913)(.180) = .321$. The fitted
model is therefore

$$\hat{y}_i = b_0 + b_1 x_i = .321 + \frac{.0913}{i}$$

and the associated mean square error is

$$MSE = \frac{SSE}{N-2} = \frac{.00909 - (.0866)^2/.949}{18} = .000066.$$

Figure 6.9 shows a plot of the fitted model against the observed
concentration values.
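
The arithmetic in Box 6.16 can be checked with a short script. The sketch below is illustrative only: the function name `fit_line` and the dictionary keys are hypothetical, and the data are the 20 mercury concentrations of Table 6.2.

```python
def fit_line(x, y):
    """Least squares fit of y = b0 + b1*x, returning the sums used in Box 6.16."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)   # corrected sum of squares of x
    s_yy = sum((yi - y_bar) ** 2 for yi in y)   # corrected sum of squares of y
    s_yx = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_yx / s_xx
    b0 = y_bar - b1 * x_bar
    mse = (s_yy - s_yx ** 2 / s_xx) / (n - 2)   # SSE / (N - 2)
    return {"b0": b0, "b1": b1, "mse": mse,
            "s_xx": s_xx, "s_yy": s_yy, "s_yx": s_yx}

# Mercury concentrations (ppm) from Table 6.2, coded months 1 through 20
MERCURY_PPM = [0.401, 0.380, 0.352, 0.343, 0.354, 0.350, 0.343, 0.333,
               0.325, 0.325, 0.327, 0.329, 0.324, 0.325, 0.319, 0.323,
               0.316, 0.318, 0.321, 0.331]

# Independent variable is the reciprocal of the coded month, x_i = 1/i
x = [1.0 / i for i in range(1, len(MERCURY_PPM) + 1)]
fit = fit_line(x, MERCURY_PPM)
print(f"b0 = {fit['b0']:.3f}, b1 = {fit['b1']:.4f}, MSE = {fit['mse']:.6f}")
```

Working from the unrounded sums gives values that agree with the box to the precision reported there.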
Figure 6.9 Comparison of Observed Mercury Measurements and Predicted Values
under the Fitted Model (See Box 6.16)
[Plot: observed mercury concentrations and the fitted model, $\hat{y} = .321 + 0.0913/i$, versus time (month).]
Box 6.17
Analysis of Residuals for Mercury Example

Figure 6.10 shows a plot of the residuals for the mercury data in Table 6.2
based on the fitted model, $\hat{y}_i = .321 + 0.0913/i$ (see Box 6.16). The
residual plot indicates some lack of fit of the model. In particular, it appears
that the fitted model tends to underestimate concentrations at the earlier times
while overestimating concentrations at the later times. (Since the residuals
represent the differences between the actual and predicted values, the
positive values of the residuals in the earlier months indicate that the actual
values tend to be larger than the predicted values then. Hence, the model
underestimates the earlier concentrations.)

To see whether the fit could be improved by using a different transformation
of i, the following alternative model was considered:
$y_i = \beta_0 + \beta_1/\sqrt{i} + \epsilon_i$.
For this model, the estimated regression coefficients are $b_0 = .2957$ and
$b_1 = .1087$, and the coefficient of determination is $R^2 = .927$ (compared to .89
for the earlier model). This indicates a somewhat better fit when $1/\sqrt{i}$ is
used as the independent variable (see Figure 6.11). The residual plot under
the new model (see Figure 6.12) seems to support this conclusion.
Moreover, the standard error of $b_1$ is $s(b_1) = .0072$, and hence 95 percent
confidence limits around the true slope are given by .1087 ±
(2.101)(.0072), or .094 to .124. Since the interval does not include zero,
we further conclude that the trend is significant.

Finally, Figure 6.13 shows a normal probability plot of the ordered
residuals based on the revised model, where the expected values, $EV_i$, were
computed using formula (5.24) with the standard deviation estimated by
$\sqrt{MSE}$. There is a nonlinear
pattern in the residuals which suggests that the normality assumption may
not be appropriate for this model. If a formal test indicates the lack of
normality is significant, nonlinear regression procedures should be
considered.
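
The alternative fit in Box 6.17 can be reproduced along the same lines. The following sketch is illustrative: the helper name `fit_line` is hypothetical, the data are those of Table 6.2, and the t value 2.101 is the one quoted in the box (18 degrees of freedom).

```python
import math

# Mercury concentrations (ppm) from Table 6.2, coded months 1 through 20
MERCURY_PPM = [0.401, 0.380, 0.352, 0.343, 0.354, 0.350, 0.343, 0.333,
               0.325, 0.325, 0.327, 0.329, 0.324, 0.325, 0.319, 0.323,
               0.316, 0.318, 0.321, 0.331]
T_975_18DF = 2.101  # t value used in Box 6.17

def fit_line(x, y):
    """Return (b0, b1, R^2, s(b1), residuals) for the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_yx = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_yx / s_xx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in residuals) / (n - 2)
    r2 = s_yx ** 2 / (s_xx * s_yy)
    se_b1 = math.sqrt(mse / s_xx)
    return b0, b1, r2, se_b1, residuals

# Alternative independent variable: reciprocal of the square root of the month
x_alt = [1.0 / math.sqrt(i) for i in range(1, 21)]
b0, b1, r2, se_b1, resid = fit_line(x_alt, MERCURY_PPM)

# 95 percent confidence limits around the slope
ci = (b1 - T_975_18DF * se_b1, b1 + T_975_18DF * se_b1)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, R2 = {r2:.3f}, s(b1) = {se_b1:.4f}")
print(f"95% limits for slope: {ci[0]:.3f} to {ci[1]:.3f}")

# Residuals listed by month, for plotting against time as in Figure 6.12
for month, e in enumerate(resid, start=1):
    print(f"{month:2d} {e:+.4f}")
```

Printing or plotting the residuals against time is the numerical counterpart of the residual plots discussed in the box; a run of residuals with the same sign suggests remaining lack of fit.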

Figure 6.10 Plot of Residuals Against Time for Mercury Example (see Box 6.17)
[Plot: residuals versus time (month).]
Figure 6.11 Plot of Mercury Concentrations Against $x = 1/\sqrt{i}$, and Alternative Fitted
Model (see Box 6.17)
[Plot: mercury concentration versus x (reciprocal of square root of time).]
Figure 6.12 Plot of Residuals Based on Alternative Model (see Box 6.17)
[Plot: residuals versus time (month).]
Figure 6.13 Plot of Ordered Residuals Versus Expected Values for Alternative Model
(see Box 6.17)
[Plot: ordered residuals versus expected value.]

To summarize, if the data are originally linear (such as the data in Table
6.1), then we may fit the simple linear regression model of Box 6.1. If the data are more
complex (e.g., the data in Table 6.2), then a transformation may be used as was done in
Box 6.16. One can transform either the independent (i.e., the explanatory) variable or the
dependent (i.e., the outcome) variable, or both. Finding the appropriate transformation is
as much an art as it is a science. Consultation with a statistician is recommended in order to
help identify useful transformations and to help interpret the model based on the
transformed data.
6.2 Using Regression to Model the Progress of Ground Water
Remediation
As samples are collected and analyzed during the cleanup period, trends or
other patterns in the concentration levels may become evident. As illustrated in
Figure 6.14, a variety of patterns are possible. In Situation 1, regression might be used to
determine the slope for observations beyond time 20 to infer if the treatment is effective. If
not, a decision might be made to consider a different remedial program. For Situation 2,
the concentration measurements have decreased below the cleanup standard, and regression
might be used to investigate whether the concentrations can be expected to stay below the
cleanup standard. For Situation 3 in Figure 6.14, which could arise from factors such as
interruptions or changes in the treatment technology or fluctuating environmental
conditions, regression can be used to assess trends. However, due to the highly erratic nature of
the data, any predictions of trends or future concentrations are likely to be very inaccurate.
Additional data collection will be necessary before conclusions can be reached. Where
appropriate, regression analysis can be useful in estimating and assessing the significance
of observed trends and in predicting expected levels of contaminant concentrations at future
points in time.

Figure 6.15 summarizes the steps for implementing a simple linear regres-
sion analysis at Superfund sites. These steps are described in detail in the sections that
follow.

Figure 6.14 Examples of Contaminant Concentrations that Could Be Observed During
Cleanup
[Three panels plotting concentration against time, each marking the start of cleanup and
the cleanup standard, Cs:
    Situation 1: Asymptotic situation; cleanup standard potentially unattainable.
    Situation 2: Concentrations below the cleanup standard.
    Situation 3: Highly variable concentrations.]

Figure 6.15 Steps for Implementing Regression Analysis at Superfund Sites
[Flowchart. The boxes and decision points are:
    Choose a linear or nonlinear regression (Section 6.2.1)
    Linear regression (Section 6.2.1), or consult a statistician about nonlinear models
    Estimate model parameters and calculate residuals (Section 6.2.3)
    Assess fit of model (Section 6.2.3)
    Is variance of residuals constant? (Section 6.2.3)
    Transform variables or use weighted regression (Section 6.2.4)
    Is there a good fit to the data? (Section 6.2.3)
    Are the errors independent? (Section 6.2.5)
    Remove autocorrelation from residuals (Section 6.2.5)
    Test for significant trend and set confidence limits around predicted values (Section 6.1.4)
    Combine regression results with other inputs (Section 6.3)
    End]

6.2.1 Choosing a Linear or Nonlinear Regression

The first step in a regression analysis is to decide whether a linear or nonlinear
model is appropriate. An initial choice can often be made by observing a plot of the
sample data over time. For example, for the data of Figure 6.2, the relationship between
concentration measurements and time is apparently linear. In this case, the regression
model (6.1) with $x_i = i$ would be appropriate. However, for the data displayed in Figure
6.16, some sort of nonlinear model would be appropriate.

Sometimes it is possible to model a nonlinear relationship such as that
shown in Figure 6.16 with linear regression techniques by transforming either the depen-
dent or independent variable. In some cases, theoretical considerations of ground water
flows and the type of treatment applied may lead to the formulation of a particular nonlinear
model such as "exponential decay." This, in turn, may lead to consideration of a particular
type of transformation (e.g., logarithmic or inverse transformations). However, these a
priori considerations do not preclude testing the model for adequacy of fit. Choosing the
appropriate transformation may require the assistance of a statistician; however, if the
(nonlinear) relationship is not too complicated, some relatively simple transformations may
be sufficient to "linearize" the model, and the procedures given in Section 6.1 may be used.
On the other hand, after analysis of the residuals (as described below in Section 6.2.3), if
none of the given transformations appears to be adequate, nonlinear regression methods
should be used (see Draper and Smith, 1966; Neter, Wasserman, and Kutner, 1985). A
statistician should be consulted about these methods.

Figure 6.17 shows examples of two general types of curves that might
reasonably approximate the relationship between observed contaminant levels and time. If
a plot of the concentration measurements versus time exhibits one of these patterns, the
transformations listed below in Box 6.18 may be helpful in making the model linear. Since
the initial choice of transformation may not provide a "good" fit, the process of determining
the appropriate transformation may require several iterations. The procedures described in
Section 6.2.3 can be used to assess the fit of a particular model. Box 6.18 contains some
suggested transformations for the two types of curves shown in Figure 6.17 (source:
Neter, Wasserman, and Kutner, 1985).
1 Although a model such as $y = \beta_0 + \beta_1 (1/x)$ is a nonlinear equation, it is called a linear regression model
because the coefficients, $\beta_0$ and $\beta_1$, occur in a linear form (as opposed to, say, $y = \beta_0 +$ ...
Figure 6.16 Example of a Nonlinear Relationship Between Chemical Concentration
Measurements and Time
[Plot: concentration versus time (month).]
Figure 6.17 Examples of Nonlinear Relationships
[Plot: two curves, Type A and Type B, showing concentration versus time.]
Box 6.18
Suggested Transformations

Type A: Contaminant concentrations following this pattern decrease
slowly at first and then more rapidly later on. A useful transformation to
consider is

    $x_i = i^p$

where p is a constant greater than 1. If the decline in concentrations is very
steep, set p = 2, initially, and then try alternative values, if necessary, to
obtain a good fit.

Type B: Contaminant concentrations following this pattern decrease
rapidly at first and then more slowly later on. Useful transformations to
consider in this case include

    $x_i = 1/\sqrt{i}$.

Alternatively, one can also consider transforming $y_i$; e.g., use the
transformed variable

    $y_i' = 1/y_i$

either in lieu of or together with the transformed time variable, whichever
appears to be appropriate.

There is no guarantee that using transformations will help, and their
effectiveness must be determined by checking the fit of the model and examining the
residuals. Consultation with a statistician is recommended to help identify
useful transformations and to interpret the model based on the transformed
measurements.
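
The trial-and-error search over the power p suggested for Type A curves can be sketched as below. The data are hypothetical (generated to decline slowly at first and then more rapidly), the candidate values of p are arbitrary, and the use of the coefficient of determination as the only yardstick is a simplifying assumption; residual plots should still be examined.

```python
def r_squared(x, y):
    """Coefficient of determination for the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_yx = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_yx ** 2 / (s_xx * s_yy)

# Hypothetical Type A data: 15 sampling periods with a small fixed disturbance
months = list(range(1, 16))
noise = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.05, 0.04,
         -0.02, 0.00, 0.03, -0.01, -0.04, 0.02, 0.01]
conc = [10.0 - 0.02 * i ** 2 + e for i, e in zip(months, noise)]

# Try several powers; p = 1.0 is the untransformed time variable for comparison
candidates = [1.0, 1.5, 2.0, 2.5, 3.0]
scores = {p: r_squared([i ** p for i in months], conc) for p in candidates}
best_p = max(scores, key=scores.get)

for p in candidates:
    print(f"p = {p:.1f}  R2 = {scores[p]:.4f}")
print("best p among candidates:", best_p)
```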

6.2.2 Fitting the Model

In a regression analysis, the process of "fitting the model" refers to the process of
estimating the regression parameters and associated sampling errors from the observed
data. With these estimates, it is then possible to (1) determine whether the model provides
an adequate description of the observed chemical measurements; (2) test whether there is a
significant trend in the chemical measurements over time; and (3) obtain estimates of
concentration levels at future points in time.

Given a set of concentration measurements, $y_i$, i = 1, 2, ..., N, and corresponding
time values, $x_i$, the estimated slope and intercept of the fitted regression line can
be computed from the equations in Section 6.1.2. For the fitted model, the error sum of
squares, SSE, and coefficient of determination should also be computed.

Note that the model fitting will, in general, be an iterative process. If the
fitted model is inadequate for any of the reasons indicated below in Section 6.2.3, it may be
possible to obtain a better fitting model by considering transformations of the data.

6.2.3 Regression in the Presence of Nonconstant Variances

If the residuals for a fitted model exhibit a pattern such as that shown in
Figure 6.14d, the assumption of constant variance is violated, and corrective steps must be
taken. The two most common corrective measures are: (1) transform the dependent
variable to stabilize the variance; or (2) perform a "weighted least squares regression"
(Neter, Wasserman, and Kutner, 1985).

Transformations of the dependent variable that are useful for stabilizing
variances are the square root transformation, the logarithmic transformation, and the
inverse transformation. Which transformation to use in a particular situation depends on
the way the variance increases. To determine this relationship, it is useful to divide the data
into four or five groups based on the time at which observations were made. For example,
the first group might consist of the first four observations, the second group might consist
of the next four observations, and so on. For the g-th group, compute the mean of the
observed concentrations, $\bar{y}_g$, and the standard deviation of the concentrations, $s_g$ (Section
5.1). If a plot of $s_g^2$ versus $\bar{y}_g$ is approximately a straight line, use $\sqrt{y_i}$, the square root
transformation, in the regression analysis; if a plot of $s_g$ versus $\bar{y}_g$ is approximately a
straight line, use $\log(y_i)$, the logarithmic transformation, in the analysis; and, finally, if a
plot of $\sqrt{s_g}$ versus $\bar{y}_g$ is approximately a straight line, use $1/y_i$, the inverse transformation, in
the analysis (Neter, Wasserman, and Kutner, 1985).
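
The grouping step described above can be sketched as follows. The 20 concentration values are hypothetical, the function name is illustrative, and the sample standard deviation (divisor n - 1) is assumed. The script only tabulates the quantities to be plotted against the group means; judging which plot is closest to a straight line is left to the analyst.

```python
import math
import statistics

def group_summaries(values, group_size):
    """Split the series into consecutive groups and return (mean, std. dev.) pairs."""
    groups = [values[k:k + group_size] for k in range(0, len(values), group_size)]
    # A standard deviation needs at least two observations in the group
    return [(statistics.mean(g), statistics.stdev(g)) for g in groups if len(g) >= 2]

# Hypothetical concentrations whose spread shrinks as the level declines
conc = [52.0, 41.0, 47.5, 38.5, 33.0, 27.0, 30.5, 24.5, 21.0, 17.5,
        19.5, 15.0, 13.0, 10.5, 12.0, 9.5, 8.0, 6.5, 7.5, 6.0]

summaries = group_summaries(conc, 4)
print("group   mean    s_g^2     s_g   sqrt(s_g)")
for g, (mean_g, s_g) in enumerate(summaries, start=1):
    print(f"{g:5d} {mean_g:6.2f} {s_g ** 2:8.3f} {s_g:7.3f} {math.sqrt(s_g):9.3f}")
```

Each of the last three columns would then be plotted against the group mean to choose among the square root, logarithmic, and inverse transformations.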

The other major method for dealing with nonconstant variance is weighted
least squares regression. Weighted least squares analysis provides a formal way of
accommodating nonconstant variance in regression. To apply this method, the form of the
underlying variance structure must be known or estimated from the data. This method is
described elsewhere; e.g., Draper and Smith (1966). A statistician should be consulted
when applying these methods.
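
For orientation only, a weighted least squares fit of a straight line can be sketched as below. This is a generic textbook form, not a procedure prescribed in this chapter: it assumes the variance of each observation is known up to a constant and uses weights equal to the reciprocal of that variance. The function name, data, and assumed standard deviations are hypothetical.

```python
def weighted_fit(x, y, w):
    """Weighted least squares estimates (b0, b1) for y = b0 + b1*x with weights w."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data: the assumed standard deviation shrinks with the level
months = [1, 2, 3, 4, 5, 6, 7, 8]
conc = [40.1, 35.2, 31.9, 26.0, 22.8, 17.1, 13.4, 8.9]
assumed_sd = [4.0, 3.5, 3.0, 2.5, 2.0, 1.5, 1.0, 0.5]
weights = [1.0 / sd ** 2 for sd in assumed_sd]          # weight = 1 / variance

b0_w, b1_w = weighted_fit(months, conc, weights)
print(f"weighted fit: b0 = {b0_w:.2f}, b1 = {b1_w:.2f}")
```

Observations with smaller assumed variance receive more weight, so the later, less variable measurements dominate the fit in this example.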
6.2.4 Correcting for Serial Correlation

It is sometimes possible to remove the serial correlation in the residuals by
transforming the dependent and independent variables. Applied Linear Statistical Models
by Neter, Wasserman, and Kutner (1985), amplifies the following iterative procedure.
6.2.4.1 Fitting the Model

The four steps for fitting the model to remove serial correlations are
discussed below.
(1) Calculate the serial correlation of the residuals, $\hat{\phi}_{obs}$, using the formula in Box
5.14.

(2) For i = 2, 3, ..., N, transform both the dependent and independent variables
using equation (6.23) in Box 6.19. Perform an ordinary least squares regression on the
transformed variables. That is, using the procedures of Section 6.1.2, fit the "new" model
given by equation (6.24).
Box 6.19
Transformation to "New" Model

Transform both the dependent and independent variables using the
formulas:

$$y_i' = y_i - \hat{\phi}_{obs}\, y_{i-1} \quad\text{and}\quad x_i' = x_i - \hat{\phi}_{obs}\, x_{i-1}.$$    (6.23)

Fit the following model using the transformed variables:

$$y_i' = \beta_0' + \beta_1' x_i' + \epsilon_i'.$$    (6.24)

Note that one observation is lost in the transformed measurements because
(6.23) cannot be determined for i = 1.
Denote the least squares estimates of the parameters of the new
(transformed) model by $b_0'$ and $b_1'$ and denote the fitted model for the transformed
variables by equation (6.25) in Box 6.20.
Box 6.20
"New" Fitted Model for Transformed Variables

$$\hat{y}_i' = b_0' + b_1' x_i'$$    (6.25)

Calculate the residuals for the new model: $e_i' = y_i' - (b_0' + b_1' x_i')$. Note
that the fitted model (6.25) is expressed in terms of the transformed variables and not the
original variables.

(3) Perform the Durbin-Watson test (or approximate test if the sample size is large)
on the residuals of the model fitted in step (2). If the test indicates that the serial
correlation is not significant, go to step (4). Otherwise, terminate the process and consult a
statistician for alternative methods of correcting for serial correlation.
(4) In terms of the original variables, the slope and the intercept of the fitted
regression line are provided in Box 6.21.
Box 6.21
Slope and Intercept of Fitted Regression Line in Terms of Original Variables

$$b_1 = b_1' \quad\text{and}\quad b_0 = \frac{b_0'}{1 - \hat{\phi}_{obs}}$$    (6.26)

where $\hat{\phi}_{obs}$ is the estimated autocorrelation determined by using the
residuals obtained from fitting the untransformed data, and $b_0'$ and $b_1'$ are
least squares estimates obtained from the transformed data.
The approach given above has the effect of adjusting the estimates of
variance to account for the presence of autocorrelation. Typically, the variance of the
estimated regression coefficients is larger when the errors are correlated, as compared with
uncorrelated errors. An example of the use of this technique is given in Box 6.22.
6.2.4.2 Determining Whether the Slope is Significant

The standard error of the slope of the original model is simply the standard
error of the slope, $b_1'$, obtained from the regression analysis performed on the transformed
data defined in Box 6.21. The formulas given in Section 6.1.4 can be used to compute the
standard error of $b_1'$. The decision rule in Section 6.1.4.3 can be used to identify whether
the trend is statistically significant. Note that for the transformed data, the total number of
observations is N - 1.
Box 6.22
Correcting for Serial Correlation

Table 6.3 shows the concentration of benzene in 15 quarterly ground water
samples taken from a monitoring well at a former manufacturing site. It
appeared from a plot of the data (see Figure 6.18) that a simple linear model
of the form $y_i = \beta_0 + \beta_1 i + \epsilon_i$ might be appropriate in describing the
relationship between concentrations and time.

A regression analysis was performed on the data with the following results:
(a) the fitted model was estimated to be $\hat{y}_i = 29.20 - .478i$; (b) $R^2 = 0.73$;
(c) 95 percent confidence limits around the slope of the line were calculated
to be -0.478 ± (2.16)(.082), or -0.66 to -0.30; and (d) the Durbin-Watson
statistic was computed to be D = .795.

For N = 15 and p - 1 = 1 (there are two parameters in the model), the critical
value for the Durbin-Watson test is $d_U = 1.36$ at the .05 significance level.
Since D < 1.36, it was concluded that there was a significant
autocorrelation. Although the calculated confidence interval for the slope of the line
apparently indicated that the observed downward trend was significant, it
was recognized that the presence of autocorrelations could lead to erroneous
conclusions. Therefore, the data were re-analyzed using the method of
transformations described earlier in this section.

First, the serial correlation was computed from the residuals as $\hat{\phi}_{obs} = .57$.
Then the observed concentrations and time variable were transformed as
follows: $y_i' = y_i - .57 y_{i-1}$; and $x_i' = i - .57(i-1)$. A regression of $y_i'$ on $x_i'$
resulted in least squares estimates of $b_1' = -.34$ and $b_0' = 11.89$ for the
transformed variables, with $s(b_1') = .17$. Therefore, using equation (6.26),
estimates of the slope and intercept for the original data were calculated as

$$b_1 = b_1' = -.34, \quad\text{and}\quad b_0 = \frac{b_0'}{1 - \hat{\phi}_{obs}} = \frac{11.89}{1 - .57} = 27.65.$$

Note that the revised
estimates are close to the original estimates, except that now the standard
error of $b_1$ is much larger than it was before the effect of the autocorrelations
was taken into account in the analysis (.17 vs. .082). Because of this
increase in variance, 95 percent confidence limits around the true slope are
now given by -.34 ± (2.179)(.17), or -.71 to .03. In this case, the interval
includes zero, and therefore at the five percent significance level, we cannot
conclude that the observed trend is significant.
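
The sequence of calculations in Box 6.22 can be sketched in a few lines. The code below is illustrative and rests on two assumptions: the Durbin-Watson statistic is computed in its usual form, and the lag-1 serial correlation is computed as the sum of products of successive residuals divided by the sum of squares of the first N - 1 residuals (Box 5.14 is not reproduced here; this form gives approximately .57 for these data). Function and variable names are hypothetical. Because the script carries the unrounded serial correlation through the calculation, its results differ slightly from the rounded figures in the box.

```python
import math

# Benzene concentrations (ppb) from Table 6.3, coded quarters 1 through 15
BENZENE_PPB = [30.02, 29.32, 28.12, 28.32, 27.01, 24.78, 24.00, 23.78,
               24.25, 23.24, 21.98, 25.00, 24.10, 23.75, 23.00]
T_975_12DF = 2.179  # t value used in Box 6.22 for the transformed data

def ols(x, y):
    """Return (b0, b1, s(b1), residuals) for the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in resid) / (n - 2)
    return b0, b1, math.sqrt(mse / s_xx), resid

def durbin_watson(e):
    """Usual Durbin-Watson statistic computed from residuals e."""
    return sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e))) / sum(v * v for v in e)

def lag1_autocorrelation(e):
    """Assumed form of the serial correlation estimate (see lead-in)."""
    return sum(e[i] * e[i - 1] for i in range(1, len(e))) / sum(v * v for v in e[:-1])

quarters = list(range(1, 16))
b0, b1, se_b1, resid = ols(quarters, BENZENE_PPB)       # untransformed fit
d_stat = durbin_watson(resid)
phi = lag1_autocorrelation(resid)

# Transform both variables (one observation is lost), then refit
x_t = [quarters[i] - phi * quarters[i - 1] for i in range(1, 15)]
y_t = [BENZENE_PPB[i] - phi * BENZENE_PPB[i - 1] for i in range(1, 15)]
b0_t, b1_t, se_b1_t, _ = ols(x_t, y_t)

# Slope and intercept in terms of the original variables
b1_adj = b1_t
b0_adj = b0_t / (1.0 - phi)
ci = (b1_adj - T_975_12DF * se_b1_t, b1_adj + T_975_12DF * se_b1_t)

print(f"original fit: b0 = {b0:.2f}, b1 = {b1:.3f}, D = {d_stat:.3f}, phi = {phi:.2f}")
print(f"adjusted fit: b0 = {b0_adj:.2f}, b1 = {b1_adj:.2f}, s(b1) = {se_b1_t:.2f}")
print(f"95% limits for slope: {ci[0]:.2f} to {ci[1]:.2f}")
```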

Table 6.3 Benzene concentrations in 15 quarterly samples (see Box 6.22)

Year    Quarter    Coded          Concentration
                   quarter (i)    in ppb (y)
1985    First       1             30.02
1985    Second      2             29.32
1985    Third       3             28.12
1985    Fourth      4             28.32
1986    First       5             27.01
1986    Second      6             24.78
1986    Third       7             24.00
1986    Fourth      8             23.78
1987    First       9             24.25
1987    Second     10             23.24
1987    Third      11             21.98
1987    Fourth     12             25.00
1988    First      13             24.10
1988    Second     14             23.75
1988    Third      15             23.00
Figure 6.18 Plot of Benzene Data and Fitted Model (see Box 6.22)
[Plot: benzene concentration versus coded quarter (i), with the fitted model $\hat{y} = 29.2 - .478i$.]
6.2.4.3 Calculating the Confidence Interval for a Predicted Value
The general procedures in Section 6.1.4 can also be used to develop confidence
limits for the predicted concentration at arbitrary time h (as shown in Box 6.23).
Box 6.23
Constructing Confidence Limits around an Expected Transformed Value

Referring to the fitted model (6.25), use equation (6.22) to construct
confidence limits around the expected transformed value at time h:

$$y'_{h,upper} = \hat{y}_h' + t_{1-\alpha/2;\,N-3}\; s(\hat{y}_h')$$    (6.27)

and

$$y'_{h,lower} = \hat{y}_h' - t_{1-\alpha/2;\,N-3}\; s(\hat{y}_h')$$    (6.28)

where $\hat{y}_h' = b_0' + b_1' x_h'$; $x_h'$ = the value of the transformed time variable at time h;
and $s(\hat{y}_h')$ is the standard error of $\hat{y}_h'$ as computed from equation (6.21)
using the transformed data. Note that the "t value" used in the confidence
interval is based on N-3 (instead of N-2) degrees of freedom because we are
estimating an additional parameter (the serial correlation) from the data.

Since the limits given in equations (6.27) and (6.28) are in the transformed
scale, the upper and lower confidence limits in the original scale are given
by:

$$y_{h,upper} = y'_{h,upper} + \hat{\phi}_{obs}\, y_{h-1}$$    (6.29)

and

$$y_{h,lower} = y'_{h,lower} + \hat{\phi}_{obs}\, y_{h-1}.$$    (6.30)
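
A numerical sketch of these limits, using the rounded results reported in Box 6.22, is given below. Several points are assumptions made for illustration: the prediction is for the next quarter (h = 16), so that the preceding concentration is the last value in Table 6.3; the mean square error of the transformed fit is recovered from the reported standard error of the slope; the standard error of the predicted mean takes the usual simple linear regression form; and the back-transformation adds the serial correlation times the preceding concentration, as implied by the transformation in Box 6.19. Variable names are hypothetical.

```python
import math

# Rounded results reported in Box 6.22 for the transformed benzene data
B0_T, B1_T, SE_B1_T = 11.89, -0.34, 0.17
PHI = 0.57
N = 15                 # number of original observations
T_975_12DF = 2.179     # t value with N - 3 = 12 degrees of freedom
Y_LAST = 23.00         # observed concentration in quarter 15 (Table 6.3)

def transformed_time(i, phi=PHI):
    """Transformed time variable x_i' = i - phi*(i - 1)."""
    return i - phi * (i - 1)

# Transformed time values for i = 2, ..., N (one observation is lost)
x_t = [transformed_time(i) for i in range(2, N + 1)]
n_t = len(x_t)
x_bar = sum(x_t) / n_t
s_xx = sum((v - x_bar) ** 2 for v in x_t)

# Since s(b1') = sqrt(MSE' / Sxx'), the MSE of the transformed fit follows
mse_t = SE_B1_T ** 2 * s_xx

h = 16                                    # assumed prediction time
x_h = transformed_time(h)
y_hat_t = B0_T + B1_T * x_h               # expected transformed value
se_t = math.sqrt(mse_t * (1.0 / n_t + (x_h - x_bar) ** 2 / s_xx))

# Limits in the transformed scale
lower_t = y_hat_t - T_975_12DF * se_t
upper_t = y_hat_t + T_975_12DF * se_t

# Limits in the original scale
lower = lower_t + PHI * Y_LAST
upper = upper_t + PHI * Y_LAST
print(f"transformed limits: {lower_t:.2f} to {upper_t:.2f}")
print(f"original-scale limits at h = {h}: {lower:.2f} to {upper:.2f} ppb")
```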
6.3 Combining Statistical Information with Other Inputs to the
Decision Process
The statistical techniques presented in this chapter can be used to (1) determine
whether contaminant concentrations are decreasing over time, and/or (2) predict future
concentrations if present trends continue. Other factors must be used in combination with
these statistical results to decide whether the remedial effort has been successful, and when
treatment should be terminated. Several factors to consider are:

    Expert knowledge of the ground water at this site and experience
    with other remedial efforts at similar sites;

    The results of mathematical models of ground water flow and
    chemistry with sensitivity analysis and assessment of the accuracy
    of the modeling results; and

    Cost and scheduling considerations.

The sources of information above can be used to answer the following
questions:

    How long will it take for the ground water system to reach steady
    state before the sampling for the attainment decision can begin?

    What is the chance that the ground water concentrations will
    substantially exceed the cleanup standard before the ground water
    reaches steady state?

    What are the chances that the final assessment will conclude that the
    site attains the cleanup standard?

    What are the costs of (1) continuing treatment, (2) performing the
    assessment, and (3) planning for and initiating additional treatment if
    it is decided that the site does not attain the cleanup standard?

The answers to these questions should be made in consultation with both
statistical and ground water experts, managers of the remediation effort, and the regulatory
agencies.
6.4 Summary

This chapter discussed the use of regression methods for helping to decide
when to stop treatment. In particular, procedures were given for estimating the trend in
contamination levels and predicting contamination levels at future points in time. General
methods for fitting simple linear models and assessing the adequacy of the model were also
discussed.

In deciding when to terminate treatment, the chapter emphasized that:

    Interpreting the data is usually a multiple-step process of refining the
    model and understanding the data;

    Models are a useful but imperfect description of the data. The
    usefulness of a model can be evaluated by examining how well the
    assumptions fit the data, including an analysis of the residuals;

    Correlation between observations collected over time can be
    important and must be considered in the model;

    Changes in treatment over time can result in changes in variation
    and correlation, and can produce anomalous behavior which must be
    understood to make correct conclusions from the data; and

    Consultation with a ground water expert is advisable to help
    interpret the results and to decide when to terminate treatment.

Deciding when to terminate treatment should be based on a combination of
statistical results, expert knowledge, and policy decisions. Note that regression is only one
of various statistical methods that may be used to decide when treatment should be
terminated. Regression analysis was discussed in this document because of its relative
simplicity and wide range of applicability; however, this does not constitute an endorsement of
regression as a method of choice.
7. ISSUES TO BE CONSIDERED BEFORE STARTING
ATTAINMENT SAMPLING
After terminating treatment and before collecting water samples to assess
attainment, a period of time must pass to ensure that any transient effects of treatment on
the ground water system have sufficiently decayed. This period is represented by the
unshaded portion in the figure below. This chapter discusses considerations for deciding
when the sampling for the attainment decision can begin and provides statistical tests,
which can be easily applied, to guide this decision. The decision on whether the ground
water has reached steady state is based on a combination of statistical calculations, ground
water modeling, and expert advice from hydrogeologists familiar with the site.
Figure 7.1 Example Scenario for Contaminant Measurements During Successful Remedial Action
[Plot: measured ground water concentration versus date, with the start of treatment marked.]
The degree to which remediation efforts affect the ground water system at a site is difficult
to determine and depends on the physical conditions of the site and the treatment technolo-
gies used. As previously discussed, the ground water can only be judged to attain the
cleanup standard if both present and future contaminant concentrations are acceptable.
Changes in the ground water system due to treatment will affect the contaminant concentra-
tions in the sampling wells. For example, while remediation is in progress pumping can
alter water levels, water flow, and thus the level of contamination being measured at
monitoring wells. To adequately determine whether the cleanup standard has been attained,
the ground water conditions for sampling must approximate the expected conditions in the
future. Consequently, it is important to establish when the residual effects of the treatment
process (or any other temporary intervention) on the ground water appear to be negligible.

When this point is reached, sampling to assess attainment can be started and inferences on

attainment can be drawn. We will define the state of the ground water when temporary

influences no longer affect it as a "steady state." "Steady state," although sometimes

defined in the precise technical sense, is used here in a less formal manner as indicated in

Section 7.1.
7.1 The Notion of "Steady State"
The notion of "steady state" may be characterized by the following
components:

1.a. After treatment, the water levels and water flow, and the
corresponding variability associated with these parameters (e.g.,
seasonal patterns), should be essentially the same as for those from
comparable periods of time prior to the remediation effort.

1.b. In cases where the treatment technology has resulted in permanent
changes in the ground water system, such as the placement of slurry
wells, the hydrologic conditions may not return to their previous
state. Nevertheless, they should achieve a state of stability which is
likely to reflect future conditions expected at the site. For this steady
state, the residual effects of the treatment will be small compared to
seasonal changes.

2. The pollutant levels should have statistical characteristics (e.g., a
mean and standard deviation) which will be similar to those of future
periods.

The first component implies that it is important to establish estimates of the

ground water levels and flows prior to remediation or to predictively model the effect of

structures or other features which may have permanently affected the ground water.

Variables such as the level of ground water should be measured at the monitoring wells for

a reasonable period of time prior to remediation, so that the general behavior and character-

istics of the ground water at the site are understood.

The second component is more judgmental. Projections must be made as

to the future characteristics of the ground water and the source(s) of contamination, based
on available, current information. Of course, such projections cannot be made with cer-
tainty, but reasonable estimates about the likelihood of events may be established.

The importance of identifying when ground water has reached a steady state
is related to the need to make inferences about the future. Conclusions drawn from tests
assessing the attainment of cleanup standards assume that the current state of the ground
water will persist into the future. There must be confidence that once a site is judged clean,
it will remain clean. Achieving a steady state gives credence to future projections derived
from current data.
7.2 Decisions to be Made in Determining When a Steady State is
Reached

Immediately after remediation efforts have ended, the major concern is
determining when ground water achieves steady state. In order to keep expenditures of
time and money to a minimum, it is desirable to begin collecting data to assess attainment as
soon as one is confident that the ground water has reached a steady state.

When sampling to determine whether the ground water system is at steady
state, three decisions are possible:

1. The ground water has reached steady state and sampling for assessing
attainment can begin;

2. The measurements of contaminant concentrations during this period
indicate that the contaminant(s) are unlikely to attain the cleanup
standard and further treatment must be considered; or

3. More time and sampling must occur before it can be confidently
assumed that the ground water has reached steady state.

Next, various criteria will be considered that can be used in determining
whether a steady state has been reached.

7.3 Determining When a Steady State Has Been Achieved

In the following sections, qualitative and quantitative criteria involved in
making the decision as to whether the ground water has returned to a steady state following
remediation are discussed. Some of these criteria are based on a comparison of present
ground water levels with comparable levels before treatment. Others are based solely on
measurements and conditions after treatment has terminated. To a certain extent, the
decision as to when steady state has been reached is judgmental. It is not possible to prove
that a ground water system has achieved steady state. Thus, it is important to examine data
obtained from the ground water system to see if there are patterns which suggest that steady
state has not been achieved. If there are no such patterns (e.g., in the water level or speed
and direction of water flow), it may be reasonable to conclude that a steady state has been
reached.

Any data on the behavior of the ground water prior to the undertaking of
remediation may serve as a useful baseline, indicating what "steady state" for that system
had been and, thus, to what it might return. However, the actions of remediation and the
resulting physical changes in the area may change the characteristics of steady state. In this
case, such a comparison may be less useful. When it seems clear that steady state
characteristics have changed after remediation efforts, it is usually prudent to allow more time for
remediation effects to decay.

Collection of data to determine whether steady state has been achieved
should begin at the various monitoring wells at the site after remediation has been termi-
nated. The variables for which data will be obtained should include measures related to the
contaminant levels, the ground water levels, the speed and direction of the flow, and any
other measures that will aid in determining if the ground water has returned to a steady
state. The frequency of data collection will depend on the correlation among consecutively
obtained values (it is desirable to have a low correlation). A period of three months
between data collection activities at the wells may be appropriate if there appears to be some
correlation between observations. With little or no correlation, monthly observations may
prove useful. If the serial correlation seems to be high, the time interval between data
collection efforts should be lengthened. With little or no information about seasonal
patterns or serial correlations in the data, at least six observations per year are recom-
mended. After several years of data collection, this number of observations will allow an
assessment of seasonal patterns, trends, and serial correlation. It may be useful to consult
with a statistician if there is some concern about the appropriate sampling frequency.
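
As a rough aid to the judgment described above, the serial correlation between consecutive observations can be estimated from the data already collected. The following is a minimal sketch, assuming equally spaced observations with no missing values; the function name and the example readings are hypothetical, and the text gives no numerical cutoff for "low" or "high" correlation, so none is built in.

```python
from statistics import mean


def lag1_autocorrelation(values):
    """Sample lag-1 serial correlation of equally spaced observations."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    centre = mean(values)
    deviations = [v - centre for v in values]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("all observations are identical")
    # Sum of products of each deviation with the next one in time.
    numerator = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    return numerator / denominator


# Hypothetical bimonthly concentration readings from one well (illustration only).
readings = [0.52, 0.48, 0.55, 0.61, 0.58, 0.50, 0.47, 0.53, 0.60, 0.57]
r1 = lag1_autocorrelation(readings)
print(f"estimated lag-1 serial correlation: {r1:.2f}")
```

A larger estimate would argue for a longer interval between sampling events; deciding how large is too large remains a matter for the site statistician.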

All data collected should be plotted over time in order to permit a visual
analysis of the extent to which a steady state exists for the ground water. In Section 7.4,
the charting of data and the construction of plots are discussed. Section 7.4.3 provides
illustrations of such plots and their interpretation. In Section 7.4.4, statistical tests that can
be employed for identifying departures from randomness (e.g., trends) in the data are
indicated. Suggestions for seasonally adjusting data prior to plotting are provided, and
graphical methods are discussed.

7.3.1 Rough Adjustment of Data for Seasonal Effects

One concern in applying graphical techniques is that the data points being
plotted are assumed to be independent of each other. Even if the serial correlation between
observations is low, there may be a seasonal effect on the observations. For example,
concentrations may be typically higher than the overall average in the spring and lower in
the fall. To adjust for seasonal effects, one may subtract a measure of the "seasonal"
average from each data value and then add back the overall average (Box 7.1). The addi-
tion of the overall average will bring the adjusted values back to the original levels of the
variable to maintain the same reference frame as the original data.
Box 7.1
Adjusting for Seasonal Effects

Suppose we let x_{jk} be the jth individual data observation in year k, \bar{x}_j be the
average for period j obtained from the baseline period prior to treatment,
and \bar{x} be the overall average for all data collected for the baseline
period. For example, if six data values per year have been collected
bimonthly for each of three years during the baseline period, six \bar{x}_j values
would be computed, each based on three data points taken from the three
different years for which data were collected. The value \bar{x} would be
computed over all 18 data values. The adjusted jth data observation in year
k, x'_{jk}, can then be computed from:

x'_{jk} = x_{jk} - \bar{x}_j + \bar{x}    (7.1)

If there are missing values, calculate \bar{x}_j as in Box 5.4.

Plot the values of x'_{jk} versus time. In examining these plots, checks for
runs and trends can be made for the adjusted values.
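
The adjustment in Box 7.1 can be sketched in a few lines. This is an illustration only, assuming complete data (no missing values) arranged as one list per year with one value per period; the function and variable names are hypothetical.

```python
from statistics import mean


def seasonal_adjust(baseline, observations):
    """Apply equation (7.1): adjusted = observed - period average + overall average.

    baseline     -- list of years, each a list of per-period baseline values
    observations -- list of years to adjust, in the same per-period layout
    """
    periods = len(baseline[0])
    # Average for each period j, taken across the baseline years.
    period_means = [mean([year[j] for year in baseline]) for j in range(periods)]
    # Overall average of every baseline value.
    overall_mean = mean([value for year in baseline for value in year])
    return [
        [x - period_means[j] + overall_mean for j, x in enumerate(year)]
        for year in observations
    ]


# Hypothetical baseline: two years, three periods per year.
baseline_years = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
adjusted = seasonal_adjust(baseline_years, [[2.0, 3.0, 4.0]])
print(adjusted)
```

In this small example the period averages are 2, 3, and 4 and the overall average is 3, so a year that exactly follows the seasonal pattern adjusts to a flat series at the overall average.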

7.4 Charting the Data

In general, it is useful to plot the data collected from a monitoring program.
Such plots are similar to "control charts" often used to monitor industrial processes, except
control limits will not appear on the charts discussed here. Use the horizontal, or X-axis,
to indicate the time at which the observation was taken; and use the vertical, or Y-axis, to
indicate the value of the variable of interest (e.g., the contaminant level or water table level
or the value of other variables after adjustment for seasonal effects). Figure 7.2 gives an
example of a plot which may be used to assess stability during the period immediately
following treatment.

Notice that in Figure 7.2, the "prior average" has also been placed on the
plot. This line represents the average of the baseline data collected before remediation
efforts began. For example, this value could be the average of eight points collected
quarterly over a two-year period. It may also be useful to plot separately the individual
observations gathered to serve as the baseline data, so that information reflecting seasonal
variability and the degree of serial correlation associated with the baseline period can be
readily examined.

Figure 7.2 Example of Time Chart for Use in Assessing Stability
[Plot of the monitored variable (vertical axis, 0 to 1) against time (recorded quarterly), with a horizontal line marking the prior average.]

7.4.1 A Test for Change of Levels Based on Charts

If the ground water conditions after remediation are expected to be comparable
to the prior conditions, we would expect the behavior of water levels and flows to
resemble that of those same variables prior to the remediation effort in terms of average and
variability. One indication that a steady state may not have been reached is the presence of
a string of measurements from the post-treatment period which are consistently above or
below the average prior to beginning remediation. A common rule of thumb used in indus-
trial Statistical Process Control (SPC) is that if eight consecutive points are above or below
the average (often called a "run" in SPC terminology), the data are likely to come from a
different process than that from which the average was obtained (Grant and Leavenworth,
1980). This rule is based on the assumption that the observations are independent. This
assumption is not strictly applicable in ground water studies since there is likely to be serial
correlation between observations as well as seasonal variability. Assuming independent
observations, an eight-point run is associated with a 1 in 128 chance of concluding that the
mean of the variable of interest has changed when, in fact, there has been no change in the
mean.

The above discussion suggests that for the purpose of deciding whether the
ground water has achieved steady state, a string of 7 to 10 consecutive points above or
below the prior average might serve as evidence indicating that the state of the ground water
is different from that in the baseline period. If it is suspected that a high degree of serial
correlation exists, it would be appropriate to require a larger number of consecutive points.
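
The run rule above lends itself to a simple check. The sketch below is one possible reading of the rule, assuming independent observations; the function names are hypothetical, and treating a point exactly equal to the prior average as ending the run is an assumption not stated in the text.

```python
def longest_run_about_average(values, prior_average):
    """Length of the longest string of consecutive points on one side of the prior average."""
    longest = current = 0
    previous_side = 0
    for v in values:
        # +1 if above the prior average, -1 if below, 0 if exactly equal.
        side = (v > prior_average) - (v < prior_average)
        if side != 0 and side == previous_side:
            current += 1
        elif side != 0:
            current = 1
        else:
            current = 0  # assumption: a tie with the average breaks the run
        previous_side = side
        longest = max(longest, current)
    return longest


def run_false_alarm_probability(run_length):
    """Chance that a given string of run_length independent points all fall on the same side."""
    return 2 * 0.5 ** run_length


print(run_false_alarm_probability(8))  # 0.0078125, i.e. 1 in 128
```

The second function reproduces the 1 in 128 figure for an eight-point run; with serially correlated data the true false-alarm rate will be higher, which is why the text suggests requiring longer runs in that case.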

7.4.2 A Test for Trends Based on Charts

The charts described here provide a simple way of identifying trends. If six
consecutive data points are increasing (or decreasing) (see footnote 1), sometimes stated as "5
consecutive intervals of data" so that it is understood that the first point in the string is to be
counted, then there is evidence that the variable being monitored (e.g., water levels or
flows, or contaminant concentrations) has changed (exhibits a trend). Again, independence
of the observations is assumed. A group of consecutive points that increase in value is
sometimes referred to as a "run up," while a group of consecutive points that decrease in
value is referred to as a "run down."

Footnote 1: This rule of 6 is based on the assumption that all 720 orderings of the points are
equally likely. This is not always true. Hence such rules are to be considered only as quick
but reasonable approximations.

With the rule of six consecutive data points described above, the chance of
erroneously concluding that a trend exists is only 1 in 360, or about 0.3 percent. In
contrast, a rule based on five consecutive points has a 1 in 60 chance (about 1.7 percent) of
erroneously concluding that there is a trend, while a rule based on seven consecutive points
would have a corresponding 1 in 2,520 chance (0.04 percent) of erroneously concluding
that there is a trend. Thus, depending on the degree of serial correlation expected, a "trend"
of 5 to 7 points may suggest that the ground water levels and flows are not at steady state.
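
The arithmetic behind these chances, and the search for a run up or run down, can be sketched as follows. This is a minimal illustration assuming that all orderings of the points are equally likely, as in the rule of six; the function names are hypothetical, and treating two equal consecutive values as ending the run is an assumption.

```python
from math import factorial


def longest_monotone_run(values):
    """Number of points in the longest strictly increasing or strictly decreasing stretch."""
    if not values:
        return 0
    longest = current = 1
    previous_direction = 0
    for earlier, later in zip(values, values[1:]):
        # +1 for a rise, -1 for a fall, 0 when the two values are equal.
        direction = (later > earlier) - (later < earlier)
        if direction != 0 and direction == previous_direction:
            current += 1
        elif direction != 0:
            current = 2  # a new run starts with these two points
        else:
            current = 1  # assumption: equal values end the run
        previous_direction = direction
        longest = max(longest, current)
    return longest


def false_trend_probability(points):
    """Chance that `points` values in random order are all increasing or all decreasing."""
    return 2 / factorial(points)


for k in (5, 6, 7):
    print(k, f"1 in {factorial(k) // 2}", f"{100 * false_trend_probability(k):.2f} percent")
```

Of the k! orderings of k distinct values, exactly one is entirely increasing and one entirely decreasing, which gives 2/120, 2/720, and 2/5,040 for runs of five, six, and seven points.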

In practice, data for many ground water samples may be collected before
any significant runs are identified. For example, in a set of 30 monthly ground water flow
rate measurements, there may be a run up of seven points and several shorter runs. Such
patterns of runs can be analyzed by examining the length or number of runs in the series.
Formal statistical procedures for analyzing trends in a time series are given by Gilbert
(1987).

A quick check for a general trend over a long period of time can be accomplished
as follows. Divide the total number of data points available, N, by 6. Take the
closest integer smaller than N/6 and call it I. Then select the Ith data value over time, the
2(I)th, the 3(I)th, etc. For example, if N = 65, then I = 10, and we would select the 10th,
20th, etc., points over time. If six consecutive selected points are increasing or decreasing
over time, there is evidence of a trend. This test will partially compensate for serial
correlation.
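
The quick check can be written out as a short routine. The sketch below is an illustration only; the function name is hypothetical, and taking I as the integer part of N/6 (so that N = 60 gives I = 10) is an assumption about how "closest integer smaller than N/6" should be read when N is an exact multiple of 6.

```python
def quick_trend_check(values):
    """Select every Ith point (I = integer part of N/6) and test the six points for a trend.

    Returns (I, selected_points, trend_found).
    """
    n = len(values)
    step = n // 6  # assumption: integer part of N/6
    if step < 1:
        raise ValueError("need at least six observations")
    # The Ith, 2(I)th, ..., 6(I)th observations, counting from 1.
    selected = [values[step * i - 1] for i in range(1, 7)]
    increasing = all(a < b for a, b in zip(selected, selected[1:]))
    decreasing = all(a > b for a, b in zip(selected, selected[1:]))
    return step, selected, increasing or decreasing


# Hypothetical series of 65 steadily rising values, matching the N = 65 example.
step, selected, trend = quick_trend_check(list(range(65)))
print(step, selected, trend)
```

With N = 65 the routine picks the 10th, 20th, ..., 60th points, as in the example in the text.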
7.4.3 Illustrations and Interpretation

Once the plotting of data has begun, there are various patterns that may
appear. Figures 7.3 through 7.8 represent six charts which indicate possible patterns that
may be encountered. Evidence of departures from stability is being sought. The first five
charts, except Figure 7.4, indicate evidence of instability (or in the cases of Figures 7.5 and
7.6, suspicions of possible instability), i.e., changes in characteristics over time.
Figure 7.3 shows "sudden" apparent outliers or spikes that indicate unexpected variability
in the variable being monitored. Figure 7.4 illustrates a six-point trend in the variable being
monitored. Figures 7.5 and 7.6 suggest that a trend may exist but there is insufficient
evidence to substantiate it. Attention should be paid to the behavior of subsequent data in
these cases. (In particular, the data in Figure 7.5 could indicate a general trend using the
"quick check" discussed in the previous section depending on the randomly selected set of
points included in the test.) Figure 7.7 reflects a change (around observation 15) in both
variability (the spread of the data becomes much greater) and average (the average appears
to have increased). Figure 7.8 indicates a variable that appears to be stable.

In interpreting the plots, the return to a steady state will generally be indi-
cated by a random scattering of data points about the prior average. The existence of
patterns such as runs or trends suggests instability. Patterns associated with seasonality
and serial correlation should be consistent with those seen prior to remediation. At the
very least, the average value for levels of contaminants after remediation should be lower
than that prior to remediation. A run below the prior average for contaminant level
measures would certainly not be evidence that the ground water is not at steady state, since
the whole point of the remediation effort is to reduce the level of contamination. A trend
downwards in contamination levels may be an indication that a steady state has not been
reached. Nevertheless, if substantial evidence suggests that this decline or an eventual
leveling off will be the future state of that contaminant on the site, tests for attainment of the
cleanup standards would be appropriate.

On the other hand, if it seems that the average contamination level after
remediation will be above the prior average or that there is a consistent trend upwards in
contamination levels, it may be decided that the previous remediation efforts were not
totally successful, and further remediation efforts must be undertaken. This may be done
with a minimal amount of data, if, based on the data available, it appears unlikely that the
cleanup standard will be met. However, what should be taken into account is the relative
cost of making the wrong decision. Two costs should be weighed against each other: the
cost of obtaining further observations from the monitoring wells if it turns out that the
decision to resume remediation is made at a later date (the loss here is in terms of time and
the cost of monitoring up to the time that remediation actually is resumed) against the cost
of resuming remediation when in fact a steady state would eventually have been achieved
(the loss here is in terms of the cost of unnecessary cleanup effort and time). In addition,
the likelihood of making each of these wrong decisions, as estimated based on the available
information, should be incorporated into the decision process.
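
One simple way to combine the two costs with their likelihoods is an expected-loss comparison. The sketch below is purely illustrative: the function name, the probability, and the cost figures are all hypothetical, and a real decision would rest on site-specific estimates and professional judgment rather than on this arithmetic alone.

```python
def expected_losses(p_remediation_needed, cost_of_delay, cost_of_unneeded_cleanup):
    """Expected loss of each wrong decision, weighted by how likely it is to be wrong.

    p_remediation_needed     -- estimated chance that remediation must eventually resume
    cost_of_delay            -- time and monitoring cost lost by waiting before resuming
    cost_of_unneeded_cleanup -- cost of resuming when steady state would have been reached
    """
    # Continuing to monitor is the wrong choice only if remediation is needed after all.
    loss_if_wait = p_remediation_needed * cost_of_delay
    # Resuming remediation is the wrong choice only if it was not needed.
    loss_if_resume = (1 - p_remediation_needed) * cost_of_unneeded_cleanup
    return loss_if_wait, loss_if_resume


# Hypothetical inputs: 25 percent chance remediation is needed, costs in arbitrary units.
wait_loss, resume_loss = expected_losses(0.25, 100, 400)
print(wait_loss, resume_loss)
```

Under these made-up numbers, continuing to monitor carries the smaller expected loss; different probabilities or costs can reverse the comparison.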

Figure 7.3 Example of Apparent Outliers

Figure 7.4 Example of a Six-point Upward Trend in the Data

7.4.4 Assessing Trends via Statistical Tests

The discussion in Section 7.4.3 considered graphical techniques for
exploring the possible existence of trends in the data. Regression techniques discussed in
Chapter 6 provide a formal statistical procedure for considering possible trends in the
data.

Other formal procedures for testing for trends also exist. Gilbert (1987)
discusses several of them, such as the Seasonal Kendall Test, Sen's Test for Trend, and a
Test for Global Trends (the original articles in which these tests are described were: Hirsch
and Slack, 1984; Hirsch, Slack, and Smith, 1982; Farrell, 1980; and van Belle and
Hughes, 1984).

The Seasonal Kendall Test provides a test for trends that removes seasonal
effects. It has been shown to be applicable in cases where monthly observations have been
gathered for at least three years. The degree to which critical values obtained from a normal
table approximate the true critical values apparently has not been established for other time
intervals of data collection (e.g., quarterly or semi-annually). This test would have to be
carried out for each monitoring well separately at a site. Sen's Test for Trend is a more
sensitive test for detecting monotonic trends if seasonal effects exist, but requires more
complicated computations if there are missing data. The Test for Global Trends provides
the capability for looking at differences between seasons and between monitoring wells, at
season-well interactions, and also provides an overall trend test. All three of these tests
(the Seasonal Kendall, Sen's, and the Global tests) require the assumption of independent
observations. (Extensions of these tests allowing for serial correlations require that much
more data be collected; for example, roughly 10 years' worth of monthly data for the
Seasonal Kendall test extension.) If this assumption is violated, these tests tend to indicate
that a trend exists at a higher rate than specified by the chosen α level when it actually does
not. Thus, these tests may provide useful tools for detecting trends, but the finding of a
trend via such a test may not necessarily represent conclusive evidence that a trend exists.
Gilbert provides a detailed discussion of all three tests as well as computer code that can be
used for implementing the tests. However, this discussion does not consider the power of
these trend tests, i.e., the likelihood that such tests identify a trend when a trend actually
exists. If the power of these tests is low, existing trends may not be
detected in a timely fashion.
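
To give a sense of what the Seasonal Kendall Test computes, the sketch below implements its commonly published form: a Mann-Kendall statistic is calculated within each season, the statistics and their variances are summed across seasons, and a continuity-corrected normal approximation is applied. It is not the computer code referred to above. It assumes complete data with no tied values and independent observations, and the function name and data layout are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def seasonal_kendall(data):
    """Seasonal Kendall statistic for data[k][j] = value in season j of year k.

    Assumes no missing values and no ties. Returns (S, Z, two-sided p-value).
    """
    years = len(data)
    seasons = len(data[0])
    total_s = 0
    total_var = 0.0
    for j in range(seasons):
        series = [data[k][j] for k in range(years)]
        # Mann-Kendall S for this season: signs of all later-minus-earlier differences.
        for a in range(years - 1):
            for b in range(a + 1, years):
                diff = series[b] - series[a]
                total_s += (diff > 0) - (diff < 0)
        # Variance of S for one season when there are no ties.
        total_var += years * (years - 1) * (2 * years + 5) / 18
    # Continuity correction before referring to the normal table.
    if total_s > 0:
        z = (total_s - 1) / sqrt(total_var)
    elif total_s < 0:
        z = (total_s + 1) / sqrt(total_var)
    else:
        z = 0.0
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return total_s, z, p_two_sided


# Hypothetical data: three years of quarterly values that rise every year.
example = [[1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5], [2.0, 3.0, 4.0, 5.0]]
print(seasonal_kendall(example))
```

Because each season is compared only with itself across years, a regular seasonal cycle does not by itself produce a large statistic; a steady rise or fall from year to year does.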

7.4.5 Considering the Location of Wells

In addition to assessing the achievement of steady state in a well over time,
it is also useful to consider the comparison of water and contamination levels across wells
at given points in time. This can readily be done by constructing either (1) a scatter plot
with water or contamination levels on the vertical axis and the various monitoring wells
indicated on the horizontal axis, or (2) a contour plot of concentrations or
water levels across the site and surrounding area. Commercial computer programs are
available for preparing contour plots. In particular, see the discussion in Volume 1 (Chapter
10) on kriging. If there are large, unexpected differences in water or contamination levels
between wells, this may suggest that steady state has not yet been reached.

7.5 Summary

Finding that the ground water has returned to a steady state after terminating
remediation efforts is an essential step in the establishment of a meaningful test of whether
or not the cleanup standards have been attained. There are uncertainties in the process, and
to some extent it is judgmental. However, if an adequate amount of data are carefully
gathered prior to beginning remediation and after ceasing remediation, reasonable decisions
can be made as to whether or not the ground water can be considered to have reached a
state of stability.

The decision on whether the ground water has reached steady state will be
based on a combination of statistical calculations, plots of data, ground water modeling,
use of predictive models, and expert advice from hydrogeologists familiar with the site.

Figure 7.5 Example of a Pattern in the Data that May Indicate an Upward Trend

Figure 7.6 Example of a Pattern in the Data that May Indicate a Downward Trend

Figure 7.7 Example of Changing Variability in the Data Over Time

Figure 7.8 Example of a Stable Situation with Constant Average and Variation

8. ASSESSING ATTAINMENT USING FIXED SAMPLE SIZE TESTS

After the remediation effort and after the ground water has achieved steady
state, water samples can be collected to determine whether the contaminant concentrations
attain the relevant cleanup standards. The sampling and evaluation period for making this
attainment decision is represented by the unshaded portion in the figure below.

Figure 8.1 Example Scenario for Contaminant Measurements During Successful Remedial Action
[Plot of measured ground water concentration, with the start of treatment marked.]

In this chapter statistical procedures are presented for assessing the attainment
of cleanup standards for ground water at Superfund sites. As discussed previously,
the procedures presented are suitable for assessing the time series of chemical concentrations
measured in individual wells relative to a cleanup standard. Note that attainment
objectives, as discussed in Chapter 3, must be specified by those managing the site
remediation before the sampling for assessing attainment begins.

The collection of samples for assessing attainment of the cleanup standards
will occur after the remedial action at the site has been completed and after a subsequent
period has passed to allow transient effects due to the remediation to dissipate. This will
allow the ground water concentrations, flows, and water table levels to reach equilibrium
with the surrounding environment. It will be important to continue to chart the ground
water data to monitor the possibility of unexpected departures from an apparent steady
state. Some such departures are illustrated in Figures 7.3 through 7.7.

The attainment decision is an assessment of whether the post-cleanup
contaminant concentrations are acceptable compared to the cleanup standard and whether
they are likely to remain acceptable. To assess whether the contaminant concentrations are
likely to remain acceptable, the statistical procedures provide methods for determining
whether or not a long-term average concentration or a long-term percentile of the
concentrations in the well is below the established cleanup standard.

It is assumed in this chapter that the periodic or seasonal patterns in the data
repeat on a yearly cycle. It may be that another, perhaps shorter, period of time would be
appropriate. In such a case, the reference to "yearly" averages may be adjusted by the
reader to reflect the appropriate period of time for the site under consideration. In the text,
mention of alternative "seasonal cycles or periods" indicates where such adjustments may
be appropriate.

This chapter presents statistical procedures for determining whether:

1. The mean concentration is below the cleanup standard; or

2. A selected percentile of all samples is below the cleanup standard
(e.g., does the 90th percentile of the distribution of concentrations
fall below the cleanup standard?).

Many different statistical procedures can be used to assess the attainment of
the cleanup standard. The procedures presented here have been selected to provide reasonable
results with a small sample size in the presence of correlated data. They require
minimal statistical background and expertise. If other procedures are considered, consultation
with a statistician is recommended. In particular, in the unlikely event that the
measurements are not serially correlated, the methods presented in Chapter 5 which assume
a random sample can be used.

The procedures presented are of two types: fixed sample size tests are
discussed in this chapter, and sequential tests are discussed in Chapter 9. Figure 8.2 is a
flow chart outlining the steps involved in the cleanup process when using a fixed sample
size test. Section 8.6 discusses testing for trends if the levels of contaminants are
acceptable.

Figure 8.2 Steps in the Cleanup Process When Using a Fixed Sample Size Test
[Flow chart boxes: Start; Wait for Ground Water to Reach Steady State; Specify Sample Design; Collect the Data; Determine if the Ground Water Attains the Cleanup Standard; Reassess Cleanup Technology.]

8.1 Fixed Sample Size Tests

This chapter discusses assessing the attainment of cleanup standards using a
test based on a predetermined sample size. For a fixed sample size test, the ground water
samples are collected on a regular schedule, such as every two months, for a predetermined
number of years. After all the data have been collected, the data are analyzed to determine
whether the concentrations in the ground water attain the cleanup standard. Even if the
initial measurements suggest that the ground water may attain the cleanup standard, all
samples must be collected before the statistical test can be performed. An advantage of this
approach is that the number of samples required to perform the statistical test will be known
before the sampling begins, making some budgeting and planning tasks easier than when
using a sequential test (Chapter 9).

Three procedures are presented for testing the mean when using fixed
sample size tests. The first and second procedures use yearly average concentrations.
The first method, based on the assumption that the yearly means have a normal distribu-
tion, is recommended when there are missing values in the data and the missing values are
not distributed evenly throughout the year. The second procedure assumes that the distri-
bution of the yearly average is skewed, similar to a lognormal distribution, rather than
symmetric. If there are few or no missing values, the second method using the log trans-
formed yearly averages is recommended even if the data are not highly skewed. The third
method requires calculation of seasonal effects and serial correlations to determine the
variance of the mean. Because the third method is sensitive to the skewness of the data, it
is recommended only if the distribution of the residuals is reasonably symmetric.
Regardless of the procedure used, the sample size for assessing the mean should be deter-
mined using the steps described in Section 8.2.1.
8.2 Determining Sample Size and Sampling Frequency

Whether the calculation procedures used for assessing attainment use yearly
averages or individual measurements, the formulas presented below for determining the
required sample size use the characteristics of the individual observations. In the unlikely
event that many years of observations are available for estimating the variance of the yearly
averages, the number of years of sampling (using the same sample frequency as in the
available data) can also be determined from the yearly averages using equation (5.35). The
following sections discuss the calculation of sample size for testing the mean and testing
proportions.

8.2.1 Sample Size for Testing Means

The equations for determining sample size require the specification of the
following quantities: Cs, μ1, α, and β (see Sections 3.6 and 3.7) for each chemical under
investigation. In addition, estimates of the serial correlation φ between monthly observations
and the standard deviation
Box 8.1
Steps for Determining Sample Size for Testing the Mean

(1) Determine the estimates of σ and φ which describe the data. Denote
these estimates by σ̂ and φ̂.

(2) Estimate the ratio of the annual overhead cost of maintaining
sampling operations at the site to the unit cost of collecting, processing,
and analyzing one ground water sample. Call this ratio $R.

(3) Based on the values of $R and φ̂, use Appendix Table A.4 to determine
the approximate number, n_p, of samples to collect per year or
seasonal period. The value n_p may be modified based on site-specific
considerations, as discussed in the text.

(4) The sampling frequency (i.e., the number of samples to be taken per
year) is n_p or 4, whichever is larger. Denote this sampling
frequency as n. Note that, under this rule, at least four samples per
year per sampling well will be collected.

(5) For given values of n and φ̂, determine a "variance factor" from
Appendix Table A.5. Denote this factor by F. For example, for
φ̂ = 0.4 and n = 12, the factor is F = 5.23.

(6) A preliminary estimate of the required number of years to sample,
m_d, is
(8.1)
where z_{1-α} and z_{1-β} are the critical values from the normal distribution
with probabilities of 1-α and 1-β (Table A.2).

(7) The number of years of data will be denoted by m and will be
determined by rounding m_d to the next highest integer. The total
number of samples per well will be N = nm.
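
The bookkeeping in steps (4) and (7) of Box 8.1 can be sketched as below. This illustration takes n_p (read from Table A.4) and the preliminary estimate m_d (from equation 8.1) as given inputs and does not attempt to reproduce the tables or the equation; the function name and the value of m_d used in the example are hypothetical.

```python
from math import ceil


def sampling_plan(n_p, m_d):
    """Return (n, m, N) following steps (4) and (7) of Box 8.1.

    n_p -- suggested samples per year from Table A.4 (supplied by the user)
    m_d -- preliminary number of years from equation (8.1) (supplied by the user)
    """
    n = max(n_p, 4)   # step (4): at least four samples per year
    m = ceil(m_d)     # step (7): round up to a whole number of years
    return n, m, n * m  # total samples per well, N = nm


# n_p = 9 matches the Table A.4 example in the text; m_d = 3.2 is a made-up value.
print(sampling_plan(9, 3.2))
```

With these inputs the plan is nine samples per year for four years, or 36 samples per well.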

Appendix Table A.4 shows the approximate number of observations per
year (or period) which will result in the minimum overall cost for the assessment (see
Appendix F for the basis for Table A.4). Note that the sampling frequencies given in Table
A.4 are approximate and are based on numerous assumptions which may only approximate
the situation and costs at a particular Superfund site. Using the table requires knowledge of
the serial correlations between observations separated by one month (or one-twelfth of the
seasonal cycle) and the cost of extending the sampling period for one more year relative to
taking an additional ground water sample.

Find the column in Table A.4 that is closest to the estimate of $R being
used. Find the row which most closely corresponds to φ̂. Denote the tabulated value by
n_p. For example, suppose that the cost ratio is estimated to be 25 and φ̂ = 0.3. Then from
Table A.4 under the fifth column (ratio = 20), n_p = 9. Since the costs and serial correlations
will not be known exactly, the sample frequencies in Table A.4 should be considered
as suggested frequencies. They should be modified to a sampling frequency which can be
reasonably implemented in the field. For example, if collecting a sample every month and a
half (n_p = 8) will allow easy coordination of schedules, n_p can be changed from 9 to 8.

For determination of sample frequency, these quantities need not be precise.
If there are several compounds to be measured in each sample, calculate the sample
frequency for each compound. Use the average sample frequency for the various
compounds.

It is recommended that at least four samples per year (or seasonal period) be
collected to reasonably reflect the variability in the measured concentration within the year.
Therefore, the sampling frequency (i.e., number of samples to be taken per year) is the
maximum of four and np. Denote the sampling frequency by n. Note that, under this rule,
at least four samples per year per sampling well will be collected.

As more observations per year are collected, the number of years of
sampling required for assessing attainment can be reduced. However, there are limits to
how much the sampling time can be reduced by increasing the number of observations per
year. If the cost of collecting, processing, and analyzing the ground water samples is very
small compared to the cost of maintaining the overall sampling effort, many samples can be
collected each year and the primary cost of the assessment sampling will be associated with
maintaining the assessment effort until a decision is reached. On the other hand, if the cost
of each sample is very large and a monitoring effort is to be maintained at the site regardless
of the attainment decision, the costs of waiting for a decision may be minimal and the
sampling frequency should be specified so as to minimize the sample collection, handling,
and analysis costs. It should be noted that it is assumed that the ground water remains in
steady state throughout the period of data collection.
CHAPTER 8: ASSESSING ATTAINMENT USING FIXED SAMPLE SIZE TESTS

The frequency of sampling discussed in this document is the simplest and
most straightforward to implement: determine a single time interval between samples and
select a sample at all wells of interest after that period of time has elapsed (e.g., once every
month, once every six weeks, once a quarter, etc.). However, there are other approaches
to determining sampling frequency. For example, site specific data may suggest that time
intervals should vary among wells or groups of wells in order to achieve approximately the
same precision for each well. Considering such approaches is beyond the scope of this
document, but the interested reader may consult articles such as Ward, Loftis, Nielsen,
and Anderson (1979), and Sanders and Adrian (1978). It should be noted that these
articles are oriented around issues related to sampling surface water rather than ground
water, but many of the general principles apply to both. In general, consultation with a
statistician is recommended when establishing sampling procedures.

Use the sample frequency per year, the estimated serial correlation between
monthly observations, and Appendix Table A.5 to determine a "variance factor" for
estimating the required sample size. For the given values of n and φ̂, determine the variance
factor in Table A.5. Denote this factor by F. For example, for φ̂ = 0.4 and n = 12, the
factor is F = 5.23. For values of φ̂ and n not listed in Table A.5, interpolation between
listed values may be used to determine F. Alternatively, if a conservative approach is
desired (i.e., to take a larger sample of data), take the smaller value of F associated with
the adjacent listed values of φ̂ and n. For values outside the range of values covered in
Table A.5, see Appendix F.

A preliminary estimate of the required number of years of sampling, m_d, is
given by equation (8.1):

\[ m_d = \frac{\hat{\sigma}^2}{F} \left[ \frac{z_{1-\beta} + z_{1-\alpha}}{Cs - \mu_1} \right]^2 + 2 \qquad (8.1) \]

The first ratio in this equation is the estimated variance of the yearly average,
σ̂_x̄² = σ̂²/F. The final addition of 2 to the sample size estimate improves the
estimate with small sample sizes (see Appendix F).

Because the statistical tests require a full year's worth of data, the number of
years of data collection, m_d, is rounded to the next highest integer, m. Thus, n samples
will be collected in each of m years, for a total number of samples per well of N, where N is
the product m × n. An example of using these procedures to calculate sample size for testing
the mean is provided in Box 8.2.
Box 8.2
Example of Sample Size Calculation for Testing the Mean

Suppose that, for α = .01, it is desired to detect a difference of .2 ppm
from the cleanup standard of .5 ppm (for example: Cs = .5, μ1 = .3) with a
power of .80 (i.e., β = .20). Also suppose that the ratio of annual overhead
costs to per-unit sampling and analysis costs ($R) is close to 10. Further, it
is estimated that σ̂ = .43 and φ̂ = .20. Then for φ̂ = .20 and cost ratio ($R) = 10,
Table A.4 gives n_p = 9. For n_p = 9 and φ̂ = .20, F = 7.17 from Table A.5.
Further, using equation (8.1) to determine the number of years, m_d, to collect
data, we find

\[ m_d = \frac{\hat{\sigma}^2}{F} \left[ \frac{z_{1-\beta} + z_{1-\alpha}}{Cs - \mu_1} \right]^2 + 2 = \frac{(.43)^2}{7.17} \left[ \frac{.842 + 2.326}{.5 - .3} \right]^2 + 2 = 8.47 \]

where z_{1-β} = .842 and z_{1-α} = 2.326, as can be found from Table A.2 or any
normal probability table.

Rounding up gives a sampling duration of nine years and a total sample size
of 9 × 9 = 81 samples.
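
The arithmetic in Box 8.2 can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that equation (8.1) has the form m_d = (σ̂²/F)[(z_{1-β} + z_{1-α})/(Cs - μ1)]² + 2, which is consistent with the description of the equation and with the numbers in Box 8.2. The function name `years_of_sampling` is hypothetical, and the values of n_p and F are read from Tables A.4 and A.5 rather than computed.

```python
import math
from statistics import NormalDist


def years_of_sampling(sigma_hat, F, cs, mu1, alpha, beta):
    """Return (m_d, m): the preliminary and rounded-up number of years.

    Assumes the form of equation (8.1) stated in the lead-in above.
    """
    z_beta = NormalDist().inv_cdf(1 - beta)    # z_{1-beta}
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # z_{1-alpha}
    m_d = (sigma_hat ** 2 / F) * ((z_beta + z_alpha) / (cs - mu1)) ** 2 + 2
    return m_d, math.ceil(m_d)


# Inputs from Box 8.2; n_p and F are table look-ups (Tables A.4 and A.5).
n_p = 9
n = max(4, n_p)  # at least four samples per year
F = 7.17
m_d, m = years_of_sampling(sigma_hat=0.43, F=F, cs=0.5, mu1=0.3,
                           alpha=0.01, beta=0.20)
print(round(m_d, 2), m, m * n)  # preliminary years, years, total samples
```

With the Box 8.2 inputs this gives about 8.47 years, rounded up to 9 years and 81 samples per well.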
8.2.2 Sample Size for Testing Proportions

The testing of proportions is similar to the testing of means in that the
average coded observation (e.g., the proportion of samples for which the cleanup standard
has been exceeded) is compared to a specified proportion. The method for determining
sample size described below works well when there is a low correlation between
observations and no or small seasonal patterns in the data. If the correlation between monthly
observations is high or there are large seasonal changes in the measurements, then
consultation with a statistician is recommended. If the parameter to be tested is the proportion of
contaminated samples from either one well or an array of wells, one can determine the
sample size for a fixed sample size test using the procedures in Box 8.3. These procedures
for determining sample size require the specification of the following quantities: α, β, P0,
and P1 (see Section 3.7 and Section 5.4.1). In general, many samples are required when
testing small proportions.
Box 8.3
Determining Sample Size for Testing Proportions

(1) Compute the estimates of σ and φ which describe the measurements
(not the coded values). Denote these estimates by σ̂ and φ̂_m.

Let $ » Y?, ($ is the estimated correlation between the coded
observations).

(2) Estimate the ratio of the annual overhead cost of maintaining
sampling operations at the site to the unit cost of collecting,
processing and analyzing one ground water sample. Call this ratio $R.

(3) Based on the values of $R and φ̂, use Table A.4 to determine the
approximate number, n_p, of samples to collect per year or seasonal
period. Based on site-specific considerations, the value n_p may be
modified to a number which is administratively convenient.

(5) For given values of n and φ̂, determine a "variance factor" from
Table A.5. Denote this factor by F.

(6) For given values of F, α, β, P0, and P1, a preliminary
estimate of the number of years to sample is
P0 - P,
where z_{1-β} and z_{1-α} are critical values from the normal distribution
associated with probabilities of 1-β and 1-α (Appendix Table A.2).
If aid is less than -^-, use mj » ^- instead. Equation (8.2) is an
adaptation of (8.1), using equation (5.25) of Chapter 5.

(7) The number of years of data will be denoted by m, and will be
determined by rounding m_d to the next highest integer. The total
number N of samples per well will be N = nm.

8.2.3 An Alternative Method for Determining Maximum Sampling
Frequency

The maximum sampling frequency can be determined using the hydrogeo-
logic parameters of ground water wells. The Darcy equation (Box 8.4) using the hydraulic
conductivity, hydraulic gradient, and effective porosity of the aquifer, can be used to
determine the horizontal component of the average linear velocity of ground water. This
method is useful for determining the sampling frequency that allows sufficient time to pass
between sampling events to ensure, to the greatest extent technically feasible, that there is a
complete exchange of the water in the sampling well between collection of water samples.
Although samples collected at the maximum sampling frequency may be independent in the
physical sense, statistical independence is unlikely. Other factors such as the effect of
contamination history, remediation, and seasonal influences can also result in correlations
over time periods greater than that required to flush the well. As a result, we recommend
that the sampling frequency be less than the maximum frequency based on Darcy's
equation. Use of the maximum frequency can be approached only if estimated correlations
based on ground-water samples are close to zero and the cost ratio, $R, is high. A detailed
discussion of the hydrogeologic components of this procedure is beyond the scope of this
document. For further information refer to Practical Guide for Ground-Water Sampling
(Barcelona et al., 1985) or Statistical Analysis of Ground-Water Monitoring Data at RCRA
Facilities (U.S. EPA, 1989b).
Box 8.4
Choosing a Sampling Interval Using the Darcy Equation
The sampling frequency can be based on estimates using the average linear
velocity of ground water. The Darcy equation relates ground water velocity
(V) to effective porosity (Ne), hydraulic gradient (i), and hydraulic
conductivity (k):

\[ V = \frac{k \, i}{N_e} \qquad (8.3) \]

The values for k, i, and Ne can be determined from a well's hydrogeologic
characteristics. The time required for ground water to pass through the well
diameter can be determined by dividing the monitoring well diameter by the
average linear velocity of ground water (V). This value represents the
minimum time interval required between sampling events which might yield
an independent ground water sample.
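
As a rough illustration of Box 8.4, the minimum interval between sampling events can be computed from the well diameter and the average linear velocity. The sketch below assumes consistent units (lengths in meters, time in days); the function name and all input values are hypothetical and are not taken from this document.

```python
def min_sampling_interval(k, i, n_e, well_diameter):
    """Minimum time between sampling events based on equation (8.3).

    k             hydraulic conductivity (length per unit time)
    i             hydraulic gradient (dimensionless)
    n_e           effective porosity (dimensionless fraction)
    well_diameter monitoring well diameter (same length unit as k)
    """
    velocity = k * i / n_e           # average linear velocity, V = k*i/Ne
    return well_diameter / velocity  # time for water to cross the well


# Hypothetical example: k = 1.0 m/day, i = 0.005, Ne = 0.25, 0.1 m well.
interval_days = min_sampling_interval(k=1.0, i=0.005, n_e=0.25,
                                      well_diameter=0.1)
print(f"{interval_days:.1f} days")
```

For these hypothetical inputs the velocity is 0.02 m/day and the interval is about 5 days. As the text notes, the statistically appropriate interval will usually be longer than this physical minimum.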

8.3 Assessing Attainment of the Mean Using Yearly Averages

When using yearly averages for the analysis, the effects of serial correlation
can generally be ignored (except for extreme conditions unlikely to be encountered in
ground water). For the procedures discussed in this section, the variance of the observed
yearly averages is used to estimate the variance of the overall average concentration. First,
data are collected using the guidelines indicated in Chapter 4. Values recorded below the
detection limit should be recorded according to the procedures in Section 2.3.7. Wells can
be tested individually or a group of wells can be tested jointly. In the latter case, the data
for the individual wells at each point in time are used to produce a summary measure (e.g.,
the mean or maximum) for the group as a whole.

Two calculation procedures for assessing attainment are described below.
Both procedures use the yearly average concentrations. The first is based on the assump-
tion that the yearly averages can be described by a (symmetric) normal distribution. This is
based on a standard t-test described in many statistics books. The second procedure uses
the log transformed yearly averages and is based on the assumption that the distribution of
the yearly averages can be described by a (skewed) lognormal distribution. Because the
second procedure performs well even when the data have a symmetric distribution, the
second method is recommended in most situations. Only when there are missing data
values for which the sampling dates are not evenly distributed throughout the year and there
is also an apparent seasonal pattern in the data is the first procedure recommended.

The calculations and procedures when using the untransformed yearly
averages are described below and summarized in Box 8.5. This procedure is appropriate in
all situations but is not preferred, particularly if the data are highly skewed. The
calculations can be used (with some minor loss in efficiency) if some observations are missing.
If the proportion of missing observations varies considerably from season to season and
there are differences in the average measurements among seasons, consultation with a
statistician is recommended. If there are few missing values and the data are highly
skewed, the procedures described in Box 8.12 which use the log transformed yearly
averages are recommended.
Box 8.5
Steps for Assessing Attainment Using Yearly Averages
(1) Calculate the yearly averages (see Box 8.6)
(2) Calculate the mean, x̄_m, and variance, s_x̄², of the yearly averages
(see Box 8.7)
(3) If there are no missing observations, set
x̄ = x̄_m (8.4)
Otherwise, if there are missing observations, calculate the seasonal
averages and the mean of the seasonal averages, x̄_ms (Box 8.8),
and set
x̄ = x̄_ms (8.5)
where x̄ is the mean to be compared to the cleanup standard.
(4) Calculate the upper 1-α percent one-sided confidence interval for the
mean, x̄ (Box 8.9).
(5) Decide whether the ground water attains the cleanup standards
(Box 8.10).
Use the formulas in Box 8.6 for calculating the yearly averages. If there are
missing observations within a year, average the non-missing observations. Using the
yearly averages for the statistical analysis, calculate the mean and variance of the yearly
averages using the equations in Box 8.7. The variance will have degrees of freedom equal
to one less than the number of years over which the data was collected.

If there are no missing observations, the mean of the yearly averages, x̄_m,
will be compared to the cleanup standard for assessing attainment. If, however, there are
missing observations, the mean of the yearly averages may provide a biased estimate of the
average concentration during the sample period. This will be true if the missing
observations occur mostly at times when the concentrations are generally higher or lower than
throughout most of the year. To correct for this bias, the average of the seasonal averages
will be compared to the cleanup standard when there are missing observations. Box 8.8
provides equations for calculating the seasonal averages and x̄_ms, the mean of the seasonal
averages. Using x̄ to designate the mean which is to be compared to the cleanup standard,
set x̄ = x̄_m if there are no missing observations; otherwise set x̄ = x̄_ms.
Box 8.6
Calculation of the Yearly Averages

Let x_jk = the measurements from an individual well or a combined measure
from a group of wells obtained for testing whether the mean attains the
cleanup standard; x_jk represents the concentration for season j (the jth
sample collection time out of n) in year k (where data is collected for m
years).

For each year, the yearly average is the average of all of the observations
taken within the year. If the results for one or more sample times within a
year are missing, calculate the average of the non-missing observations.
If there are n_k (n_k ≤ n) non-missing observations in year k, the yearly
average, x̄_k, is:

\[ \bar{x}_k = \frac{1}{n_k} \sum_{j} x_{jk} \qquad (8.6) \]

where the summation is over all non-missing observations within the year.
Box 8.7
Calculation of the Mean and Variance of the Yearly Averages

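The mean of the yearly averages, x̄_m, is:

\[ \bar{x}_m = \frac{1}{m} \sum_{k=1}^{m} \bar{x}_k \qquad (8.7) \]

where x̄_k is the yearly average for year k and the summation covers m years.
The variance of the yearly averages, s_x̄², can be calculated using either of the
two equivalent equations below:

\[ s_{\bar{x}}^2 = \frac{\sum_{k=1}^{m} (\bar{x}_k - \bar{x}_m)^2}{m-1} = \frac{\sum_{k=1}^{m} \bar{x}_k^2 - \frac{1}{m} \left( \sum_{k=1}^{m} \bar{x}_k \right)^2}{m-1} \qquad (8.8) \]

This variance estimate has m-1 degrees of freedom.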
Box 8.8
Calculation of Seasonal Averages and the Mean of the Seasonal Averages
For the n sample collection times within the year, the jth seasonal average is
the average of all the measurements taken at the jth collection time. If there
is a missing observation, the measurement from the jth sample collection
time may be different from the jth sequential measurement within the year.
Note that observations below the detection limits should be replaced by the
detection limit and are not counted as missing observations.
For all collection times j, from 1 to n, within each year, calculate the
seasonal average, x̄_j, where the number of observations at the jth collection
time is m_j ≤ m. If there are missing observations, sum over the m_j
non-missing observations.

\[ \bar{x}_j = \frac{1}{m_j} \sum_{k} x_{jk} \qquad (8.9) \]

The mean of the n seasonal averages is:

\[ \bar{x}_{ms} = \frac{1}{n} \sum_{j=1}^{n} \bar{x}_j \qquad (8.10) \]
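
The calculations in Boxes 8.6 through 8.8 can be sketched as follows. This is a minimal illustration, not part of the original procedure: the data are hypothetical, missing observations are represented by `None`, and the function names are invented for the example. Each inner list holds the n seasonal measurements for one year.

```python
from statistics import mean, variance


def yearly_averages(data):
    """Average the non-missing observations within each year (Box 8.6)."""
    return [mean(x for x in year if x is not None) for year in data]


def seasonal_averages(data):
    """Average the non-missing observations at each collection time (Box 8.8)."""
    n = len(data[0])
    return [mean(year[j] for year in data if year[j] is not None)
            for j in range(n)]


# Hypothetical quarterly data for three years; one observation is missing.
data = [
    [6.4, 5.9, 4.5, 5.6],
    [7.2, None, 4.9, 5.5],
    [6.6, 5.7, 5.3, 5.9],
]

x_k = yearly_averages(data)
x_m = mean(x_k)                        # mean of the yearly averages
s2_xbar = variance(x_k)                # variance with m-1 degrees of freedom
x_ms = mean(seasonal_averages(data))   # mean of the seasonal averages

# Use the mean of the seasonal averages when any observation is missing.
has_missing = any(x is None for year in data for x in year)
x_bar = x_ms if has_missing else x_m
print(round(x_m, 3), round(x_ms, 3), round(x_bar, 3))
```

Because the missing value falls in a single quarter, the mean of the seasonal averages differs slightly from the mean of the yearly averages, which is the bias that the procedure in Box 8.5 is designed to correct.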
Using the mean which is to be compared to the cleanup standard, x̄, and the
standard deviation of the mean calculated from the yearly averages, calculate the upper
one-sided 1-α percent confidence interval for the mean using equation (8.11) in Box 8.9. The
standard deviation is the square root of the variance of the yearly averages calculated in
Box 8.7. Calculation of the upper confidence interval requires use of α, specified in the
attainment objectives, and the degrees of freedom for the standard deviation, the number of
years of data minus one, to determine the relevant t-statistic from Table A.1 in Appendix A.
If the lower one-sided confidence limit is desired, replace the plus sign in equation (8.11)
with a minus sign.

Finally, if the upper one-sided confidence interval is less than the cleanup
standard and if the concentrations are not increasing over time, decide that the tested ground
water attains the cleanup standard. If the ground water from all wells or groups of wells
attains the cleanup standard then conclude that the ground water at the site attains the
cleanup standard. The steps in deciding attainment of the cleanup standard are shown in
Box 8.10.
Box 8.9
Calculation of Upper One-sided Confidence Limit for the Mean

The upper one-sided confidence limit is:

\[ \mu_{U\alpha} = \bar{x} + t_{1-\alpha,m-1} \frac{s_{\bar{x}}}{\sqrt{m}} \qquad (8.11) \]

where x̄ is the mean level of contamination, and s_x̄ is the square root of the
variance of the yearly means. The degrees of freedom associated with s_x̄ is
m-1, and the appropriate value of t_{1-α,m-1} can be obtained from Table A.1.
Box 8.10
Deciding if the Tested Ground Water Attains the Cleanup Standard
If the upper one-sided confidence limit μ_Uα < Cs, conclude that the average ground water
concentration in the well (or group of wells) attains the cleanup standard.

If the average ground water concentration in the wells is less than the
cleanup standard, perform a trend test using the regression techniques
described in Chapter 6 to determine if there is a statistically significant
increasing trend to the yearly averages over the sampling period (also see
Section 8.6). Note that at least 3 years' worth of data are required to iden-
tify a trend. If there is not a statistically significant increasing trend
conclude that the ground water attains the cleanup standard (and possibly
initiate a follow-up monitoring program). If a significant trend does exist,
resume sampling or reconsider treatment effectiveness.

If the upper one-sided confidence limit μ_Uα ≥ Cs, conclude that the average ground water
concentration in the wells does not attain the cleanup standard.
When the data are noticeably skewed, the calculation procedures in Box
8.12 (using the log transformed yearly averages) are recommended over those in Box 8.5.
Because the procedures in Box 8.12 also perform well when the data have a symmetric
distribution, these procedures are generally recommended in all cases where there are no
missing data. There is no easy adjustment for missing data when using the log transformed
yearly averages. Therefore, if the number of observations per season (month etc.) is not
the same for all seasons and if there is any seasonal pattern in the data, use of the
procedures in Box 8.5 is recommended.
Box 8.11
Example of Assessing Attainment of the Mean Using Yearly Averages

To test whether the cleanup standard (Cs = 0.50 ppm) has been attained for a
particular chemical, 48 ground water samples were collected for four years
at monthly intervals. All 48 ground water samples were collected and
analyzed, and three values which were below the detection level were
replaced in the analysis by the detection limit. Based on the sample data, the
overall mean concentration was determined to be .330 ppm. The
corresponding yearly means were computed as: x̄_1 = .31; x̄_2 = .32; x̄_3 = .34;
and x̄_4 = .35. The variance of the yearly means is s_x̄² = .000333.

The one-sided 99 percent confidence interval extends from zero to

\[ \bar{x} + t_{.99,3} \frac{s_{\bar{x}}}{\sqrt{m}} = .330 + 4.541 \sqrt{\frac{.000333}{4}} = .371 \]

Since the cleanup standard is Cs = 0.5 ppm, the average is significantly less
than the cleanup standard. However, the yearly averages are consistently
increasing and regression analysis indicates that the trend is statistically
significant at the 5 percent level (p = .0101). Therefore, it cannot be
concluded that the attainment objectives have been achieved. If the present
trend continues, the concentrations would exceed the cleanup standard in
about 10 years. Possible options include continued monitoring to determine
if the trend will continue, or reassessing the treatment effectiveness and why
the upward trend exists.
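
The confidence limit in the example above can be checked with a short Python sketch. It assumes the upper limit has the standard one-sample t form, x̄ + t·s_x̄/√m, with the t value (4.541 for a one-sided 99 percent limit with 3 degrees of freedom) taken from a t table rather than computed. The function name is hypothetical.

```python
import math
from statistics import mean, variance


def upper_confidence_limit(yearly_means, t_value):
    """Upper one-sided confidence limit from yearly averages.

    Assumes the form x_bar + t * sqrt(s2 / m), where s2 is the
    variance of the yearly averages with m-1 degrees of freedom.
    """
    m = len(yearly_means)
    return mean(yearly_means) + t_value * math.sqrt(variance(yearly_means) / m)


yearly_means = [0.31, 0.32, 0.34, 0.35]  # from the example above
t_99_3df = 4.541                          # t table: 1-alpha = .99, 3 df
cs = 0.5                                  # cleanup standard, ppm

ucl = upper_confidence_limit(yearly_means, t_99_3df)
mean_below_standard = ucl < cs
print(round(ucl, 3), mean_below_standard)
```

The limit is about .371, which is below the cleanup standard. As the example stresses, this result alone is not sufficient: the trend test must also show no significant increase before attainment is concluded.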
The calculations when using the log transformed yearly averages are slightly
more difficult than when using the untransformed yearly averages. After calculating the
yearly averages, the natural log is used to transform the data. The transformed averages are
then used in the subsequent analysis. The upper confidence interval for the mean
concentration is based on the mean and variance of the log transformed yearly averages. The
formulas are based on the assumption that the yearly averages have a lognormal
distribution.
Box 8.12
Steps for Assessing Attainment Using the Log Transformed Yearly
Averages
(1) Calculate the yearly averages (see Box 8.6)
(2) Calculate the natural log of the yearly averages (see Box 8.13)
(3) Calculate the mean, ȳ_m, and variance, s_y², of the log transformed
yearly averages (see Box 8.14)
(4) Calculate the upper 1-α percent one-sided confidence interval for the
overall mean (Box 8.15)
(5) Decide whether the ground water attains the cleanup standards
(Box 8.10).
Use the formulas in Box 8.6 for calculating the yearly averages. If there are
missing observations within a year, average the non-missing observations. Calculate the
log transformed yearly averages using equation (8.12) in Box 8.13. The natural log trans-
formation is available on many calculators and computers, usually designated as "LN",
"In", or "loge." Although the equations could be changed to use the base 10 logarithms,
use only the base e logarithms when using the equations in Boxes 8.13 through 8.15.
Calculate the mean and variance of the log transformed yearly averages using the equations
in Box 8.14. The variance will have degrees of freedom equal to one less than the number
of years over which the data was collected.
Box 8.13
Calculation of the Natural Logs of the Yearly Averages
The natural log of the yearly average is:

\[ y_k = \ln(\bar{x}_k) \qquad (8.12) \]
Box 8.14
Calculation of Mean and Variance of the Natural Logs of the Yearly Averages
The average of the m log transformed yearly averages, ȳ_m:

\[ \bar{y}_m = \frac{1}{m} \sum_{k=1}^{m} y_k \qquad (8.13) \]

The variance of the log transformed yearly averages, s_y²:

\[ s_y^2 = \frac{\sum_{k=1}^{m} (y_k - \bar{y}_m)^2}{m-1} = \frac{\sum_{k=1}^{m} y_k^2 - \frac{1}{m} \left( \sum_{k=1}^{m} y_k \right)^2}{m-1} \qquad (8.14) \]

This variance estimate has m-1 degrees of freedom.
Calculate the upper one-sided 1-α percent confidence interval for the mean
using equation (8.15) in Box 8.15. Calculation of the upper confidence interval requires use
of α, specified in the attainment objectives, and the degrees of freedom for the standard
deviation, the number of years of data minus one, to determine the relevant t-statistic from
Table A.1 in Appendix A. If the lower one-sided confidence limit is desired, replace the
second plus sign in equation (8.15) with a minus sign.

Finally, if the upper one-sided confidence interval is less than the cleanup
standard and if the log transformed concentrations are not increasing over time, decide that
the tested ground water attains the cleanup standard. If the ground water from all wells or
groups of wells attains the cleanup standard then conclude that the ground water at the site
attains the cleanup standard. The steps in deciding attainment of the cleanup standard are
shown in Box 8.10.
Box 8.15
Calculation of the Upper Confidence Limit for the Mean Based on Log
Transformed Yearly Averages
The upper one-sided confidence limit for the mean is:

\[ \mu_{U\alpha} = \exp\left( \bar{y}_m + \frac{s_y^2}{2} + t_{1-\alpha,m-1} \sqrt{\frac{s_y^2}{m} + \frac{s_y^4}{2(m-1)}} \right) \qquad (8.15) \]

where the degrees of freedom (Df) associated with s_y is m-1, and the
appropriate value of t_{1-α,m-1} can be obtained from Table A.1. The term
under the square root is the variance of ȳ_m + s_y²/2 and was calculated from the
variance of the two terms, which are independent if the data have a
lognormal distribution.
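
The log transformed procedure (Boxes 8.12 through 8.15) can be sketched in Python as below. The sketch assumes the upper limit has the form exp(ȳ_m + s_y²/2 + t·√(s_y²/m + s_y⁴/(2(m-1)))), which matches the description of the term under the square root as the variance of ȳ_m + s_y²/2. The yearly means are borrowed from Box 8.11 purely for illustration, the t value is a table look-up, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def lognormal_upper_limit(yearly_means, t_value):
    """Upper one-sided confidence limit based on log transformed yearly averages.

    Assumes the form of equation (8.15) stated in the lead-in above.
    """
    y = [math.log(x) for x in yearly_means]  # natural logs (Box 8.13)
    m = len(y)
    y_m = mean(y)                            # mean of the logs (8.13)
    s2_y = variance(y)                       # variance of the logs (8.14)
    spread = math.sqrt(s2_y / m + s2_y ** 2 / (2 * (m - 1)))
    return math.exp(y_m + s2_y / 2 + t_value * spread)


yearly_means = [0.31, 0.32, 0.34, 0.35]  # illustrative, from Box 8.11
t_99_3df = 4.541                          # t table: 1-alpha = .99, 3 df

ucl_log = lognormal_upper_limit(yearly_means, t_99_3df)
print(round(ucl_log, 3))
```

For these nearly symmetric data the result is close to the limit obtained from the untransformed yearly averages, which is consistent with the statement that the log based procedure also performs well for symmetric data.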
8.4 Assessing Attainment of the Mean After Adjusting for Seasonal
Variation
This section provides an alternative procedure for testing the mean concen-
tration. It is expected to provide more accurate results with large sample sizes, correlated
data, and data which is not skewed. Because this procedure is sensitive to skewed data, it
is recommended only if the distribution of the residuals is reasonably symmetric.

After the data have been collected using the guidelines indicated in
Chapter 4, wells can be tested individually or a group of wells can be tested jointly. In
the latter case, the data for the individual wells at each point in time are used to produce a
summary measure for the group as a whole. This summary measure may be an average,
maximum, or some other measure (see Section 2.3.5). These summary measures will be
averaged over the entire sampling period. The tests for attainment and the corresponding
calculations required when removing seasonal averages are described next.

The calculations and procedures when using the mean adjusted for seasonal
variation are described below and summarized in Box 8.16. This procedure is not
recommended if the data are noticeably skewed. The following calculations and procedures are
appropriate if the number of observations per year is the same for all years. However, they
can still be used (with some minor loss in efficiency) if a few observations are lost as long
as the loss is not concentrated in a particular season (note example in Section 8.3). If the
proportion of observations varies considerably from season to season, consultation
with a statistician is recommended. If the data are obviously skewed, the procedures
described in Box 8.15 which use the log transformed yearly averages are recommended.
Box 8.16
Steps for Assessing Attainment Using the Mean After Adjusting for
Seasonal Variation

(1) Calculate the seasonal averages and the mean of the seasonal
averages, x̄_ms (Box 8.8)

(2) Calculate the deviations from the seasonal averages (residuals) (Box
8.17)

(3) Calculate the variance, s², of the residuals (see Box 8.18)

(4) Calculate the lag 1 serial correlation of the residuals using equation
(8.18) in Box 8.19. Denote the computed serial correlation by φ̂_obs.

(5) Calculate the upper 1-α percent one-sided confidence interval for the
mean, x̄ (Box 8.20)

(6) Decide whether the ground water attains the cleanup standards
(Box 8.10).
Use the formulas in Box 8.8 for calculating the seasonal averages and the
mean of the seasonal averages. If there are missing observations within a season, average
the non-missing observations. Calculate the residuals, the deviations of the measurements
from the respective seasonal means, using equation (8.16) in Box 8.17. Box 8.18 shows
how to calculate the variance of the residuals. The variance will have degrees of freedom
equal to the number of measurements less the number of seasons. Calculate the serial
correlation of the residuals using equation (8.18) in Box 8.19. If the serial correlation is
less than zero, use zero when calculating the confidence interval.
Box 8.17
Calculation of the Residuals

From each sample observation, subtract the corresponding seasonal mean.
That is, compute e_jk, the deviation from the seasonal mean:

\[ e_{jk} = x_{jk} - \bar{x}_j \qquad (8.16) \]
Box 8.18
Calculation of the Variance of the Residuals

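Calculate the variance of the residuals e_jk after adjustments for possible
seasonal differences:

\[ s^2 = \frac{\sum_{k=1}^{m} \sum_{j=1}^{n} e_{jk}^2}{N - n} \qquad (8.17) \]

where N is the total number of observations and n is the number of seasons.
Alternatively, the ANOVA approach described in Appendix D can be used
to compute the required variance.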
Box 8.19
Calculating the Serial correlation from the Residuals After Removing
Seasonal Averages

The sample estimate of the serial correlation of the residuals is:

N
(8-18)
where e_i, i = 1, 2, ..., N, are the residuals after removing seasonal averages,
in the time order in which the samples were collected.

Using the mean of the seasonal averages and the standard deviation of the
mean, calculated from the residuals, calculate the upper one-sided 1-α percent confidence
interval for the mean using equation (8.19) in Box 8.20. The standard deviation is the
square root of the variance calculated from equation (8.17). If the observed serial
correlation is less than zero, use zero in equation (8.19). Calculation of the upper confidence
interval requires use of α, specified in the attainment objectives, and the degrees of
freedom for the standard deviation (Df, see Box 8.20), to determine the
relevant t-statistic from Table A.1 in Appendix A. If the lower one-sided confidence limit
is desired, replace the plus sign in equation (8.19) with a minus sign.
Box 8.20
Calculation of the Upper Confidence Limit for the Mean After Adjusting for Seasonal Variation

Calculation of the Upper One-Sided Confidence Limit

\[ \mu_{U\alpha} = \bar{x} + t_{1-\alpha,Df} \, \frac{s}{\sqrt{N}} \sqrt{\frac{1 + \hat{\phi}_{obs}}{1 - \hat{\phi}_{obs}}} \qquad (8.19) \]

where x̄ is the computed mean level of contamination computed from
equation (8.10) in Box 8.8, and s is the square root of the variance of the
observations taking into account possible seasonal variation as computed from
equation (8.17). The degrees of freedom, Df, associated with s is Df = (N-n)/3, and the
appropriate value of t_{1-α,Df} can be obtained from Table A.1. If φ̂_obs is less
than zero, set φ̂_obs to zero. For the derivation of the term under the square
root, see Appendix F.
Box 8.21
Example Calculation of Confidence Intervals
Table 8.1 and Figure 8.3 show hypothetical arsenic measurements for
ground water samples taken at quarterly intervals for four years. For these
data, the four seasonal (quarterly) means are: x̄_1 = 6.688; x̄_2 = 6.013; x̄_3
= 5.078; and x̄_4 = 5.878, and the overall mean is x̄ = 5.914 ppb. The
adjusted arsenic measurements labeled "residuals," shown in the last
column of the table, are obtained by subtracting the seasonal means from the
original observations.
The estimated variance of the data, taking into account possible seasonal
differences, is s² = .163 (equation (8.17)) with 4 (i.e., (16-4)/3)
degrees of freedom, and the corresponding autocorrelation is φ̂_obs = .37
(eq. 8.18).
The upper one-sided 90 percent confidence interval extends from zero to

\[ 5.914 + 1.533 \sqrt{\frac{.163}{16} \cdot \frac{1 + .37}{1 - .37}} = 6.142 \text{ ppb.} \]

If the cleanup standard were 6 ppb, it would be concluded that the ground
water has not attained the cleanup standard.
Figure 8.3 Plot of Arsenic Measurements for 16 Ground Water Samples (see Box
8.21)
[Figure: Arsenic Measurements: 1984-1987. Vertical axis: arsenic, 0.00 to 8.00; horizontal axis: time in quarters, 1 to 16.]

Table 8.1 Arsenic measurements (ppb) for 16 ground water samples (see Box 8.21)

Year    Quarter    Arsenic Measurement    Residual
1984       1              6.40              -.288
1984       2              5.91              -.103
1984       3              4.51              -.568
1984       4              5.57              -.308
1985       1              7.21               .522
1985       2              6.19               .177
1985       3              4.89              -.188
1985       4              5.51              -.368
1986       1              6.57              -.118
1986       2              5.70              -.313
1986       3              5.32               .242
1986       4              5.87              -.008
1987       1              6.57              -.118
1987       2              6.25               .237
1987       3              5.59               .512
1987       4              6.56               .682
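
The Box 8.21 calculation can be reproduced from the Table 8.1 data with the sketch below. It assumes the residual variance is the sum of squared residuals divided by N - n, and that the upper limit has the form x̄ + t·√((s²/N)(1 + φ̂_obs)/(1 - φ̂_obs)); both forms reproduce the values reported in Box 8.21. The serial correlation (.37) and the t value (1.533, one-sided 90 percent with 4 degrees of freedom) are taken directly from Box 8.21 and a t table rather than computed here.

```python
import math

# Table 8.1: rows are years 1984-1987, columns are quarters 1-4 (ppb).
arsenic = [
    [6.40, 5.91, 4.51, 5.57],
    [7.21, 6.19, 4.89, 5.51],
    [6.57, 5.70, 5.32, 5.87],
    [6.57, 6.25, 5.59, 6.56],
]
m = len(arsenic)      # number of years
n = len(arsenic[0])   # number of seasons per year
N = m * n             # total number of observations

# Seasonal averages and their mean (Box 8.8).
seasonal_means = [sum(year[j] for year in arsenic) / m for j in range(n)]
x_bar = sum(seasonal_means) / n

# Residuals in time order (Box 8.17) and their variance (Box 8.18).
residuals = [year[j] - seasonal_means[j] for year in arsenic for j in range(n)]
s2 = sum(e * e for e in residuals) / (N - n)

phi_obs = 0.37   # serial correlation as reported in Box 8.21
t_value = 1.533  # t table: 1-alpha = .90, 4 degrees of freedom
phi = max(phi_obs, 0.0)  # negative correlations are set to zero

ucl = x_bar + t_value * math.sqrt(s2 / N * (1 + phi) / (1 - phi))
print(round(x_bar, 3), round(s2, 3), round(ucl, 3))
```

The computed limit of about 6.142 ppb exceeds a cleanup standard of 6 ppb, matching the conclusion in Box 8.21.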
8.5 Fixed Sample Size Tests for Proportions

If the parameter to be tested is the proportion of contaminated samples from
either one well or an array of wells, the sample collection and analysis procedures are the
same as those outlined above for testing the mean with the following changes:

To apply this nonparametric test, each measurement is either coded
"1" (the actual measurement was equal to or above the relevant
cleanup standard Cs), or "0" (below Cs). The statistical analysis is
based on the resulting coded variable of 0's and 1's.

Only the analysis procedure which uses yearly averages, in Box 8.6,
is appropriate for the calculations. Do not use the calculation
procedures which correct for the seasonal pattern in the data and the serial
correlation of the residuals or which use the log transformed data.

See Section 8.2.2 for procedures for estimating the sample size.
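
The coding step described above can be illustrated with a short sketch. The measurements and the cleanup standard below are hypothetical, and the function name is invented for the example. Once the data are coded, the yearly proportions play the role of the yearly averages in the procedure of Section 8.3.

```python
from statistics import mean


def code_exceedances(measurements, cs):
    """Code each measurement 1 if it equals or exceeds cs, otherwise 0."""
    return [1 if x >= cs else 0 for x in measurements]


cs = 5.0  # hypothetical cleanup standard
measurements_by_year = [  # hypothetical quarterly measurements
    [4.2, 5.1, 3.9, 4.8],
    [4.0, 4.6, 5.0, 5.3],
    [3.8, 4.1, 4.4, 4.9],
]

coded_by_year = [code_exceedances(year, cs) for year in measurements_by_year]
# The yearly average of the coded values is the yearly proportion exceeding cs.
yearly_proportions = [mean(coded) for coded in coded_by_year]
print(yearly_proportions)
```

The resulting yearly proportions would then be compared with the specified proportion P0 using the yearly average calculations.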

8.6 Checking for Trends in Contaminant Levels After Attaining the
Cleanup Standard

Once a fixed sample size statistical test indicates that the cleanup standard
for the site has been met, there remains one final concern. The model we have used
assumes that ground water at the site has reached a steady state and that there is no reason
to believe that contaminant levels will rise above the cleanup standard in the future. We
need to check this assumption. Regression models, as discussed in Chapter 6, can be used
to do so. By establishing a simple regression model with the contaminant measure as the
dependent variable and time as the independent variable, a test of significance can be made
as to whether or not the estimated slope of the resulting linear model is positive (see Section
6.1.3). Scatter plots of the data will prove useful in assessing the model. When using the
yearly averages, the regression can be performed without adjusting for serial correlation.

To minimize the chance of incorrectly concluding that the concentrations are
increasing over time, we recommend that the alpha level for testing the slope (and selecting
the t statistic in Box 6.11) be set at a small value, such as 0.01 (one percent). If, on the
basis of the test, there is not significant evidence that the slope is positive, then the evidence
is consistent with the preliminary conclusion that the ground water in the well(s) attains the
cleanup standard. If the slope is significantly greater than zero, then the concern that
contaminant levels may later exceed the cleanup standard still exists and the assumption of a
steady state is called into question. In this case, further consideration must be given to the
reasons for this apparent increase and, perhaps, to additional remediation efforts.
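
A minimal sketch of the slope test on yearly averages follows. It assumes an ordinary least-squares fit of the yearly averages against the year index and a one-sided t test with m - 2 degrees of freedom. The critical value is read from a standard t table rather than computed, and the function name is hypothetical. The yearly averages are those of the Table 8.1 arsenic data.

```python
import math


def slope_t_statistic(yearly_averages):
    """Least-squares slope of yearly averages against year index, with its t statistic."""
    m = len(yearly_averages)
    years = list(range(1, m + 1))
    x_bar = sum(years) / m
    y_bar = sum(yearly_averages) / m
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, yearly_averages))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, yearly_averages))
    std_err = math.sqrt(sse / (m - 2) / sxx)  # residual variance has m - 2 degrees of freedom
    return slope, slope / std_err


# Yearly averages of the Table 8.1 arsenic data (1984-1987).
averages = [5.5975, 5.95, 5.865, 6.2425]
slope, t_stat = slope_t_statistic(averages)

# One-sided critical value for alpha = 0.01 with m - 2 = 2 degrees of freedom,
# taken from a standard t table.
T_CRITICAL = 6.965
increasing_trend = t_stat > T_CRITICAL
print(slope, t_stat, increasing_trend)
```

With only four yearly averages the test has very little power, which is one reason follow-up monitoring is still recommended when no significant slope is found.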

8.7 Summary

This chapter presented the procedures for assessing attainment of the
cleanup standards for ground water measurements using a fixed sample size test. The
testing procedures can be applied to samples from either individual wells or wells tested as
a group. These procedures are used after the ground water has achieved steady state. Both
parametric and nonparametric methods for evaluating attainment are discussed. If the
ground water at the site is judged to attain the cleanup standards because the concentrations
are not increasing and the long-term average is significantly less than the cleanup standard,
follow-up monitoring is recommended to check that the steady state assumption holds.
9. ASSESSING ATTAINMENT USING SEQUENTIAL
TESTS
After the remediation effort has been terminated and the ground water has
achieved steady state, ground water samples can be collected to determine whether the
resulting concentrations of contaminants attain the relevant cleanup standard. The
sampling and evaluation period for making this attainment decision is represented by the
unshaded portion in the figure below.
Figure 9.1 Example for Contaminant Measurements During Successful
[Figure: measured ground water concentration (vertical axis, 0 to 1.2) plotted against date (horizontal axis); events marked along the date axis include Start Treatment]
In this chapter statistical procedures are presented for assessing the attainment
of cleanup standards for ground water at Superfund sites using sequential statistical
tests. Note that attainment objectives, as discussed in Chapter 3, must be specified before
the sampling for assessing attainment begins.

The collection of samples for assessing attainment of the cleanup standards
will occur after the remedial action at the site has been completed and after a subsequent
period has passed to allow transient effects due to the remediation to dissipate. The
attainment decision is an assessment of whether the remaining contaminant concentrations are
acceptable compared to the cleanup standard and whether they are likely to remain
acceptable. To assess whether the contaminant concentrations are likely to remain acceptable, the
statistical procedures provide methods for determining whether or not a long-term average
concentration or a long-term percentage of the well water concentration measurements is
below the established cleanup standards. In particular, in the unlikely event that the
measurements are not serially correlated, the methods presented in Chapter 5, which assume
a random sample, can be used and consultation with a statistician is recommended. If
sequential tests are being considered, note that on the average, the sequential tests will
require fewer samples than the fixed sample size tests in Chapter 8 or, if applicable, those
in Chapter 5.

This chapter discusses assessing the attainment of cleanup standards using a
sequential statistical test. For a sequential test, the ground water samples are collected on a
regular schedule, such as every two months. Starting after the collection of three years of
data, a statistical test is performed every year to determine whether (1) the ground water
being sampled attains the cleanup standard, or (2) the ground water does not attain the
cleanup standard, or (3) more data are required to make a decision. If more data are
required, another year's worth of data is collected before the next statistical test is per-
formed. Figure 9.2 is a flow chart outlining the steps involved in the cleanup process
when using a sequential statistical test.

Unlike the fixed sample size test, the number of samples required to reach a
decision using the sequential test is not known at the beginning of the sampling period. On
the average, the sequential tests will require fewer samples and a correspondingly shorter
time to make the attainment decision than for the tests in Chapter 8. If the ground water
clearly attains the cleanup standard, the sequential test will almost always require fewer
samples than a fixed sample size test. Only when the contaminant concentrations are less
than the cleanup standard and greater than the mean for the alternate hypothesis might the
sequential test be likely to require more samples than the fixed sample size test.

This chapter presents statistical procedures for determining whether:
The mean concentration is below the cleanup standard; or
A selected percentile of all samples is below the cleanup standard
(e.g., does the 90th percentile of the distribution of concentrations
fall below the cleanup standard?).
Figure 9.2 Steps in the Cleanup Process When Using a Sequential Statistical Test
[Flow chart with the following boxes: Objectives; Treat the ground water; Wait for ground water to reach steady state; Specify Sample Design and Analysis Plan; Collect the Data for Two Years; Collect the Data for an Additional Year; Determine If the Ground Water in Wells Attains the Cleanup Standard; Is the Cleanup Standard Attained?; Do Concentrations Increase Over Time?; More Data is Required; Reassess Cleanup Technology]

The measured ground water concentrations may fluctuate over time due to
many factors including:

Seasonal and short-term weather patterns affecting the ground water
levels and flows;
Variation in ground water concentrations due to historical fluctuations
in the contamination introduced into the ground water; and
Sampling errors and laboratory measurement errors and fluctuations.

The effects of periodic seasonal fluctuations in concentration can be eliminated
from the analysis, resulting in a more precise statistical test, by either averaging the
measurements over a year or correcting for any seasonal patterns found in the data. These
two statistical analysis procedures are presented in Sections 9.3 and 9.4, respectively. The
method of using yearly averages is, in general, easier to implement and preferred.
Correcting for the seasonal pattern may provide more precise statistical tests in situations
where large correlations exist between measurements and when the measurement errors
have a symmetric distribution.

Three procedures are presented for testing the mean when using sequential
tests. The first and second procedures use yearly average concentrations. The first
method, based on the assumption that the yearly means have a normal distribution, is
recommended when there are missing values in the data and the missing values are not
distributed evenly throughout the year. The second procedure assumes that the distribution
of the yearly average is skewed, similar to a lognormal distribution, rather than symmetric.
If there are no missing values, the second method using the log transformed yearly
averages is recommended even if the data are not highly skewed. The third method
requires calculation of seasonal effects and serial correlations to determine the variance of
the mean. Because the third method is sensitive to the skewness of the data, it is recom-
mended only if the distribution of the residuals is reasonably symmetric. Regardless of the
procedure used, the sample frequency for assessing the mean should be determined using
the steps described in Section 9.1.

These sequential procedures are an adaptation of Wald's sequential probability
ratio test, specifically a version of the sequential t-test. They assume that the data are
normally distributed or can be made so by a log transformation. See Hall (1962), Hayre
(1983), and Appendix F for details.

9.1 Determining Sampling Frequency for Sequential Tests

The ground water samples will be collected at regular intervals using a
systematic sample with a random start as described in Chapter 4. An important part of
determining the sample collection procedures is to select the time interval between samples
or the number of samples to collect per seasonal period, usually per year. As discussed in
Chapter 8, the term "year" will be used to mean a full seasonal cycle, which in most cases
can be considered a calendar year.

The steps for determining sample frequency when testing the mean are
provided in Box 9.1 and are discussed in Section 8.2 in more detail. The procedures for
determining sample frequency require the specification of the serial correlation, \(\phi\), and the
measurement error, \(\sigma\), for the chemical under investigation. The procedures described in
Section 5.3 may be used to obtain rough estimates of the serial correlation. Denote these
estimates by \(\hat{\phi}\). An example of calculating sample frequency is presented in Box 9.3.
Box 9.1
Steps for Determining Sample Frequency for Testing the Mean

(1) Determine the estimates of \(\sigma\) and \(\phi\) which describe the data. Denote
these estimates by \(\hat{\sigma}\) and \(\hat{\phi}\).

(3) Based on the values of $R and \(\hat{\phi}\), use Appendix Table A.4 to determine
the approximate number, \(n_p\), of samples to collect per year or
seasonal period. The value \(n_p\) may be modified based on site-specific
considerations, as discussed in the text.

(4) The sampling frequency (i.e., the number of samples to be taken per
year) is \(n_p\) or 4, whichever is larger. Denote this sampling frequency
as n. Note that, under this rule, at least four samples per
year per sampling well will be collected.
The steps for determining sample frequency when testing a proportion are

provided in Box 9.2 and are discussed in Section 8.2 in more detail.
Box 9.2
Steps for Determining Sample Frequency for Testing a Proportion

(1) Compute the estimates of \(\sigma\) and \(\phi\) which describe the measurements
(not the coded values). Denote these estimates by \(\hat{\sigma}\) and \(\hat{\phi}_m\).

Let \(\hat{\phi} = \hat{\phi}_m / 2.5\) (\(\hat{\phi}\) is the estimated correlation between the coded
observations; the constant 2.5 was determined from simulations).

(2) Estimate the ratio of the annual overhead cost of maintaining
sampling operations at the site to the unit cost of collecting, process-
ing, and analyzing one ground water sample. Call this ratio $R.

Box 9.3

In Box 8.2, an example of determining the sample frequency is provided for
a fixed sample size test. The determination of the number of samples to be
taken per year is required for sequential sampling also. In that example, it
was found that \(n_p\) = 9, so that 9 samples per year (practically speaking,
once every 1.5 months) should be collected. This is all that is needed for
sequential sampling. Samples will then be collected until a decision can be
made. Note that in Box 8.2, a further calculation was done (computing \(m_d\))
to determine the number of years for which data are to be collected for the
fixed sample size approach. After this period of time (eight years in the
example) a statistical test would be made to determine whether the ground
water could be considered clean or not. On average, a sequential test will
require a shorter time period to reach a decision than a fixed sample size
test, but this is not guaranteed.

9.2 Sequential Procedures for Sample Collection and Data
Handling

The samples are assumed to be collected using a systematic sample as
discussed in Chapter 4.

The sample collection and analysis procedures require the following limita-
tions on the quantity and frequency of data collected:

To provide the minimal amount of data required for the statistical
tests, at least three years of data must be collected before any statisti-
cal test can be performed.
It is strongly recommended that at least four samples be collected in
each period or year to capture any seasonal differences or variation
within a year or period.
The statistical tests are performed only on data representing a
complete year of data collection. Thus, the first statistical test would
be performed after three full years of data collection, and the second
after four full years of data collection, etc.
If the proportion of contaminated samples is required to be below a
specified value of Pa collect at least a number of samples N' such
that N1*?^ before doing the first sequential test
Handling of outliers and measurements below the detection limit is dis-
cussed in Section 2.3.7.
9.3 Assessing Attainment of the Mean Using Yearly Averages

As noted in Chapter 8, the approach of using yearly averages substantially
reduces the effects of any serial correlation in the measurements. For the procedures
discussed in this section, the variance of the observed yearly averages is used to estimate
the variance of the overall average concentration. Wells can be tested individually or a
group of wells can be tested jointly. In the latter case, the data for the individual wells at
each point in time are used to produce a summary measure for the group as a whole. This
may be an average, a maximum, or some other measure for all data values collected at a
particular point in time (see Section 2.3.5). These summary measures will be averaged
over the yearly period.

Two calculation procedures for assessing attainment are described in this
section. Both procedures use the yearly average concentrations. The first is based on the
assumption that the yearly averages can be described by a symmetric normal distribution.
The second procedure uses the log transformed yearly averages and is based on the
assumption that the distribution of the yearly averages can be described by a (skewed)
lognormal distribution. Because the second procedure performs well even when the data
have a symmetric distribution, the second method is recommended in most situations.
Only when there are missing data values which are not evenly distributed throughout the
year and there is also an apparent seasonal pattern in the data is the first procedure recom-
mended.

The calculations and procedures when using the untransformed yearly
averages are described below and summarized in Box 9.4. This procedure is appropriate in
most situations but is not preferred, particularly if the data are highly skewed. The
calculations can be used (with some minor loss in efficiency) if some observations are missing. If
the proportion of missing observations varies considerably from season to season and there
are differences in the average measurements among seasons, consultation with a statistician
is recommended. If the data are highly skewed, the procedures described in Box 9.12
which use the log transformed yearly averages are recommended unless the data exhibit
both a seasonal pattern and missing observations.

Use the formulas in Box 9.5 for calculating the yearly averages for the m
years of data collected so far. If there are missing observations within a year, average the
non-missing observations. Calculate the mean and variance of the yearly averages using
the equations in Box 9.6. The variance will have degrees of freedom equal to m-1, one
less than the number of years over which the data were collected.

If there are no missing observations, the mean of the yearly averages, \(\bar{x}_m\),
will be compared to the cleanup standard for assessing attainment. If, however, there are
missing observations, the mean of the yearly averages may provide a biased estimate of the
average concentration during the sample period. This will be true if the missing
observations occur mostly at times when the concentrations are generally higher or lower than the
mean concentration. To correct for this bias, the mean of the seasonal averages will be
compared to the cleanup standard when there are missing observations. Box 9.7 provides
equations for calculating the seasonal averages and \(\bar{x}_{ms}\), the mean of the seasonal averages.
Using \(\bar{x}\) to designate the mean value which is to be compared to the cleanup standard, set
\(\bar{x} = \bar{x}_m\) if there are no missing observations; otherwise set \(\bar{x} = \bar{x}_{ms}\).
Box 9.4
Steps for Assessing Attainment Using Yearly Averages

(1) Calculate the yearly averages for the m years of data collected so far
(see Box 9.5).

(2) Calculate the mean, \(\bar{x}_m\), and variance, \(s_{\bar{x}}^2\), of the yearly averages
(see Box 9.6).

(3) If there are no missing observations, set

\[ \bar{x} = \bar{x}_m \qquad (9.1) \]

Otherwise, if there are missing observations, calculate the seasonal
averages and the mean of the seasonal averages, \(\bar{x}_{ms}\) (Box 9.7),
and set

\[ \bar{x} = \bar{x}_{ms} \qquad (9.2) \]

where \(\bar{x}\) is the mean to be compared to the cleanup standard.

(4) Calculate the t and \(\delta\) for the likelihood ratio (Box 9.8).

(5) Calculate the likelihood ratio for the statistical test (Box 9.9).

(6) Decide whether the ground water attains the cleanup standards
(Box 9.10).

(7) If more data are required, collect an additional year's samples and
repeat the procedures in this Box.
Box 9.5
Calculation of the Yearly Averages

Let \(x_{jk}\) be the measurements from an individual well or a combined measure
from a group of wells obtained for testing whether the mean attains the
cleanup standard. \(x_{jk}\) represents the concentration for season j (the jth
sample collection time out of n) in year k (where data have been collected for
m years).

The yearly average is the average of all of the observations taken within the
year. If the results for one or more sample times within a year are missing,
calculate the average of the non-missing observations. If there are
\(n_k\) non-missing observations in year k, the yearly average, \(\bar{x}_k\), is:

\[ \bar{x}_k = \frac{1}{n_k} \sum_{j} x_{jk} \qquad (9.3) \]

where the summation is over all non-missing observations within the year.
Calculate the yearly average for all m years.
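
The yearly-average calculation, including the handling of missing observations, might be sketched as follows. The function name is hypothetical, `None` is used here as an assumed marker for a missing observation, and the data are the Table 8.1 arsenic measurements.

```python
def yearly_average(year_measurements):
    """Average the non-missing measurements in one year; None marks a missing value."""
    present = [x for x in year_measurements if x is not None]
    if not present:
        raise ValueError("no non-missing observations in this year")
    return sum(present) / len(present)


# Rows are years k, columns are collection times j (Table 8.1 arsenic data, ppb).
data = [
    [6.40, 5.91, 4.51, 5.57],  # 1984
    [7.21, 6.19, 4.89, 5.51],  # 1985
    [6.57, 5.70, 5.32, 5.87],  # 1986
    [6.57, 6.25, 5.59, 6.56],  # 1987
]
yearly_averages = [yearly_average(row) for row in data]

# Hypothetical case: the third measurement of a year is missing.
partial = yearly_average([6.40, 5.91, None, 5.57])
print(yearly_averages, partial)
```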
Box 9.6
Calculation of the Mean and Variance of the Yearly Averages

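The mean of the m yearly averages, \(\bar{x}_m\), is:

\[ \bar{x}_m = \frac{1}{m} \sum_{k=1}^{m} \bar{x}_k \qquad (9.4) \]

where \(\bar{x}_k\) is the yearly average for year k.

The variance of the yearly averages, \(s_{\bar{x}}^2\), can be calculated using either of the
two equivalent formulas below:

\[ s_{\bar{x}}^2 = \frac{\sum_{k=1}^{m} (\bar{x}_k - \bar{x}_m)^2}{m-1} = \frac{\sum_{k=1}^{m} \bar{x}_k^2 - m\,\bar{x}_m^2}{m-1} \qquad (9.5) \]

This variance estimate has m-1 degrees of freedom.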
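
As a check on these formulas, the sketch below computes the mean and variance of the four yearly averages of the Table 8.1 arsenic data, using both forms of the variance. The names are illustrative; the results agree with the values 5.914 and .0706 used in Box 9.11.

```python
def mean_and_variance(yearly_averages):
    """Mean of the yearly averages and their variance with m - 1 degrees of freedom."""
    m = len(yearly_averages)
    mean = sum(yearly_averages) / m
    variance = sum((x - mean) ** 2 for x in yearly_averages) / (m - 1)
    return mean, variance


yearly_averages = [5.5975, 5.95, 5.865, 6.2425]  # 1984-1987, from Table 8.1
x_bar_m, s2 = mean_and_variance(yearly_averages)

# Equivalent computational form: (sum of squares - m * mean^2) / (m - 1).
m = len(yearly_averages)
s2_alt = (sum(x * x for x in yearly_averages) - m * x_bar_m ** 2) / (m - 1)
print(round(x_bar_m, 3), round(s2, 4))
```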
Box 9.7
Calculation of Seasonal Averages and the Mean of the Seasonal Averages
For the n sample collection times within the year, the jth seasonal average is
the average of all the measurements taken at the jth collection time. Note
that if there is a missing observation at one collection time, the measurement
from the jth sample collection time may be different than the jth sequential
measurement within the year.
For all collection times j, from 1 to n, within each year, calculate the
seasonal average, \(\bar{x}_{j\cdot}\). The number of observations at the jth collection time
is \(m_j \le m\). If there are missing observations, sum over the \(m_j\) non-missing
observations.

\[ \bar{x}_{j\cdot} = \frac{1}{m_j} \sum_{k} x_{jk} \qquad (9.6) \]

The mean of the n seasonal averages is:

\[ \bar{x}_{ms} = \frac{1}{n} \sum_{j=1}^{n} \bar{x}_{j\cdot} \qquad (9.7) \]

The total number of observations is:

\[ N = \sum_{j=1}^{n} m_j \qquad (9.8) \]
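
A short sketch of the seasonal-average calculation is given below. It is illustrative only: the function name is hypothetical, `None` is an assumed marker for a missing observation, and the data are the Table 8.1 arsenic measurements arranged by year (rows) and collection time (columns).

```python
def seasonal_averages(data):
    """Return the seasonal averages and the counts m_j of non-missing observations.

    data[k][j] is the measurement at collection time j in year k; None = missing.
    """
    n = len(data[0])
    averages, counts = [], []
    for j in range(n):
        present = [row[j] for row in data if row[j] is not None]
        counts.append(len(present))
        averages.append(sum(present) / len(present))
    return averages, counts


data = [
    [6.40, 5.91, 4.51, 5.57],  # 1984
    [7.21, 6.19, 4.89, 5.51],  # 1985
    [6.57, 5.70, 5.32, 5.87],  # 1986
    [6.57, 6.25, 5.59, 6.56],  # 1987
]
season_means, m_j = seasonal_averages(data)
x_bar_ms = sum(season_means) / len(season_means)  # mean of the seasonal averages
N = sum(m_j)                                      # total number of observations
print(season_means, round(x_bar_ms, 3), N)
```

With no missing observations the mean of the seasonal averages equals the mean of the yearly averages, as it does here.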
Using the mean \(\bar{x}\), and the standard deviation of the mean calculated from
the yearly averages, \(s_{\bar{x}}\), calculate t and \(\delta\) using equations (9.9) and (9.10) in Box 9.8.
These values are used in the calculation of the likelihood ratio. The standard deviation is
the square root of the variance calculated from equation (9.5). The t-statistic used here is
slightly different from that used in the standard t-test. Use of this definition of t makes
calculation of the likelihood ratio easier.

Use equation (9.11) in Box 9.9 to calculate the likelihood ratio for the
sequential test. This equation provides a good approximation to the actual likelihood ratio,
which is difficult to calculate exactly. For references and more details about this
approximation, see Appendix F.
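Box 9.8
Calculation of t and \(\delta\) When Using the Untransformed Yearly Averages

\[ t = \frac{\bar{x} - \dfrac{Cs + \mu_1}{2}}{\sqrt{s_{\bar{x}}^2 / m}} \qquad (9.9) \]

\[ \delta = \frac{\mu_1 - Cs}{\sqrt{s_{\bar{x}}^2 / m}} \qquad (9.10) \]

where \(\bar{x}\) is the mean level of contamination, and \(s_{\bar{x}}\) is the square root of the
variance of the yearly means. The degrees of freedom associated with \(s_{\bar{x}}^2\) is
m-1.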
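
The calculation of t and \(\delta\) can be sketched as below. This assumes equations (9.9) and (9.10) take the form t = (mean - (Cs + mu1)/2) / sqrt(s^2/m) and delta = (mu1 - Cs) / sqrt(s^2/m), a reading that reproduces the first set of values in Box 9.11. The function and argument names are hypothetical.

```python
import math


def t_and_delta(x_bar, s2_xbar, m, cs, mu1):
    """Compute t and delta for the sequential test from the yearly-average summary.

    x_bar   : mean to be compared with the cleanup standard
    s2_xbar : variance of the yearly averages (m - 1 degrees of freedom)
    m       : number of years of data collected so far
    cs, mu1 : cleanup standard and the mean under the alternative hypothesis
    """
    std_err = math.sqrt(s2_xbar / m)
    t = (x_bar - (cs + mu1) / 2.0) / std_err
    delta = (mu1 - cs) / std_err
    return t, delta


# Inputs from Box 9.11: mean 5.914, variance .0706, m = 4, Cs = 6, mu1 = 5.72.
t, delta = t_and_delta(x_bar=5.914, s2_xbar=0.0706, m=4, cs=6.0, mu1=5.72)
print(round(t, 3), round(delta, 3))
```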
Box 9.9
Calculation of the Likelihood Ratio for the Sequential Test
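The likelihood ratio is:

\[ LR = \exp\left( \delta \, \frac{m-2}{m} \, t \sqrt{\frac{m}{m - 1 + t^2}} \right) \qquad (9.11) \]

where m is the number of years of data collected so far and t and \(\delta\) are
calculated from the m years of data.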
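
A minimal sketch of the likelihood ratio calculation follows, assuming equation (9.11) has the form LR = exp(delta * ((m - 2)/m) * t * sqrt(m / (m - 1 + t^2))). Under that assumption the function reproduces, to rounding, both likelihood ratios worked out in Box 9.11. The function name is hypothetical.

```python
import math


def likelihood_ratio(t, delta, m):
    """Approximate likelihood ratio for the sequential t-test after m years."""
    return math.exp(delta * (m - 2) / m * t * math.sqrt(m / (m - 1 + t * t)))


# Rounded t and delta values from the two stages of the Box 9.11 example.
lr_first = likelihood_ratio(t=0.406, delta=-2.108, m=4)     # approximately 0.618
lr_later = likelihood_ratio(t=-0.933, delta=-2.902, m=11)   # approximately 9.29
print(lr_first, lr_later)
```

Because \(\delta\) is negative whenever the alternative mean lies below the cleanup standard, a negative t (sample mean below the midpoint of Cs and the alternative mean) gives a likelihood ratio above 1, and a positive t gives a ratio below 1.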
Finally, the likelihood ratio, \(\alpha\), and \(\beta\) are used to decide if the average
concentration is less than the cleanup standard. If the average is less than the cleanup
standard and if the concentrations are not increasing over time (see Section 9.7), conclude
that the tested ground water attains the cleanup standard. If the ground water from all wells
or groups of wells attains the cleanup standard then conclude that the ground water at the
site attains the cleanup standard. If the average concentration is not less than the cleanup
standard or if the concentrations are increasing over time, conclude that the ground water in
the well does not attain the cleanup standard. The steps in deciding attainment of the
cleanup standard are shown in Box 9.10.
Box 9.10
Deciding if the Tested Ground Water Attains the Cleanup Standard
Calculate:

\[ A = \frac{\beta}{1 - \alpha} \quad \text{and} \quad B = \frac{1 - \beta}{\alpha} \qquad (9.12) \]

If LR \(\le\) A, conclude that the ground water in the wells does not attain the
cleanup standard.

If LR > B, conclude that the average ground water concentration in the well
(or group of wells) is less than the cleanup standard. Perform a trend test
using the regression techniques described in Chapter 6 to determine if there
is a statistically significant increasing trend in the yearly averages over the
sampling period (also see Section 9.7).

If there is not a statistically significant increasing trend, conclude that the
ground water attains the cleanup standard (and possibly initiate a follow-up
monitoring program). If a significant trend does exist, conclude that the
ground water in the wells does not attain the cleanup standard and resume
sampling or reconsider treatment effectiveness.

If A < LR \(\le\) B, then collect an additional year's worth of data before performing
the hypothesis test again.
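
The three-way decision rule can be sketched as follows. This assumes the boundaries are the usual Wald limits A = beta/(1 - alpha) and B = (1 - beta)/alpha, which give A = .111 and B = 9.0 for alpha = beta = .1 as in Box 9.11. The function names and the wording of the returned labels are hypothetical.

```python
def decision_bounds(alpha, beta):
    """Return the lower and upper boundaries (A, B) for the likelihood ratio."""
    return beta / (1.0 - alpha), (1.0 - beta) / alpha


def sequential_decision(lr, alpha, beta):
    """Classify the likelihood ratio into one of the three outcomes of Box 9.10."""
    a, b = decision_bounds(alpha, beta)
    if lr <= a:
        return "does not attain"
    if lr > b:
        return "mean below standard; test for trend"
    return "collect another year of data"


a, b = decision_bounds(alpha=0.1, beta=0.1)
print(round(a, 3), round(b, 1))
print(sequential_decision(0.618, alpha=0.1, beta=0.1))
print(sequential_decision(9.29, alpha=0.1, beta=0.1))
```

A likelihood ratio above B is not yet a finding of attainment: the trend test on the yearly averages still has to show no significant increase.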
Box 9.11
Example Attainment Decision Based on a Sequential Test

In this example we will use the arsenic measurements appearing in Table
8.1. Suppose we wish to compare the cleanup standard (Cs = 6) with a
targeted cleanup average (\(\mu_1\)) of 5.72 (\(\mu_1\) is the value for which the false
negative rate \(\beta\) is to be controlled). Box 8.21 indicates the four yearly
means \(\bar{x}_k\) and the overall average \(\bar{x}_m\) = 5.914. Using equation (9.5), the
value of \(s_{\bar{x}}^2\) = .0706 for m = 4. Thus,

\[ t = \frac{5.914 - \dfrac{6 + 5.72}{2}}{\sqrt{\dfrac{.0706}{4}}} = .406 \quad \text{and} \quad \delta = \frac{5.72 - 6}{\sqrt{\dfrac{.0706}{4}}} = -2.108 \]

\[ LR = \exp\left[ -2.108 \cdot \frac{2}{4} \cdot (.406) \sqrt{\frac{4}{4 - 1 + (.406)^2}} \right] = 0.618 \]

With \(\alpha\) = .1 and \(\beta\) = .1, then A = .111, B = 9.0. Since 0.618 is neither
less than A nor greater than B, we have insufficient data to conclude that the
cleanup standard has been either attained or not attained. Thus, more data
must be gathered.

Suppose data continue to be collected for seven more years without a
decision being reached. At that time, the overall average \(\bar{x}_m\) = 5.77 and \(s_{\bar{x}}^2\)
= .1024 for m = 11. Thus,

\[ t = \frac{5.77 - \dfrac{6 + 5.72}{2}}{\sqrt{\dfrac{.1024}{11}}} = -.933 \quad \text{and} \quad \delta = \frac{5.72 - 6}{\sqrt{\dfrac{.1024}{11}}} = -2.902 \]

\[ LR = \exp\left[ -2.902 \cdot \frac{9}{11} \cdot (-.933) \sqrt{\frac{11}{11 - 1 + (-.933)^2}} \right] = 9.29 \]

Since LR = 9.29 > 9.0, we conclude that the mean ground water concentrations
are less than the cleanup standard.
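
The sequential character of the test can be illustrated by running it once per year on the Table 8.1 data, starting after three years. This sketch assumes the forms of t, \(\delta\), the likelihood ratio, and the boundaries A and B described in Boxes 9.8 through 9.10; the function name and decision labels are hypothetical. Because it works from unrounded yearly averages, its likelihood ratio after four years is about 0.62 rather than exactly the 0.618 obtained above from rounded inputs.

```python
import math


def sequential_mean_test(yearly_averages, cs, mu1, alpha, beta):
    """Return (LR, decision) for the m yearly averages collected so far."""
    m = len(yearly_averages)
    x_bar = sum(yearly_averages) / m
    s2 = sum((x - x_bar) ** 2 for x in yearly_averages) / (m - 1)
    std_err = math.sqrt(s2 / m)
    t = (x_bar - (cs + mu1) / 2.0) / std_err
    delta = (mu1 - cs) / std_err
    lr = math.exp(delta * (m - 2) / m * t * math.sqrt(m / (m - 1 + t * t)))
    a, b = beta / (1.0 - alpha), (1.0 - beta) / alpha
    if lr <= a:
        decision = "not attained"
    elif lr > b:
        decision = "mean below standard"
    else:
        decision = "more data"
    return lr, decision


data = [
    [6.40, 5.91, 4.51, 5.57],  # 1984
    [7.21, 6.19, 4.89, 5.51],  # 1985
    [6.57, 5.70, 5.32, 5.87],  # 1986
    [6.57, 6.25, 5.59, 6.56],  # 1987
]
yearly = [sum(row) / len(row) for row in data]

# The first test is made after three full years, then once per additional year.
results = [
    sequential_mean_test(yearly[:m], cs=6.0, mu1=5.72, alpha=0.1, beta=0.1)
    for m in range(3, len(yearly) + 1)
]
for m, (lr, decision) in zip(range(3, len(yearly) + 1), results):
    print(m, round(lr, 3), decision)
```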

When the data are noticeably skewed, the calculation procedures using the
log transformed yearly averages (Box 9.12) are recommended over those in Box 9.4.
Because the procedures in Box 9.12 also perform well when the data have a symmetric
distribution, these procedures are generally recommended in all cases where there are no
missing data. There is no easy adjustment for missing data when using the log transformed
yearly averages. Therefore, if the number of observations per season (month, etc.) is not
the same for all seasons and if there is any seasonal pattern in the data, use of the
procedures in Box 9.4 is recommended.

The calculation procedure when using the log transformed yearly averages
is described below and summarized in Box 9.12. The calculations are slightly more
difficult than when using the untransformed yearly averages. After calculating the yearly
averages, the natural log is used to transform the data. The transformed averages are
then used in the subsequent analysis. The upper confidence interval for the mean
concentration is based on the mean and variance of the log transformed yearly averages. The
formulas are based on the assumption that the yearly averages have a lognormal
distribution.
Box 9.12
Steps for Assessing Attainment Using the Log Transformed Yearly Averages

(1) Calculate the yearly averages (see Box 9.5)
(2) Calculate the natural log of the yearly averages (see Box 9.13)
(3) Calculate the mean, \(\bar{y}_m\), and variance, \(s_y^2\), of the log transformed
yearly averages (see Box 9.14)
(4) Calculate the t and \(\delta\) for the likelihood ratio (Box 9.15)
(5) Calculate the likelihood ratio (Box 9.9)
(6) Decide whether the ground water attains the cleanup standards
(Box 9.10).
(7) If more data are required, collect an additional year's samples and
repeat the procedures in this Box.

Use the formulas in Box 9.5 for calculating the yearly averages. If there are
missing observations within a year, average the non-missing observations. Calculate the
log transformed yearly averages using equation (9.13) in Box 9.13. The natural log
transformation is available on many calculators and computers, usually designated as
"LN", "ln", or "loge." Although the equations could be changed to use the base 10
logarithms, use only the base e logarithms when using the equations in Boxes 9.13 through
9.15. Calculate the mean and variance of the log transformed yearly averages using the
equations in Box 9.14. The variance will have degrees of freedom equal to one less than
the number of years over which the data were collected.
Box 9.13
Calculation of the Natural Logs of the Yearly Averages
The natural log of the yearly average is:

\[ y_k = \ln(\bar{x}_k) \qquad (9.13) \]
Box 9.14
Calculation of the Mean and Variance of the Natural Logs of the Yearly
Averages

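The average of the m log transformed yearly averages, \(\bar{y}_m\), is:

\[ \bar{y}_m = \frac{1}{m} \sum_{k=1}^{m} y_k \qquad (9.14) \]

The variance of the log transformed yearly averages, \(s_y^2\), is:

\[ s_y^2 = \frac{\sum_{k=1}^{m} (y_k - \bar{y}_m)^2}{m-1} = \frac{\sum_{k=1}^{m} y_k^2 - m\,\bar{y}_m^2}{m-1} \qquad (9.15) \]

This variance estimate has m-1 degrees of freedom.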
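
The log transformation and the summary statistics of Boxes 9.13 and 9.14 can be sketched as below, using natural (base e) logarithms as the text requires. The function name is hypothetical and the yearly averages are those of the Table 8.1 arsenic data.

```python
import math


def log_mean_and_variance(yearly_averages):
    """Mean and variance (m - 1 degrees of freedom) of the natural logs of the yearly averages."""
    logs = [math.log(x) for x in yearly_averages]  # math.log is the base e logarithm
    m = len(logs)
    y_bar = sum(logs) / m
    s2_y = sum((y - y_bar) ** 2 for y in logs) / (m - 1)
    return y_bar, s2_y


yearly_averages = [5.5975, 5.95, 5.865, 6.2425]  # 1984-1987, from Table 8.1
y_bar_m, s2_y = log_mean_and_variance(yearly_averages)
print(y_bar_m, s2_y)
```

Exponentiating the mean of the logs gives the geometric mean of the yearly averages, which is a convenient sanity check on the transformation.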

Using the mean \(\bar{y}_m\), and the variance of the mean calculated from the log
transformed yearly averages, \(s_y^2\), calculate t and \(\delta\) using equations (9.16) and (9.17) in Box
9.15. These values are used in the calculation of the likelihood ratio.
Box 9.15
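Calculation of t and \(\delta\) When Using the Log Transformed Yearly Averages

\[ t = \frac{\bar{y}_m + \dfrac{s_y^2}{2} - \dfrac{\ln(Cs) + \ln(\mu_1)}{2}}{\sqrt{\dfrac{s_y^2}{m} + \dfrac{s_y^4}{2(m-1)}}} \qquad (9.16) \]

\[ \delta = \frac{\ln(\mu_1) - \ln(Cs)}{\sqrt{\dfrac{s_y^2}{m} + \dfrac{s_y^4}{2(m-1)}}} \qquad (9.17) \]

where the degrees of freedom (Df) associated with \(s_y^2\) is m-1.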
Use equation (9.11) in Box 9.9 to calculate the likelihood ratio for the
sequential test. Finally, the likelihood ratio, \(\alpha\), and \(\beta\) are used to decide if the average
concentration is less than the cleanup standard. If the average is less than the cleanup
standard and if the concentrations are not increasing over time, conclude that the tested
ground water attains the cleanup standard. If the ground water from all wells or groups of
wells attains the cleanup standard then conclude that the ground water at the site attains the
cleanup standard. If the average concentration is not less than the cleanup standard or if the
concentrations are increasing over time, conclude that the ground water in the well does not
attain the cleanup standard. The steps in deciding attainment of the cleanup standard are
shown in Box 9.10.

9.4 Assessing Attainment of the Mean After Adjusting for Seasonal
Variation

This section provides an alternative procedure for testing if the mean
concentration is less than the cleanup standard. It is expected to provide more accurate
results when there are many samples per year, the data are serially correlated, and the
distribution of the data is not skewed. Because this procedure is sensitive to skewness in
the data, it is recommended only if the distribution of the measurement errors is reasonably
symmetric.

After the data have been collected using the guidelines indicated in
Chapter 4, wells can be tested individually or a group of wells can be tested jointly. In
the latter case, the data for the individual wells at each point in time are used to produce a
summary measure for the group as a whole. This summary measure may be an average,
maximum, or some other measure (see Chapter 2). These summary measures will be
averaged over the entire sampling period. The steps involved for incorporating seasonal
adjustments and serial correlations into the calculations associated with the statistical tests
are discussed below.

The calculations and procedures for assessing the mean after adjusting for
seasonal variation are described below and summarized in Box 9.16. An example is
provided in Box 9.21. The calculations can be used (with some minor loss in efficiency) if
some observations are missing. With a large proportion of missing observations in any
season, consultation with a statistician is recommended. If the data are obviously skewed,
the procedures described in Box 9.12 which use the log transformed yearly averages are
recommended.

Use the formulas in Box 9.7 for calculating the seasonal averages and the
mean of the seasonal averages. If there are missing observations within a season, average
the non-missing observations. Calculate the residuals, the deviations of the measurements
from the respective seasonal means, using equation (9.18) in Box 9.17. Box 9.18 shows
how to calculate the variance of the residuals. The variance will have degrees of freedom
equal to the number of measurements less the number of seasons.
Box 9.16
Steps for Assessing Attainment Using the Mean After Adjusting for Seasonal
Variation

(1) Calculate the seasonal averages and the mean of the seasonal
averages, \(\bar{x}_{ms}\) (Box 9.7)

(2) Calculate the residuals, the differences between the observations and
the corresponding seasonal averages (Box 9.17)

(3) Calculate the variance, \(s^2\), of the residuals (see Box 9.18)

(4) Calculate the lag 1 serial correlation of the residuals using equation
(9.20) in Box 9.19. Denote the computed serial correlation by
\(\hat{\phi}_{obs}\).

(5) Calculate the t statistic based on the mean, \(\bar{x}_{ms}\), the standard
deviation s, and \(\hat{\phi}_{obs}\) (Box 9.20)

(6) Calculate the likelihood ratio (Box 9.21)

(7) Decide whether the ground water attains the cleanup standards
(Box 9.10).
Box 9.17
Calculation of the Residuals

From each sample observation, subtract the corresponding seasonal mean.
That is, compute $e_{jk}$, the deviation from the seasonal mean:

$e_{jk} = x_{jk} - \bar{x}_{j}$     (9.18)

Using the mean of the seasonal averages and the variance of the residuals,
$s^2$, calculate t and $\delta$ using equations (9.21) and (9.22) in Box 9.20. These values are used
in the calculation of the likelihood ratio.
Box 9.18
Calculation of the Variance of the Residuals

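Calculate the variance of the residuals $e_{jk}$, reflecting adjustments for
possible seasonal differences, using the equation in Box 8.12:

$s^2 = \frac{\sum_{j}\sum_{k} e_{jk}^{2}}{N - n}$     (9.19)

where N is the number of measurements and n is the number of seasons, so that
the variance has N - n degrees of freedom.

Alternatively, the ANOVA approach described in Appendix D can be used
to compute the required variance.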
Box 9.19
Calculating the Serial Correlation from the Residuals After Removing
Seasonal Averages

The sample estimate of the serial correlation of the residuals is:

$\hat{\phi}_{obs} = \frac{\sum_{i=2}^{N} e_{i}\, e_{i-1}}{\sum_{i=1}^{N} e_{i}^{2}}$     (8.18)

where $e_i$, i = 1, 2, ..., N, are the residuals after removing seasonal averages,
in the time order in which the samples were collected.
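
A minimal sketch of this calculation in Python is given below, assuming the usual lag 1 estimate: the sum of products of successive residuals divided by the sum of squared residuals. The function name `lag1_serial_correlation` is hypothetical, and the residuals are assumed to be supplied in the order in which the samples were collected.

```python
def lag1_serial_correlation(residuals):
    """Lag 1 serial correlation of residuals given in collection order."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: products of each residual with the one before it.
    numerator = sum(
        residuals[i] * residuals[i - 1] for i in range(1, len(residuals))
    )
    # Denominator: sum of squares of all residuals.
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Made-up residuals that alternate in sign give a negative correlation.
phi_alternating = lag1_serial_correlation([1.0, -1.0, 1.0, -1.0])
# Made-up residuals that persist in sign give a positive correlation.
phi_persistent = lag1_serial_correlation([1.0, 1.0, 1.0, 1.0])
```

Ordering matters: shuffling the residuals changes the products in the numerator, so the time order of collection has to be preserved when the residuals are assembled.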
Box 9.20
Calculation of t and $\delta$ When Using the Mean Corrected for Seasonal
Variation

$t = \frac{\bar{x}_{ms} - \frac{Cs + \mu_1}{2}}{\sqrt{\frac{s^2}{N}\,\frac{1 + \hat{\phi}_{obs}}{1 - \hat{\phi}_{obs}}}}$     (9.20)

$\delta = \frac{\mu_1 - Cs}{\sqrt{\frac{s^2}{N}\,\frac{1 + \hat{\phi}_{obs}}{1 - \hat{\phi}_{obs}}}}$     (9.21)

where $\bar{x}_{ms}$ is the mean level of contamination computed from equation
(9.7), and $s^2$ is the variance of the observations computed from equation
(9.16). The degrees of freedom, Df, associated with these estimates is
[expression not legible in source].

Use the formula in Box 9.21 to calculate the likelihood ratio for the sequential
test. Although this formula for calculating the likelihood ratio looks different from that
used with the yearly averages (see Box 9.9), the two formulas are equivalent.
Box 9.21
Calculation of the Likelihood Ratio for the Sequential Test When Adjusting
for Serial Correlation

The likelihood ratio is:

LR = [equation not legible in source]     (9.22)

where Df is the degrees of freedom for $s^2$.
Box 9.22
Example Calculation of Sequential Test Statistics After Adjustments for
Seasonal Effects and Serial Correlation

In Box 8.21, a test was performed for a fixed sample size after adjusting for
seasonal effects and serial correlation. We will use the same data (from
Table 8.1) to conduct the corresponding sequential test after four years of
data collection. From Box 8.21 we have $\bar{x}$ = 5.914, $s^2$ = .163, $\hat{\phi}_{obs}$ = .37,
Cs = 6.0, m = 4, and N = 16. We will stipulate that $\alpha$ = .1, $\beta$ = .1, and $\mu_1$
= 5.72. Thus,

$t = \frac{\bar{x} - \frac{Cs + \mu_1}{2}}{\sqrt{\frac{s^2}{N}\,\frac{1 + \hat{\phi}_{obs}}{1 - \hat{\phi}_{obs}}}} = \frac{5.914 - \frac{6 + 5.72}{2}}{\sqrt{\frac{.0706}{16}\,\frac{1 + .37}{1 - .37}}} = 0.551$

$\delta = \frac{\mu_1 - Cs}{\sqrt{\frac{s^2}{N}\,\frac{1 + \hat{\phi}_{obs}}{1 - \hat{\phi}_{obs}}}} = \frac{5.72 - 6}{\sqrt{\frac{.0706}{16}\,\frac{1 + .37}{1 - .37}}} = -2.858$

and Df = 4. The likelihood ratio computed from Box 9.21 is LR = 0.746.

With $\alpha$ = .1 and $\beta$ = .1, A = .111 and B = 9.0. Since 0.746 is neither
less than A nor greater than B, we have insufficient data to conclude that the
cleanup standard has been either attained or not attained. Thus, more data
must be gathered.
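
The arithmetic for t, $\delta$, A, and B in this example can be checked with a short Python sketch. The function names are hypothetical. The sketch uses the value .0706 that appears under the square root in the worked calculation, and it assumes the usual sequential-test limits A = $\beta$/(1 - $\alpha$) and B = (1 - $\beta$)/$\alpha$, which reproduce the values .111 and 9.0 quoted above. The direction of the decision (a small ratio taken as "not attained", a large ratio as "attained") is inferred from Section 9.6; Box 9.10 remains the authoritative rule. The likelihood ratio itself is not recomputed here.

```python
import math


def t_and_delta(xbar, cs, mu1, s2, n_obs, phi):
    """t and delta with the serial-correlation adjustment of the standard error."""
    std_err = math.sqrt((s2 / n_obs) * (1 + phi) / (1 - phi))
    t = (xbar - (cs + mu1) / 2) / std_err
    delta = (mu1 - cs) / std_err
    return t, delta


def decision_limits(alpha, beta):
    """Assumed limits: A = beta / (1 - alpha), B = (1 - beta) / alpha."""
    return beta / (1 - alpha), (1 - beta) / alpha


def decide(likelihood_ratio, a_limit, b_limit):
    """Assumed decision direction (see Section 9.6); Box 9.10 governs."""
    if likelihood_ratio <= a_limit:
        return "not attained"
    if likelihood_ratio > b_limit:
        return "attained"
    return "continue sampling"


t, delta = t_and_delta(5.914, 6.0, 5.72, 0.0706, 16, 0.37)
A, B = decision_limits(0.1, 0.1)
outcome = decide(0.746, A, B)  # 0.746 is the ratio reported in the example
```

With these inputs the sketch gives t and $\delta$ that round to the values shown in the box, and the reported ratio falls between the two limits.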
9.5 Sequential Tests for Proportions

In general, sequential procedures for testing proportions require that more samples be
collected before starting the first test of hypothesis than when testing the mean. If the
parameter to be tested is the proportion of contaminated samples from either one well or an
array of wells, the sample collection and analysis procedures are the same as those outlined
above for testing the mean, with the following changes:

• To apply this test, each ground water sample measurement is coded
either "1" (the actual measurement was equal to or above the
cleanup standard Cs) or "0" (below Cs). The statistical analysis is
based on the resulting coded variable of 0's and 1's.

• Only the analysis procedure which uses yearly averages is appropriate
for the calculations (Box 9.4). Do not use either of the
calculation procedures in Boxes 9.12 or 9.16.

• A total of at least p- samples should be collected before using the
statistical procedures to determine, on a yearly basis, whether
sampling can be stopped and a decision can be made.
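
The coding step in the list above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, and the yearly average of the coded values (that is, the yearly proportion of samples at or above Cs) is assumed to be the quantity carried into the yearly-average procedure of Box 9.4.

```python
def code_exceedances(measurements, cleanup_standard):
    """Code each measurement 1 if it is at or above the standard, else 0."""
    return [1 if x >= cleanup_standard else 0 for x in measurements]


def yearly_proportions(coded_by_year):
    """Average the 0/1 codes within each year."""
    return [sum(year) / len(year) for year in coded_by_year]


# Made-up measurements compared against a cleanup standard of 6.0.
coded = code_exceedances([5.9, 6.0, 6.2, 4.1], 6.0)
proportions = yearly_proportions([[0, 1, 1, 0], [0, 0, 0, 1]])
```

Note that a measurement exactly equal to the cleanup standard is coded 1, matching the "equal to or above" wording in the text.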
9.6 A Further Note on Sequential Testing

It should be noted that sequential testing, as discussed in this chapter, has a
small chance of continuing for a very long time if the data gathered provide insufficient
evidence for making a clear-cut determination. A stopping rule, such as the following, can
be implemented to handle such cases: determine the sample size necessary for a fixed
sample test for the specified values of Cs, $\mu_1$, $\alpha$, and $\beta$ (data collected during the sampling
for assessing attainment can be used to estimate the variance so the sample size can be
computed). Call this sample size $m_{fixed}$. If the number of years of sample collection
exceeds twice $m_{fixed}$, determine the likelihood ratio. If the likelihood ratio is less than 1.0,
conclude that the ground water does not attain the cleanup standard. If the likelihood ratio
is greater than 1.0, conclude that the mean concentration is less than the cleanup standard
and test whether there is a significant positive slope in the data.
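
The stopping rule just described can be written as a small decision function. The sketch below is illustrative only; the function name and the returned strings are hypothetical, and the case of a likelihood ratio exactly equal to 1.0, which the text does not address, is flagged rather than resolved.

```python
def apply_stopping_rule(years_sampled, m_fixed, likelihood_ratio):
    """Apply the truncation rule once sampling exceeds twice m_fixed years."""
    if years_sampled <= 2 * m_fixed:
        # The rule is not triggered; the ordinary sequential test continues.
        return "continue sequential testing"
    if likelihood_ratio < 1.0:
        return "does not attain the cleanup standard"
    if likelihood_ratio > 1.0:
        return "mean below cleanup standard; test for positive slope"
    return "not addressed by the rule (likelihood ratio equals 1.0)"


# Made-up inputs: a fixed-sample design of 4 years, so the rule applies after year 8.
early = apply_stopping_rule(6, 4, 0.75)
late_low = apply_stopping_rule(9, 4, 0.75)
late_high = apply_stopping_rule(9, 4, 2.5)
```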
9.7 Checking for Trends in Contaminant Levels After Attaining the
Cleanup Standard


to do so. By establishing a simple regression model with the contaminant measure as the
dependent variable and time as the independent variable, a test of significance can be made
as to whether or not the estimated slope of the resulting linear model is positive (see Section
6.1.3). Scatter plots of the data will prove useful in assessing the model. When using the
yearly averages, the regression can be performed without adjusting for serial correlation.
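
A minimal sketch of the slope calculation for such a regression is shown below, assuming ordinary least squares with time as the independent variable and no adjustment for serial correlation (as the text allows when yearly averages are used). The function name is hypothetical. The sketch returns the t statistic and its degrees of freedom; the comparison against a one-sided critical value from a t table, and the full procedure, are those of Section 6.1.3.

```python
import math


def slope_t_statistic(times, values):
    """Least-squares slope, its t statistic, and degrees of freedom (n - 2).

    Assumes at least three points that do not fall exactly on a line.
    """
    n = len(times)
    if n < 3:
        raise ValueError("need at least three points")
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    # Residual sum of squares about the fitted line.
    sse = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, values))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / se_slope, n - 2


# Made-up yearly averages for five years.
slope, t_stat, df = slope_t_statistic([1, 2, 3, 4, 5], [2.1, 1.9, 3.2, 3.8, 5.1])
```

A positive slope with a t statistic above the chosen one-sided critical value for the returned degrees of freedom would indicate a significant upward trend; a scatter plot of the same data helps judge whether a straight line is a reasonable model.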

This chapter presented the procedures for assessing attainment of the
cleanup standard for ground water measurements using a sequential statistical test. For
most statistical tests or procedures, the analysis is performed after the entire sample has
been collected and the laboratory results are complete. However, in sequential testing, the
samples are analyzed as they are collected. A statistical analysis of the data collected so far
is used to determine whether another year's worth of samples should be collected or
whether the analysis should terminate.

We presented three alternate procedures for assessing attainment using
sequential tests. Two procedures use the yearly average concentrations: one assumes the
yearly average has a normal distribution; the other assumes a lognormal distribution. The
third procedure uses the individual observations and makes a correction for seasonal
patterns and serial correlations. In general, the method which assumes the yearly averages
have a lognormal distribution is recommended.

These testing procedures can be applied to samples from either individual
wells or wells tested as a group. These procedures are used after the ground water has
achieved steady state. If the ground water at the site is judged to attain the cleanup
standards because the concentrations are not increasing and the long-term average is
significantly less than the cleanup standard, follow-up monitoring is recommended to check
that the steady state assumption holds.
-------
BIBLIOGRAPHY
Abraham, B. and Ledolter, J., 1983, Statistical Methods for Forecasting. New
York: John Wiley and Sons, Inc.

Albers, W., 1978, "One Sample Rank Tests Under Autoregressive Dependence,"
Annals of Statistics, Vol. 6, No. 4: 836-845.

Albers, W., 1978, "Testing the Mean of a Normal Population Under Dependence,"
Annals of Statistics, Vol. 6, No. 6: 1337-1344.

Armitage, P. 1947, "Some Sequential Tests of Student's Hypothesis," Journal of the
Royal Statistical Society, Series B, Vol. 9 : 250-263.

Armitage, P. 1957, "Restricted Sequential Procedures," Biometrika, Vol. 44 : 9-26.

Barcelona, M., Gibb, J., and Miller, R., 1983, A Guide to the Selection of
Materials For Monitoring Well Construction and Ground-Water Sampling.
Illinois State Water Survey, Champaign, Illinois, US EPA- RSKERL, EPA

Barcelona, M., Gibb, J., Helfrich, J., and Garske, E., 1985, Practical Guide
for Ground-Water Sampling, Illinois State Water Survey, Champaign, Illinois,
USEPA-RSKERL, EPA 600/2-85/104.

Barnett, V., and Lewis, T., 1984, Outliers in Statistical Data. New York: John
Wiley and Sons, Inc.

Bartels, R., 1982, "The Rank Version of von Neumann's Ratio Test for Randomness,"
Journal of the American Statistical Association, Vol. 77, No. 377: 40-46.

Bauer, P., and Hackl, P., 1978, "The Use of MOSUMS for Quality Control,"
Technometrics, Vol. 20, No. 4: 431-436.

Bell, C., and Smith, E., 1986, "Inference for Non-Negative Autoregressive
Schemes," Communications in Statistics: Theory and Methods, Vol. 18, No.
8: 2267-2293.

Berthouex, P., Hunter, W., and Pallesen, L., 1978, "Monitoring Sewage
Treatment Plants: Some Quality Control Aspects," Journal of Quality
Technology, Vol. 10, No. 4: 139-149.

Bisgaard, S., and Hunter, W. G., 1986, Report No. 7, Studies in Quality
Improvement: Designing Environmental Regulations. Center for Quality and
Productivity Improvement, University of Wisconsin-Madison, (February
-------
APPENDIX F: DERIVATIONS AND EQUATIONS

Simulations

Preliminary simulations using lognormally distributed data and a factorial design with 100
simulations for each set of parameters were used to determine which factors affected the power of
the sequential tests. The factors in the simulations were: scale factor; proportion of the random
variance which is correlated versus independent; lag 1 correlation; presence of a seasonal pattern;
proportion of the observations which were censored; number of samples per year; and $\mu$. Analysis
of the factorial design clearly indicated that the skewness and scale factor were most important in
determining the power of the test. The serial correlation and censoring were also important. The
presence of a cyclical component (which resulted in significant changes in the variance throughout
the year) did not significantly affect the power of the test.

As a result of these preliminary simulations, further simulations were run using scale factors
ranging from 1.6 to 4.8, $\alpha$ = $\beta$ = .05, $\mu$ = $\mu_0$ or $\mu_1$, and the following distributions and sampling
designs:

(1) Normal distribution with independent errors and 4 samples per year;
(2) Lognormal distribution with coefficient of variation of 0.5, independent errors, and
4 samples per year. This is the basic distribution. The following simulations are all
based on changes to the basic distribution;
(3) The basic distribution with 12 observations per year;
(4) The basic distribution but more skewed, with a coefficient of variation of 1.5;
(5) The basic distribution with censoring of 30% of the data (censored values were set
equal to the detection limit);
(6) The basic distribution with correlated errors, where the serial correlation between log
transformed monthly observations is 0.8; and
(7) Data which are both skewed and correlated, with a coefficient of variation of 1.5 and
a serial correlation between log transformed monthly observations of 0.8. For this set
of simulations, the random error was the sum of two components, one random,
representing random measurement error, and the second correlated, reflecting
correlations in the groundwater concentrations. The correlated error made up
75% of the total error variance.
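
As a rough illustration of how data like those in designs (2) and (5) might be generated, the sketch below draws independent lognormal values with a given mean and coefficient of variation and then sets values below a detection limit equal to that limit. It is not the simulation code used in the study. The function names, the seed, and the detection limit value are assumptions for the sketch, and the correlated-error designs (6) and (7) are not reproduced. The parameter conversion uses the standard lognormal relations: the variance of the logs is ln(1 + CV^2), and the mean of the logs is ln(mean) minus half that variance.

```python
import math
import random


def lognormal_parameters(mean, cv):
    """Convert a mean and coefficient of variation to log-scale mu and sigma."""
    sigma2 = math.log(1.0 + cv * cv)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def simulate_censored_series(mean, cv, n_obs, detection_limit, seed=0):
    """Independent lognormal values; values below the limit are set to the limit."""
    rng = random.Random(seed)
    mu, sigma = lognormal_parameters(mean, cv)
    values = [rng.lognormvariate(mu, sigma) for _ in range(n_obs)]
    return [max(x, detection_limit) for x in values]


mu, sigma = lognormal_parameters(1.0, 0.5)
# Six years of quarterly data with a hypothetical detection limit of 0.8.
series = simulate_censored_series(1.0, 0.5, 24, 0.8, seed=1)
```

To match a target censoring fraction such as the 30% in design (5), the detection limit would have to be chosen to correspond to that percentile of the distribution; the sketch leaves that choice to the caller.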

For each test and each set of simulations with the same distributional assumptions, Figure 8 shows
the range in the false positive rate across simulations. Figure 9 shows similar information for the
false negative rate.

As can be seen from Figure 8, the false positive rates for the tests are close to the nominal level of
0.05 when the data have a normal distribution, as desired. For skewed and correlated data, the
false positive rate generally exceeds the nominal level.

For skewed and correlated data, the false positive rate for the standard sequential t-test exceeds the
nominal value for all simulations. The modified test and the modified test with
adjustments for seasonal patterns and serial correlations had similar false positive rates. Both of
these tests are sensitive to correlated and skewed data. The false positive rate for the modified test
adjusted for skewness is lower than for the other three tests. Only for correlated data does this test
have a false positive rate consistently greater than the nominal level. Censoring resulted in a
relative decrease in the false positive rate. Of the tests based on the modified sequential t-test, the
test with adjustments for skewness had the lowest average sample sizes and lowest false positive
rates.

Based on both the average sample sizes and false positive rates from the simulations, the modified
test adjusted for skewness is preferred over the other sequential tests. To the extent that the false
positive rate exceeds the nominal level for skewed and correlated data, the power can be improved
by using two year averages instead of one year averages. Results for the skewed and correlated
data using two year averages are also shown in Figure 8.

As shown in Figure 9, the false negative rate for all tests was generally similar to or less than the
nominal level. The false negative rate for the standard sequential t-test exceeded that for the
procedures based on the modified test. For all tests, the false negative rate increased greatly in the
presence of censoring. Of the procedures based on the modified test, the modified test adjusted for
skewness had a false negative rate closest to the nominal level under the simulated conditions.
Although the average sample sizes for the tests were similar, the test adjusted for skewness had
the highest average sample sizes. At the alternate hypothesis no one calculation procedure is clearly
preferred; however, the modified test has false negative rates lower than the nominal value for all
but censored observations and is the simplest to calculate.

The sample sizes for the skewed data were similar to those for the normally distributed data for
which the sequential test required fewer samples, on the average, than the equivalent fixed sample
size test. Therefore, it is likely that the sequential tests would also have a lower average sample size
than a fixed sample size test where the sample size calculations accounted for the skewed and/or
correlated nature of the data.

6. Conclusions and Discussion

For assessing attainment of Superfund cleanup standards based on the mean contaminant levels
using sequential tests, the conclusions from this simulation study are:

• Given the situations found at Superfund sites, a sequential test can reduce the number of
samples compared to that for an equivalent fixed sample size test;

• The standard sequential t-test can have false negative rates greater than the nominal value;

• An adjustment factor can be used to improve the power performance of the sequential t-test
without greatly increasing the sample sizes. Different criteria will result in the selection of
different adjustment factors; however, all of the adjustment factors considered improved
the performance of the test. In this paper, the adjustment factor (n-2)/n was evaluated;

• Use of a simple approximation to the likelihood ratio performs well compared to that based
on the non-central t distribution;

• Sampling rules which terminate the sequential test if the number of samples exceeds twice
the sample size for the equivalent fixed sample size test are likely to have little effect on the
power of the sequential t-test;

• A modified sequential t-test with an adjustment for skewness has the lowest false positive
rate among the tests considered and has acceptable false negative rates and sample sizes
relative to the other tests; and

• All test procedures were sensitive to censored data.

The procedures used here set censored values equal to the detection limit. Other possible
approaches place censored values at half the detection limit or at zero. Further work is required to
determine how the sequential tests perform using different rules for handling values below the
detection limit. The decision rule which places censored values at the detection level was chosen to
protect human health and the environment when assessing attainment at Superfund sites.

The problem of testing multiple wells and contaminants is particularly troublesome when the
decision rule requires that all wells and all contaminants must attain the relevant cleanup standards.
Even if all concentrations are below the cleanup standard, the probability of a false negative on any
one of several statistical tests increases the probability of falsely concluding that additional cleanup
is required. The false negative rates for the modified sequential tests considered in this paper are
generally lower than the nominal value for all but censored data. Therefore, use of these tests will
generally not contribute, beyond that planned for in the sample and analysis plan, to incorrectly
concluding that the ground water attains the cleanup standard unless the data are censored.

All of the power curves are based on the assumption that the standard deviation will remain
constant as the mean changes. Another possible assumption is that the coefficient of variation will
remain constant as the mean changes. While the assumption about how the standard deviation
changes as the mean changes does not affect the conclusions presented, the actual shape of the
power curves will depend on the assumptions made.

Finally, these modified sequential t-tests can also be used when the alternate hypothesis is greater
than the null hypothesis. The results above can be applied if the false negative and false positive
labels are reversed. For compliance monitoring, i.e., to answer the question: do the concentrations
exceed an action level?, all of the modified sequential tests perform well if the data are not
censored. With censored data, alternate rules for handling the observations below the detection
level should be considered.

Bibliography

Ghosh, B. K., 1970, Sequential Tests of Statistical Hypotheses, Reading MA, Addison
Wesley.

Hall, W. J., 1962, "Some Sequential Analogs of Stein's Two Stage Test," Biometrika, Vol. 49:
367-378.

Hayre, L. S., 1983, "An Alternative to the Sequential T-Test," Sankhya: The Indian Journal of
Statistics, 45, Series A, Pt. 3: 288-300.

Liebetrau, A. M., 1979, "Water Quality Sampling: Some Statistical Considerations," Water
Resources Research, Vol. 16, No. 6: 1717-1725.

Loftis, J., Montgomery, R., Harris, J., Nettles, D., Porter, P., Ward, R., and
Sanders, T., 1986, "Monitoring Strategies for Ground Water Quality Management,"
Prepared for the United States Geological Service by Colorado State University, Fort
Collins Colorado.

Rushton, S., 1950, "On a Sequential t-Test," Biometrika, Vol. 37: 326-333.

Wald, A., 1947, Sequential Analysis. New York: Dover Publications.
-------
Figure 1 Example of Simulated Monthly Ground Water Data
[Figure: monthly measurements plotted against time in years (0 to 6), with the cleanup standard marked.]
-------
Figure 2 Power Curve and Average Sample Size for a
Sequential t-Test
[Figure: power of the sequential test, nominal power, average sample size for the sequential test, and sample size for the fixed test, plotted against the mean of simulated measurements (0.3 to 1.2).]
-------
Figure 3 False Decision Rate and Sample Size versus Scale
Factor (Centered test)
[Figure: power of the sequential test, nominal false decision rate, average sample size for the sequential test, and sample size for the fixed test, plotted against scale factor.]
-------
Figure 4 Distribution of Sample Sizes for the Centered and
Modified Sequential t-test, by Test Result
[Figure: panel for true mean = alternate hypothesis.]

-------
Figure 6 Power Curve and Average Sample Size for
the Modified Sequential t-Test
[Figure: power of the sequential test, nominal power, average sample size for the sequential test, and sample size for the fixed test, plotted against the mean of simulated measurements (.30 to 1.20).]
-------
Figure 7 Sample Size Distribution for Modified Sequential t
Test versus Mean
[Figure: percentiles of the sequential test sample size (5%, 10%, 25%, 50%, 75%, 90%, 95%), average sample size for the sequential test, and sample size for the fixed test, plotted against the mean of simulated measurements (0.25 to 1.25).]
-------
BIBLIOGRAPHY
Bishop, T., 1985, Statistical View of Detection and Determination Limits in Chemical
Analyses prepared for the Committee on Applications of Statistical Techniques
in Chemical Problems. Columbus, Ohio: Battelle Columbus Laboratories,
[January 5, 1982].

Box, G. E. P., and Jenkins, G. M., 1970, Time Series Analysis: Forecasting and
Control. San Francisco: Holden-Day.

Box, G. E. P., Hunter, W. G., and Hunter, J. S., 1987, Statistics For
Experimenters. New York: John Wiley and Sons, Inc.

Bross, Irwin D., 1985, "Why Proof of Safety is Much More Difficult Than Proof of
Hazard," Biometrics, Vol. 41: 785-793.

Brown, G. H, and Fisher, N. I., 1972, "Subsampling a Mixture of Sampled
Material," Technometrics, Vol. 14, No. 3: 663-668.

Brown, M. B., and Wolfe, R. A., 1983b, "Estimation of the Variance of Percentile
Estimates," Computational Statistics and Data Analysis, Vol. 1: 167-174.

Brownlee, K. A., 1965, Statistical Theory and Methodology in Science and
Engineering, 2nd ed. New York: John Wiley and Sons, Inc.

Cantor, L. W., and Knox, R. C., 1986, Groundwater Pollution Control. Chelsea,
Michigan: Lewis Publishers.

Cantor, L. W., Knox, R. C., and Fairchild, D. M., 1987, Groundwater Quality
Protection. Chelsea, Michigan: Lewis Publishers.

Casey, D., Nemetz, P. N., and Uyeno, D., 1985, "Efficient Search Procedures for
Extreme Pollutant Values," Environmental Monitoring and Assessment, Vol. 5:
165-176.

Clayton, C. A., Hines, J. W., Hartwell, T. D., and Burrows, P. M., 1986,
Demonstration of a Technique for Estimating Detection Limits with Specified
Assurance Probabilities. Washington, D.C.: EPA, [March 1986].

Cochran, W., 1977, Sampling Techniques. New York: John Wiley and Sons, Inc.

Cohen, A. C., 1961, "Tables for Maximum Likelihood Estimates: Singly Truncated and
Singly Censored Samples," Technometrics, Vol. 3, No. 4: 535-541.

Conover, W. J., 1980, Practical Nonparametric Statistics. New York: John Wiley and
Sons, Inc.

D'Agostino, R. B., 1970, "A Simple Portable Test of Normality: Geary's Test
Revisited," Psychological Bulletin, Vol. 74, No. 2: 138-140.

Draper, N., and Smith, H., 1966, Applied Regression Analysis. New York: John
Wiley and Sons, Inc.

DuMouchel, W.H., Govindarajulu, Z., and Rothman, E., 1973, "Note on
Estimating the Variance of the Sample Mean in Stratified Sampling," Canadian
Journal of Statistics, Vol. 1, No.2: 267-274.

Duncan, A, 1974, Quality Control and Industrial Statistics, Fourth Edition. Homewood
IL: Richard Irwin, Inc.

Elder, R S., Thompson, W. O., and Myers, R. H., 1980, "Properties of
Composite Sampling Procedures," Technometrics, Vol. 22, No. 2: 179-186.

Environ Corporation, 1985a, Principles of Risk Assessment: A Nontechnical Review.
EPA Workshop on Risk Assessment. Easton, Md., March 17-18, 1985.

Environ Corporation; Jellinek, Schwartz, Connolly, and Freshman; and
Temple, Barker, and Sloan, Inc., 1985c, Case Study on Risk
Assessment: Part I. EPA Workshop on Risk Assessment. Easton, Md., March
17-18, 1985.

Environ Corporation; Jellinek, Schwartz, Connolly, and Freshman; and
Temple, Barker, and Sloan, Inc., 1985b, Additional Data on the Risk
Assessment Case: Part II. EPA Workshop on Risk Management. Easton,
Md., March 17-18, 1985.

Fairbanks, K., and Madsen, R., 1982, "P Values for Tests Using a Repeated
Significance Test Design," Biometrika, Vol. 69, No. 1: 69-74.

Farrell, R., 1980, Methods for Classifying Changes in Environmental Conditions,
Technical Report VRF-EPA7. 4-FR80-1, Vector Research Inc., Ann Arbor,
Michigan.

Filliben, J. J., 1975, "Probability Plot Correlation Coefficient Test for Normality,"
Technometrics, Vol. 17, No. 1: 111-117.

Ford, P., Turina, P., and GCA Corporation, 1985, Characterization of Hazardous
Waste Sites-A Methods Manual, Volume I-Site Investigations. Las Vegas,
Nevada: EPA Environmental Monitoring Systems Laboratory, [April 1985].

Fuller, F., and Tsokos, C., 1971, "Time Series Analysis of Water Pollution Data,"
Biometrics, Vol. 27: 1017-1034.

Garner, F.C., 1985, Comprehensive Scheme for Auditing Contract Laboratory Data
[interim report]. Las Vegas, Nevada: Lockheed-EMSCO.

Gastwirth, J.L., and Rubin, H., 1971, "Effect of Dependence on the Level of Some
One-Sample Tests," Journal of the American Statistical Association, Vol. 66:
816-820.

Geraghty and Miller, Inc., 1984, "Annual Report, August 1984, Rollins
Environmental Services, Baton Rouge, Louisiana," Baton Rouge, Louisiana.

Ghosh, B. K., 1970, Sequential Tests of Statistical Hypotheses, Reading MA, Addison
Wesley.

Gilbert, R. O., 1987, Statistical Methods for Environmental Pollution Monitoring.
New York: Van Nostrand Reinhold.

Gilbert, R. O., and Kinnison, R. R., 1981, "Statistical Methods for Estimating the
Mean and Variance from Radionuclide Data Sets Containing Negative,
Unreported or Less-Than Values," Health Physics, Vol. 40: 377-390.

Gilliom, R. J., and Helsel, D. R., 1986, "Estimation of Distributional Parameters
for Censored Trace Level Water Quality Data, 1. Estimation Techniques,"
Water Resources Research, Vol. 22, No. 2: 135-146.

Gleit, A., 1985, "Estimation for Small Normal Data Sets with Detection Limits,"
Environmental Science and Technology, Vol. 19, No. 12: 1201-1206.

Goldstein, B., 1985, Elements of Risk Assessment. EPA Risk Assessment
Conference. Easton, Md., March 18, 1985.

Goodman, I., 1987, "Graphical and Statistical Methods to Assess the Effect of Landfills
on Groundwater Quality," Land Resources Program, University of Wisconsin-
Madison.

Grant, E. L., and Leavenworth, R. S., 1980, Statistical Quality Control Fifth
Edition. New York: McGraw-Hill.

Groeneveld, L., and Duval, R., 1985, "Statistical Procedures and Considerations for
Environmental Management (SPACEMAN)," Prepared for the Florida
Department of Environmental Regulation, Tallahassee, Florida

Grubbs, F. E., 1969, "Procedures for Detecting Outlying Observations in Samples,"
Technometrics, Vol. 11, No. 1: 1-21.

Guttman, I., 1970, Statistical Tolerance Regions: Classical and Bayesian. (Being
Number Twenty-Six of Griffin's Statistical Monographs and Courses edited by
Alan Stuart.) Darien, Conn.: Hafner Publishing.

Hall, W. J., 1962, "Some Sequential Analogs of Stein's Two Stage Test," Biometrika,
Vol 49,: 367-378.

Hansen, M., Hurwitz, W, and Madow, W., 1953, Sample Survey Methods and
Theory, Volume 1. New York: John Wiley and Sons, Inc.

Hayre, L. S., 1983, "An Alternative to the Sequential T-Test," Sankhya: The Indian
Journal of Statistics, 45, Series A, Pt. 3: 288-300.

Hazardous Materials Control Research Institute, 1985, 6th National Conference
on Management of Uncontrolled Hazardous Waste Sites. Washington, D.C.:
HMCRI, [November 4-6, 1985].

Hazardous Materials Control Research Institute, 1986, 7th National Conference
on Management of Uncontrolled Hazardous Waste Sites. Washington, D.C.:
HMCRI, [December 1-3, 1986].

Hazardous Materials Control Research Institute, 1988, 9th National Conference
on Management of Uncontrolled Hazardous Waste Sites. Washington, D.C.:
HMCRI, [November 28-30, 1988].

Helsel, D.R., and Cohn, T.A., 1988, "Estimation of Descriptive Statistics for
Multiply Censored Water Quality Data," Water Resources Research, Vol. 24,
No. 12: 1997-2004.

Helsel, D. R., and Gilliom, R., 1986, "Estimation of Distributional Parameters for
Censored Trace Level Water Quality Data, 2. Verification and Applications,"
Water Resources Research, Vol. 22, No. 2: 147-155.

Hem, J.D., 1989, Study and Interpretation of the Chemical Characteristics of Natural
Water, Third Edition, U.S. Geological Survey Water-Supply Paper 2254.

Hipel, K., Lennox, W., Unny, T., and McLeod, A., 1975, "Intervention
Analysis in Water Resources," Water Resources Research, Vol. 11, No. 3:
567-575.

Hipel, K., McLeod, A., and Lennox, W, 1977, "Advance in Box-Jenkins
Modeling, 1. Model Construction," Water Resources Research, Vol. 12, No.
6: 855-861.

Hirsch, R. M., and Slack, J. R., 1984, "A Nonparametric Trend Test for Seasonal
Data with Serial Dependence," Water Resources Research, Vol. 20, No. 6:
727-732.

Hirsch, R. M., Slack, J. R., and Smith, R. A., 1982, "Techniques for Trend
Analysis for Monthly Water Quality Data," Water Resources Research, Vol. 18,
No. 1: 107-121.

Hoaglin, D.C., Mosteller, F., and Tukey, J.W., 1983, Understanding Robust
and Exploratory Data Analysis. New York: John Wiley and Sons, Inc.

Johnson, N. L., and Kotz, S., 1970, Distributions in Statistics: Continuous
Univariate Distributions - 2, Houghton Mifflin Co.

Johnson, Norman L. and Leone, F. C., 1977, Statistics and Experimental Design
in Engineering and the Physical Sciences. Vol. I, Second Edition. John
Wiley and Sons, Inc.

Joiner, B.L., and Rosenblatt, J. R., 1975, "Some Properties of the Range in
Samples from Tukey's Symmetric Lambda Distributions," Journal of the
American Statistical Association Vol. 66: 394.

Kedem, B., 1980, "Estimation of Parameters in Stationary Autoregressive Processes
After Hard Limiting," Journal of the American Statistical Association, Vol. 75,
No. 369: 146-153.

Land, C. E., 1971, "Confidence Intervals for Linear Functions of the Normal Mean and
Variance," Annals of Mathematical Statistics, Vol. 42, No. 4: 1187-1205.

Land, C. E., 1975, Tables of Confidence for Linear Functions of the Normal Mean and
Variance. Selected Tables in Mathematical Statistics, Vol. III, pp. 385-419.
Providence, R.I.: American Mathematical Society.

Lehmann, E. L., 1975, Nonparametrics: Statistical Methods Based on Ranks. San
Francisco: Holden-Day.

Lettenmaier, D., 1976, "Detection of Trends in Water Quality Data From Records With
Dependent Observations," Water Resources Research, Vol. 12, No. 5: 1037-
1046.

Liebetrau, A. M., 1979, "Water Quality Sampling: Some Statistical Considerations,"
Water Resources Research, Vol. 16, No. 6: 1717-1725.

Liggett, W., 1985, "Statistical Designs for Studying Sources of Contamination," Quality
Assurance for Environmental Measurements, ASTM STP 867, J. K. Taylor
and T. W. Stanley Eds. American Society for Testing Materials, Philadelphia,
22-40.

Locks, M. O., Alexander, M. J., and Byars, B. J., 1963, "New Tables of the
Noncentral t-Distribution," Report ARL63-19, Wright-Patterson Air Force
Base.

Loftis, J. and Ward, R., 1980, "Sampling Frequency for Regulatory Water Quality
Monitoring," Water Resources Bulletin, Vol. 16, No. y. 501-507.

Loftis, J. and Ward, R., 1980, "Water Quality Monitoring-Some Practical Sampling
Frequency Considerations," Environmental Management, Vol. 4, No. 6: 521-
526.

Loftis, J., Montgomery, R., Harris, J., Nettles, D., Porter, P., Ward, R.,
and Sanders, T., 1986, "Monitoring Strategies for Ground Water Quality
Management," Prepared for the United States Geological Service by Colorado
State University, Fort Collins Colorado.

Madow, W. C., and Madow, L. H., 1944, "On the Theory of Systematic
Sampling" Annals of Mathematical Statistics Vol. 15:1-24.

Mage, D. T., 1982, "Objective Graphical Method for Testing Normal Distributional
Assumptions Using Probability Plots," American Statistician, Vol. 36, No. 2:
116-120.

McLeod, A., Hipel, K., and Comancho, F., 1983, "Trend Assessment of Water
Quality Time Series," Water Resources Bulletin, Vol. 19, No. 4: 537-547.

McLeod, A., Hipel, K., Lennox, W., 1977, "Advance in Box-Jenkins Modeling,
1. Applications," Water Resources Research, Vol. 13, No. 3: 577-586.

Mee, Robert W., 1984, "Tolerance Limits and Bounds for Proportions Based on Data
Subject to Measurement Error," Journal of Quality Technology, Vol. 16, No. 2:
74-80.
Mee, Robert W., Owen, D.B., and Shyu, Jyh-Cherng., 1986, "Confidence
Bounds for Misclassification Probabilities Based on Data Subject to
Measurement Error," Journal of Quality Technology, Vol. 18, No. 1: 29-40.

Mendenhall, W., and Ott, L., 1980, Understanding Statistics. N. Scituate, Mass.:
Duxbury Press.

Millard, S., Yearsley, J., and Lettenmaier, D., 1985, "Space-Time Correlation
and Its Effects on Methods for Detecting Aquatic Ecological Change," Canadian Journal of
Fisheries and Aquatic Sciences, Vol. 42: 1391-1400.

Montgomery, R., and Loftis, J., 1987, "Applicability of the t-test for Detecting
Trends in Water Quality Variables," Water Resources Bulletin, Vol. 23, No. 4:
653-662.

Montgomery, R., and Reckhow, H., 1984, "Techniques for Detecting Trends in
Lake Water Quality," Water Resources Bulletin, Vol. 20, No. 1: 43-52.

Natrella, M., 1963, Experimental Statistics. Washington, D.C.: U.S. Department of
Commerce, National Bureau of Standards.

Nelson, J., and Ward, R., 1981, "Statistical Considerations and Sampling
Techniques for Ground-Water Quality Monitoring," Ground Water, Vol. 19,
No. 6: 617-625.

Neter, J., Wasserman, W., and Kutner, M., 1985, Applied Linear Statistical
Models. Homewood Illinois: Irwin.

Neter, J., Wasserman, W., and Whitmore, G., 1982, Applied Statistics. Boston:
Allyn-Bacon.

Noether, C., 1956, "Two Sequential Tests Against Trend," Journal of the American
Statistical Association, September 1956: 440-450.

Ness, R., 1985, "Groundwater Quality," Journal-Water Pollution Control Federation,
Vol. 57, No. 6: 642-649.

Nyer, Evan K., 1985, Groundwater Treatment Technology. New York: Van Nostrand
Reinhold Co.

Oak Ridge National Laboratory, 1984, "Results of the Groundwater Monitoring
Performed at the Former St. Louis Airport Storage Site for the Period January
1981 Through January 1983," ORNL/TM-8879, Oak Ridge, Tennessee.

Owen, D.B., 1963, Factors for One-Sided Tolerance Limits and for Variables Sampling
Plans. Albuquerque, N.M.: Sandia Corporation, [March 1963].

Patel, J.K., 1986, "Tolerance Limits-A Review," Communications in Statistics: Theory
and Methods, Vol. 15, No. 9: 2719-2762.
Pederson, G.L., and Smith, M.M., 1989, U.S. Geological Survey Second National
Symposium of Water Quality; Abstracts of the Technical Sessions. Orlando,
Florida, November 1989.

Pettyjohn, W., 1976, "Monitoring Cyclic Fluctuations in Ground-Water Quality,"
Ground Water, Vol. 14, No. 6: 472-480.

Pucci, A., and Murashige, J., 1987, "Applications of Universal Kriging to an
Aquifer Study in New Jersey," Ground Water, Vol. 25, No. 6: 672-678.

Rendu, J. M., 1979, "Normal and Lognormal Estimation," Mathematical Geology,
Vol. 11, No. 4: 407-422.

Resnikoff, G. J., and Lieberman, G. J., 1957, Tables of the Non-central
t-distribution. Stanford: Stanford University Press.

Rockwell International, 1979, "Hanford Groundwater Modeling-Statistical Methods
for Evaluating Uncertainty and Assessing Sampling Effectiveness," Rockwell
Hanford Operations, Energy Systems Group, Richland Washington, RHO-C-
18.

Rohde, C. A., 1976, "Composite Sampling," Biometrics, Vol. 32: 278-282.

Rohde, C. A., 1979, "Batch, Bulk, and Composite Sampling," Sampling Biological
Populations, pp. 365-367. Edited by R.M. Cormack. Fairland, Md.:
International Cooperative Pub. House.

Rushton, S., 1950, "On a Sequential t-Test," Biometrika, Vol. 37: 326-333.

Rushton, S., 1952, "On a Two-Sided Sequential t-Test," Biometrika, Vol. 39: 302-308.

Sanders, T. and Adrian, D., 1978, "Sampling Frequency for River Quality
Monitoring," Water Resources Research, Vol. 14, No. 4: 569-576.

SAS Institute, 1985, SAS User's Guide: Statistics. Cary, North Carolina.

Schaeffer, D. and Kerster, H., 1988, "Quality Control Approach to NPDES
Compliance Determination," Journal-Water Pollution Control Federation, Vol.
60: 1436-1438.

Scheaffer, R. L., Mendenhall, W., and Ott, L., 1979, Elementary Survey
Sampling, Second Edition. Boston: Duxbury Press.

Schmid, C. F., 1983, Statistical Graphics: Design Principles and Practices. New York:
John Wiley and Sons, Inc.

Schmidt, K., 1977, "Water Quality Variations for Pumping Wells," Ground Water,
Vol. 15, No. 2: 130-137.

Schwartz, J. E., 1985, "Neglected Problem of Measurement Error in Categorical Data,"
Sociological Methods and Research, Vol. 13, No. 4: 435-466.
Schweitzer, G. E., and Santolucito, J. A., editors, 1984, Environmental
Sampling for Hazardous Wastes. ACS Symposium Series 267. Washington,
D.C.: American Chemical Society.

Shapiro, S.S., and Wilk, M.B., 1965, "Analysis of Variance Test for Normality
(Complete Samples)," Biometrika, Vol. 52: 591-611.

Sharpe, K., 1970, "Robustness of Normal Tolerance Intervals," Biometrika. Vol. 57,
No. 1: 71-78.

Siegmund, D., 1985, Sequential Analysis: Tests and Confidence Intervals. New York:
Springer-Verlag.

Sirjaev, A.N., 1973, Statistical Sequential Analysis. Providence, R.I.: American
Mathematical Society.

Size, W. B., editor, 1987, Use and Abuse of Statistical Methods in the Earth Sciences.
New York: Oxford University Press.

Snedecor, G. W., and Cochran, W. G., 1980, Statistical Methods. Seventh Edition.
Ames, Iowa: The Iowa State Press.

Sokal, R. R., and Rohlf, F. J., 1981, Biometry: The Principles and Practice of
Statistics in Biological Research. Second Edition. New York: W. H. Freeman.

Stoline, M., and Cook, R., 1986, "A Study of Statistical Aspects of the Love Canal
Environmental Monitoring Study," American Statistician, Vol. 40, No. 2: 172-

Switzer, P., 1983, When Will a Pollutant Standard Be Exceeded: Model Prediction and
Uncertainty. Technical Report No. 67. New Canaan, Ct: SIMS, [January

Temple, Barker, and Sloan, Inc. 1986, Case Study on Risk Management. EPA
Workshop on Risk Management. Easton, Md., April 13-14, 1986.

Tornqvist, L., 1963, "Theory of Replicated Systematic Cluster Sampling With Random
Start," Review of the International Statistical Institute, Vol. 31, No. 1: 11-23.

Tukey, J. W., 1977, Exploratory Data Analysis. Reading, Mass.: Addison-Wesley.

U.S. Congress. Office of Technical Assessment, 1985, Superfund Strategy.
Washington, D.C.: G.P.O.

U.S. Department of Energy, 1985a, How Clean is Clean: A Review of Superfund
Cleanups. Washington D. C. (GJ/TMC-08-ED.2).

U.S. Department of Energy, 1985b, Procedures for Collections and Preservation of
Groundwater and Surface Water Samples and For the Installation of Monitoring
Wells. Washington D. C. (CONF-87 1075-21).
U.S. Environmental Protection Agency, 1982, The Handbook for Sampling and
Sample Preservation of Water and Wastewater. Washington, D. C., September
1982 (EPA-600/4-82-029).

U.S. Environmental Protection Agency, 1984, Sampling Procedures for Ground
Water Quality Investigations. Washington D.C, May 1984 (EPA-600/D-84-
137).

U.S. Environmental Protection Agency, 1985a, Data Quality Objectives for the
RI/FS Process: Accuracy Testing Definitions, Appendix F [draft].
Washington, D.C., [November 5, 1985].

U.S. Environmental Protection Agency, 1985b, EPA Guide for Minimizing the
Adverse Environmental Effects of Cleanup of Uncontrolled Hazardous Wastes
Sites. Washington, D.C., June 1985 (EPA/600/8-85/008).

U.S. Environmental Protection Agency, 1986a, Guidance Document for Cleanup
of Surface Impoundment Sites. Washington D. C., June 1986.

U.S. Environmental Protection Agency, 1986b, Resource Conservation and
Recovery Act (RCRA) Ground-Water Monitoring Technical Enforcement
Guidance Document. Washington D.C., September 1986 (OSWER-9950.1).

U.S. Environmental Protection Agency, 1986c, Superfund Public Health
Evaluation Manual. Washington D. C.: EPA [October 1986].

U.S. Environmental Protection Agency, 1987a, Data Quality Objectives For
Remedial Response Activities, Development Process. Washington D. C.,
March 1987 (EPA 540/G-87/003).

U.S. Environmental Protection Agency, 1987b, Data Quality Objectives For
Remedial Response Activities, Example Scenario: RI/FS Activities at a Site with
Contaminated Soils and Ground Water. Washington D. C., March 1987 (EPA
540/G-87/004).

U.S. Environmental Protection Agency, 1987c, EPA Journal, The New
Superfund: Protecting People and the Environment. Washington D. C., Vol.
13, No. 1.

U.S. Environmental Protection Agency, 1987d, Surface Impoundment Clean
Closure Guidance Manual [draft]. Washington D. C., March 1987.

U.S. Environmental Protection Agency, 1987e, Using Models in Ground-Water
Protection Programs. Washington D.C., January 1987 (EPA/600/8-87/003).

U.S. Environmental Protection Agency, 1988, Guidance on Remedial Actions for
Contaminated Ground Water at Superfund Sites [Interim Final]. Washington
D.C.
U.S. Environmental Protection Agency, 1989a, Methods for Evaluating the
Attainment of Cleanup Standards, Volume 1: Soils and Solid Media. Office of
Policy, Planning, and Evaluation, Washington, D.C., February 1989 (EPA
230/02-89-042).

U.S. Environmental Protection Agency, 1989b, Statistical Analysis of Ground-
Water Monitoring Data at RCRA Facilities. Office of Solid Waste,
Washington, D.C., April 1989

van Belle, G., and Hughes, J. P., 1984, "Nonparametric Tests for Trend in Water
Quality," Water Resources Research, Vol. 20, No. 1: 127-136.

Wald, A., 1947, Sequential Analysis. New York: Dover Publications.

Ward, C., Loftis, J., Nielsen, K., and Anderson, R., 1979, "Statistical
Evaluation of Sampling Frequencies in Monitoring Networks," Journal-Water
Pollution Control Federation, Vol. 51, No. 9: 2292-2300.

Wesolowsky, G.O., 1976, Multiple Regression and Analysis of Variance. New York:
John Wiley.

Wetherill, G. B., 1975, Sequential Methods in Statistics. New York: Halsted Press.

Wilson, J., 1982, Ground Water: A Non-Technical Guide. Academy of Natural
Sciences, Philadelphia, Pa.

Wolter, K. M., 1984, "Investigation of Some Estimators of Variance for Systematic
Sampling," Journal of the American Statistical Association, Vol. 79, No.388:
781-790.

Wolter, Kirk M., 1985, Introduction to Variance Estimation. New York: Springer-
Verlag.

Wood, E. F., Ferrara, R. A., Gray, W. G., and Pinder, G. F., 1984,
Groundwater Contamination From Hazardous Wastes. Englewood Cliffs, New
Jersey: Prentice-Hall.
BIB-11
-------
APPENDIX A: STATISTICAL TABLES
Table A.1 Tables of t for selected alpha and degrees of freedom
Use alpha to determine which column to use based on the desired parameter.
Use the degrees of freedom to determine which row to use. The t value will be found at the
intersection of the row and column. For values of degrees of freedom not in the table, interpolate
between those values provided.

When determining t(1-alpha, Df) for a specified alpha:
                .25      .10      .05     .025      .01     .005    .0025     .001
When determining t(1-alpha/2, Df) for a specified alpha:
                .50      .20      .10      .05      .02      .01     .005     .002

Degrees of
Freedom
Df
1             1.000    3.078    6.314   12.706   31.821   63.657  127.321  318.309
2             0.816    1.886    2.920    4.303    6.965    9.925   14.089   22.327
3             0.765    1.638    2.353    3.182    4.541    5.841    7.453   10.215
4             0.741    1.533    2.132    2.776    3.747    4.604    5.598    7.173
5             0.727    1.476    2.015    2.571    3.365    4.032    4.773    5.893
6             0.718    1.440    1.943    2.447    3.143    3.707    4.317    5.208
7             0.711    1.415    1.895    2.365    2.998    3.499    4.029    4.785
8             0.706    1.397    1.860    2.306    2.896    3.355    3.833    4.501
9             0.703    1.383    1.833    2.262    2.821    3.250    3.690    4.297
10            0.700    1.372    1.812    2.228    2.764    3.169    3.581    4.144
11            0.697    1.363    1.796    2.201    2.718    3.106    3.497    4.025
12            0.695    1.356    1.782    2.179    2.681    3.055    3.428    3.930
13            0.694    1.350    1.771    2.160    2.650    3.012    3.372    3.852
14            0.692    1.345    1.761    2.145    2.624    2.977    3.326    3.787
15            0.691    1.341    1.753    2.131    2.602    2.947    3.286    3.733
16            0.690    1.337    1.746    2.120    2.583    2.921    3.252    3.686
17            0.689    1.333    1.740    2.110    2.567    2.898    3.222    3.646
18            0.688    1.330    1.734    2.101    2.552    2.878    3.197    3.610
19            0.688    1.328    1.729    2.093    2.539    2.861    3.174    3.579
20            0.687    1.325    1.725    2.086    2.528    2.845    3.153    3.552
21            0.686    1.323    1.721    2.080    2.518    2.831    3.135    3.527
22            0.686    1.321    1.717    2.074    2.508    2.819    3.119    3.505
23            0.685    1.319    1.714    2.069    2.500    2.807    3.104    3.485
24            0.685    1.318    1.711    2.064    2.492    2.797    3.091    3.467
25            0.684    1.316    1.708    2.060    2.485    2.787    3.078    3.450
26            0.684    1.315    1.706    2.056    2.479    2.779    3.067    3.435
27            0.684    1.314    1.703    2.052    2.473    2.771    3.057    3.421
28            0.683    1.313    1.701    2.048    2.467    2.763    3.047    3.408
29            0.683    1.311    1.699    2.045    2.462    2.756    3.038    3.396
30            0.683    1.310    1.697    2.042    2.457    2.750    3.030    3.385
40            0.681    1.303    1.684    2.021    2.423    2.704    2.971    3.307
60            0.679    1.296    1.671    2.000    2.390    2.660    2.915    3.232
120           0.677    1.289    1.658    1.980    2.358    2.617    2.860    3.160
400           0.675    1.284    1.649    1.966    2.336    2.588    2.823    3.111
infinite      0.674    1.282    1.645    1.960    2.326    2.576    2.807    3.090
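
The instruction for Table A.1 to interpolate between tabulated degrees of freedom can be sketched as follows. This is a minimal illustration, assuming simple linear interpolation in Df and using only four rows of the one-sided alpha = .05 column; the names `T_05` and `interpolate_t` are hypothetical and do not come from the guidance document.

```python
# Selected rows of Table A.1, one-sided alpha = .05 column (t for 1-alpha, Df).
T_05 = {30: 1.697, 40: 1.684, 60: 1.671, 120: 1.658}


def interpolate_t(df, table=T_05):
    """Linearly interpolate t between the two tabulated Df values that bracket df.

    Assumption: straight-line interpolation in Df is adequate for this sketch.
    """
    if df in table:
        return table[df]
    keys = sorted(table)
    if df < keys[0] or df > keys[-1]:
        raise ValueError("df is outside the rows included in this sketch")
    lower = max(k for k in keys if k < df)
    upper = min(k for k in keys if k > df)
    fraction = (df - lower) / (upper - lower)
    return table[lower] + fraction * (table[upper] - table[lower])


# Df = 35 lies halfway between the tabulated rows for 30 and 40.
t_35 = interpolate_t(35)
print(round(t_35, 4))
```

Exact tabulated rows are returned unchanged, so the function can be used for any Df within the range of rows supplied.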
A-l
-------
APPENDIX A: STATISTICAL TABLES
Table A.2 Tables of z for selected alpha
Use alpha to determine which column to read. Use the desired parameter, z(1-alpha) or z(1-alpha/2), to
determine which row to use. Read the z value at the intersection of the row and column.

z(1-alpha)      0.674    0.842    1.282    1.645    1.960    2.326    2.576    2.807    3.090
z(1-alpha/2)    1.150    1.282    1.645    1.960    2.326    2.576    2.807    3.090    3.29
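
Values of z for an alpha that is not tabulated in Table A.2 can be computed directly from the standard normal distribution. The sketch below uses the Python standard library; the function names `z_one_sided` and `z_two_sided` are hypothetical labels for z(1-alpha) and z(1-alpha/2).

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_one_sided(alpha):
    """Return z(1-alpha), the point with upper-tail probability alpha."""
    return STANDARD_NORMAL.inv_cdf(1.0 - alpha)


def z_two_sided(alpha):
    """Return z(1-alpha/2), the point with upper-tail probability alpha/2."""
    return STANDARD_NORMAL.inv_cdf(1.0 - alpha / 2.0)


# Values used in the Appendix B example: alpha = .1 and beta = .2.
print(round(z_one_sided(0.10), 3))
print(round(z_one_sided(0.20), 3))
```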
A-2
-------
APPENDIX A: STATISTICAL TABLES
Table A.3 Tables of k for selected alpha, P0, and sample size for use in a tolerance interval test
Use alpha to determine which table to read. The value k is found at the intersection of the column
with the specified P0 and the row with the sample size n. When testing tolerance intervals, let
T = x̄ + ks. If T is less than the cleanup standard, the sample area attains the cleanup standard
based on the statistical test.

Alpha = 0.10 (i.e., 10%)
                             P0
n              0.25      0.1     0.05     0.01
2             5.842   10.253   13.090   18.500
3             2.603    4.258    5.311    7.340
4             1.972    3.188    3.957    5.438
5             1.698    2.742    3.400    4.666
6             1.540    2.494    3.092    4.243
7             1.435    2.333    2.894    3.972
8             1.360    2.219    2.754    3.783
9             1.302    2.133    2.650    3.641
10            1.257    2.066    2.568    3.532
11            1.219    2.011    2.503    3.443
12            1.188    1.966    2.448    3.371
13            1.162    1.928    2.402    3.309
14            1.139    1.895    2.363    3.257
15            1.119    1.867    2.329    3.212
16            1.101    1.842    2.299    3.172
17            1.085    1.819    2.272    3.137
18            1.071    1.800    2.249    3.105
19            1.058    1.782    2.227    3.077
20            1.046    1.765    2.208    3.052
21            1.035    1.750    2.190    3.028
22            1.025    1.737    2.174    3.007
23            1.016    1.724    2.159    2.987
24            1.007    1.712    2.145    2.969
25            1.000    1.702    2.132    2.952
26            0.992    1.691    2.120    2.937
27            0.985    1.682    2.109    2.922
28            0.979    1.673    2.099    2.909
29            0.973    1.665    2.089    2.896
30            0.967    1.657    2.080    2.884
35            0.942    1.624    2.041    2.833
40            0.923    1.598    2.010    2.793
50            0.894    1.559    1.965    2.735
70            0.857    1.511    1.909    2.662
100           0.825    1.470    1.861    2.601
200           0.779    1.411    1.793    2.514
500           0.740    1.362    1.736    2.442
infinity      0.674    1.282    1.645    2.326
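
The tolerance interval test described for Table A.3 (compute T = x̄ + ks and compare T with the cleanup standard) can be sketched as follows. The ten measurements and the cleanup standard of 100 are hypothetical values chosen only for illustration; k = 2.066 is the Table A.3 entry for alpha = 0.10, P0 = 0.1, and n = 10. The function name `tolerance_limit` is an assumption, not a name from the guidance.

```python
from statistics import mean, stdev


def tolerance_limit(measurements, k):
    """Return T = sample mean + k * sample standard deviation."""
    return mean(measurements) + k * stdev(measurements)


# Hypothetical contaminant measurements (n = 10), for illustration only.
measurements = [62, 71, 58, 66, 74, 60, 69, 65, 63, 72]
K_ALPHA10_P10_N10 = 2.066   # Table A.3: alpha = 0.10, P0 = 0.1, n = 10
CLEANUP_STANDARD = 100      # hypothetical cleanup standard

T = tolerance_limit(measurements, K_ALPHA10_P10_N10)
attains = T < CLEANUP_STANDARD
print(round(T, 3), attains)
```

The value of k must be read from the table that matches the chosen alpha, P0, and the actual number of measurements; a different n requires a different k.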
A-3
-------
APPENDIX A: STATISTICAL TABLES
Table A.3 Tables of k for selected alpha, P0, and sample size for use in a tolerance interval test
(Continued)
Alpha = 0.05 (i.e., 5%)
                             P0
n              0.25      0.1     0.05     0.01
2            11.763   20.581   26.260   37.094
3             3.806    6.155    7.656   10.553
4             2.618    4.162    5.144    7.042
5             2.150    3.407    4.203    5.741
6             1.895    3.006    3.708    5.062
7             1.732    2.755    3.399    4.642
8             1.618    2.582    3.187    4.354
9             1.532    2.454    3.031    4.143
10            1.465    2.355    2.911    3.981
11            1.411    2.275    2.815    3.852
12            1.366    2.210    2.736    3.747
13            1.328    2.155    2.671    3.659
14            1.296    2.109    2.614    3.585
15            1.268    2.068    2.566    3.520
16            1.243    2.033    2.524    3.464
17            1.220    2.002    2.486    3.414
18            1.201    1.974    2.453    3.370
19            1.183    1.949    2.423    3.331
20            1.166    1.926    2.396    3.295
21            1.152    1.905    2.371    3.263
22            1.138    1.886    2.349    3.233
23            1.125    1.869    2.328    3.206
24            1.114    1.853    2.309    3.181
25            1.103    1.838    2.292    3.158
26            1.093    1.824    2.275    3.136
27            1.083    1.811    2.260    3.116
28            1.075    1.799    2.246    3.098
29            1.066    1.788    2.232    3.080
30            1.058    1.777    2.220    3.064
35            1.025    1.732    2.167    2.995
40            0.999    1.697    2.125    2.941
50            0.960    1.646    2.065    2.862
70            0.911    1.581    1.990    2.765
100           0.870    1.527    1.927    2.684
200           0.809    1.450    1.837    2.570
500           0.758    1.385    1.763    2.475
infinity      0.674    1.282    1.645    2.326
A-4
-------
APPENDIX A: STATISTICAL TABLES
Table A.3 Tables of k for selected alpha, P0, and sample size for use in a tolerance interval test
(Continued)
Alpha = 0.01 (i.e., 1%)
                             P0
n              0.25      0.1     0.05     0.01
2            58.939  103.029  131.426   185.61
3             8.728   13.995   17.370   23.896
4             4.715    7.380    9.083   12.387
5             3.454    5.362    6.578    8.939
6             2.848    4.411    5.406    7.335
7             2.491    3.859    4.728    6.412
8             2.253    3.497    4.258    5.812
9             2.083    3.240    3.972    5.389
10            1.954    3.048    3.738    5.074
11            1.853    2.898    3.556    4.829
12            1.771    2.777    3.410    4.633
13            1.703    2.677    3.290    4.472
14            1.645    2.593    3.189    4.337
15            1.595    2.521    3.102    4.222
16            1.552    2.459    3.028    4.123
17            1.514    2.405    2.963    4.037
18            1.481    2.357    2.905    3.960
19            1.450    2.314    2.854    3.892
20            1.423    2.276    2.808    3.832
21            1.399    2.241    2.766    3.777
22            1.376    2.209    2.729    3.727
23            1.355    2.180    2.694    3.681
24            1.336    2.154    2.662    3.640
25            1.319    2.129    2.633    3.601
26            1.303    2.105    2.606    3.566
27            1.287    2.085    2.581    3.533
28            1.273    2.065    2.558    3.502
29            1.260    2.047    2.536    3.473
30            1.247    2.030    2.515    3.447
35            1.195    1.957    2.430    3.334
40            1.154    1.902    2.364    3.249
50            1.094    1.821    2.269    3.125
70            1.020    1.722    2.153    2.974
100           0.957    1.639    2.056    2.850
200           0.868    1.524    1.923    2.679
500           0.794    1.430    1.814    2.540
infinity      0.674    1.282    1.645    2.326
A-5
-------
APPENDIX A: STATISTICAL TABLES
Table A.4 Recommended number of samples per seasonal period (np) to minimize total cost
for assessing attainment

Cost ratio $R
(Yearly cost /     Estimated Lag 1 serial correlation between monthly observations
Sample cost)      0.05   0.1  0.15   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
1                    8     7     6     5     4     4     4     4     4     4     4
2                   10     8     7     6     5     4     4     4     4     4     4
5                   12    10     9     8     6     5     4     4     4     4     4
10                  15    12    10     9     8     6     5     4     4     4     4
20                  18    15    13    11     9     8     6     5     4     4     4
50                  23    20    17    15    13    10     9     7     6     4     4
100                 30    24    21    19    16    13    11     9     7     5     4
200                 36    30    26    24    20    16    14    11     9     6     4
1000                61    52    46    40    34    28    23    19    15    11     7
2000                73    61    61    52    40    36    30    24    19    14     8
5000                91    91    73    73    61    46    40    32    25    19    11
10000              183    91    91    91    73    61    52    40    32    23    14
A-6
-------
APPENDIX A: STATISTICAL TABLES
Table A.5 Variance factors F for determining sample size

Samples
per year or
seasonal       Estimated Lag 1 serial correlation between monthly observations
period          0.05    0.1   0.15    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
4               4.00   3.99   3.97   3.94   3.80   3.53   3.13   2.61   1.99   1.31   0.64
5               4.99   4.96   4.90   4.80   4.49   4.03   3.44   2.77   2.05   1.33   0.64
6               5.97   5.89   5.75   5.55   5.04   4.38   3.64   2.87   2.09   1.34   0.64
7               6.92   6.74   6.50   6.19   5.46   4.64   3.78   2.93   2.11   1.35   0.64
8               7.83   7.53   7.15   6.73   5.80   4.83   3.88   2.97   2.13   1.35   0.64
9               8.69   8.23   7.71   7.17   6.05   4.97   3.95   3.00   2.14   1.35   0.64
10              9.48   8.85   8.19   7.53   6.26   5.08   4.00   3.03   2.15   1.36   0.64
11             10.22   9.40   8.60   7.83   6.42   5.16   4.04   3.04   2.15   1.36   0.64
12             10.89   9.88   8.95   8.09   6.55   5.23   4.07   3.06   2.16   1.36   0.64
13             11.51  10.30   9.24   8.30   6.66   5.28   4.09   3.07   2.16   1.36   0.64
14             12.07  10.67   9.50   8.47   6.75   5.32   4.11   3.07   2.16   1.36   0.64
15             12.57  11.00   9.72   8.62   6.82   5.35   4.13   3.08   2.17   1.36   0.64
16             13.03  11.28   9.90   8.75   6.88   5.38   4.14   3.09   2.17   1.36   0.64
17             13.44  11.53  10.07   8.86   6.93   5.41   4.15   3.09   2.17   1.36   0.64
18             13.81  11.75  10.21   8.96   6.97   5.43   4.16   3.09   2.17   1.36   0.64
19             14.15  11.95  10.33   9.04   7.01   5.45   4.17   3.10   2.17   1.36   0.64
20             14.45  12.12  10.44   9.11   7.05   5.46   4.18   3.10   2.17   1.36   0.64
21             14.72  12.27  10.54   9.17   7.07   5.47   4.18   3.10   2.17   1.36   0.64
22             14.97  12.41  10.62   9.23   7.10   5.49   4.19   3.10   2.18   1.36   0.64
23             15.20  12.53  10.70   9.28   7.12   5.50   4.19   3.11   2.18   1.36   0.64
24             15.41  12.65  10.77   9.32   7.14   5.50   4.20   3.11   2.18   1.36   0.64
25             15.59  12.75  10.83   9.36   7.16   5.51   4.20   3.11   2.18   1.36   0.64
26             15.76  12.84  10.88   9.39   7.17   5.52   4.20   3.11   2.18   1.36   0.64
28             16.06  12.99  10.98   9.45   7.20   5.53   4.21   3.11   2.18   1.36   0.64
30             16.32  13.12  11.05   9.50   7.22   5.54   4.21   3.11   2.18   1.36   0.64
32             16.53  13.23  11.12   9.54   7.24   5.55   4.22   3.12   2.18   1.36   0.64
34             16.71  13.32  11.17   9.58   7.25   5.56   4.22   3.12   2.18   1.36   0.64
36             16.87  13.40  11.21   9.60   7.26   5.56   4.22   3.12   2.18   1.36   0.64
40             17.13  13.52  11.29   9.65   7.28   5.57   4.22   3.12   2.18   1.36   0.64
46             17.40  13.66  11.36   9.70   7.30   5.58   4.23   3.12   2.18   1.36   0.64
52             17.59  13.75  11.42   9.73   7.32   5.58   4.23   3.12   2.18   1.37   0.64
61             17.79  13.84  11.47   9.76   7.33   5.59   4.23   3.12   2.18   1.37   0.64
73             17.95  13.91  11.51   9.79   7.34   5.60   4.24   3.12   2.18   1.37   0.64
91             18.08  13.98  11.54   9.81   7.35   5.60   4.24   3.12   2.18   1.37   0.64
183            18.27  14.06  11.59   9.84   7.36   5.60   4.24   3.13   2.18   1.37   0.64
365            18.31  14.08  11.60   9.85   7.37   5.61   4.24   3.13   2.18   1.37   0.64
A-7
-------
APPENDIX B: EXAMPLE WORKSHEETS

The worksheets in this appendix have been completed to serve as an example in
understanding the forms and making the necessary calculations.

Please note that to maintain adequate precision in doing the computations appearing
in the worksheets, (particularly in the calculations of estimated variances, standard deviations, or
standard errors), the number of decimal places retained should be as high as possible, with a
minimum of four.

A Scenario

To help understand how to use the worksheets provided, a scenario has been
constructed with associated data concerning a site for which a cleanup effort has been undertaken.
In order that undue time is not spent on data manipulation and data entry, parameters were set in
such a way that the number of years for which data needed to be collected in the example was kept
artificially low. For example, in Worksheet 3, α and β were set higher than will generally be the
case in practice while μ1 and σ were set relatively low. As a consequence, the number of years
required for a fixed sample size test was limited to three years, which is highly unlikely to be the
case in practice.

The scenario involves a Superfund site with a treatment well and 5 monitoring
wells. Two of the monitoring wells are close to the source of contamination and have been
monitored individually (involving Worksheets 2 through 7b). The remaining three wells are
relatively far from the source of contamination and have been analyzed as a group (Worksheets 8
through 14b). Two chemicals were of interest in monitoring for cleanup. The example
worksheets have been provided for one of the two chemicals for one of the two wells being
monitored individually and for the group of three wells. For illustrative purposes, for the single
well being examined, both a fixed sample test and a sequential test have been carried out.
However, in practice, a decision would be made beforehand about which of the two approaches
would be used, and only that test would be employed. It is interesting to note that, for the example
data set, it turns out that the fixed sample size test indicates that the site is clean while the sequential
test indicates that more data are needed before a decision can be reached. On average, the
sequential test will yield a result more quickly, but since the parameters were specified so as to
require only three years for the fixed sample test, which is the minimum amount of time required
for a sequential test, it is not altogether surprising that a decision could not be made via the
sequential test.

Worksheets 15 and 16 have been filled out with data independent of the five well
example. They were used simply to indicate how a serial correlation could be estimated via the
worksheets. The number of observations on which the estimated serial correlation is based,
twelve, is fewer than should normally be used in practice.

The number of samples per year used in the example was six. Note that in
Worksheet 3 the estimated serial correlation between monthly data was .2, so that the correlation
between observations obtained two months apart would be estimated to be (.2)^2 = .04.
Since .04 represents a rather low correlation between observations, data could be reasonably
gathered on a bimonthly schedule without great concern about a lack of independence between
observations.
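
The calculation above can be written as a short sketch. It assumes, as the example does, that the correlation between observations t months apart equals the monthly lag 1 serial correlation raised to the power t; the function name `correlation_at_spacing` is hypothetical.

```python
def correlation_at_spacing(monthly_lag1, months_apart):
    """Correlation between observations taken months_apart months apart.

    Assumption: correlation decays geometrically with the monthly lag 1 value.
    """
    return monthly_lag1 ** months_apart


MONTHS_PER_YEAR = 12
samples_per_year = 6                                   # bimonthly sampling, as in the example
months_apart = MONTHS_PER_YEAR // samples_per_year     # 2 months between samples

rho = correlation_at_spacing(0.2, months_apart)
print(round(rho, 4))
```

With quarterly sampling (four samples per year) the same monthly correlation of .2 would give (.2)^3 = .008 under this assumption.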

Worksheets 1R and 2R present the computation of regression coefficients and
related tests of significance using the three sample means obtained during the three years of data
collection for the test of the single well to serve as the three data observations from which a linear
model was to be constructed. Since the fixed sample test indicated that the cleanup effort was
successful, it is desirable to examine the trend of the data over time to make sure that there is no
evidence that the cleanup standard could be exceeded in the future. This could be indicated by
evidence of a statistically significant positive slope for the sample data (in this case, the three yearly
averages). Three observations is a rather small sample on which to base such decisions, but again
the chief purpose of these example worksheets is illustrative. The reader can more quickly
determine how the regression estimates were computed using a small data set. In practice, it is
quite likely that the number of years' worth of data resulting in a decision that the site is clean will
exceed three by several years.
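
The kind of trend check described above can be sketched with ordinary least squares on the yearly means. In the sketch below, only the first yearly mean (90.17) is taken from the example worksheets; the second and third values are hypothetical, as are the function and variable names. The critical value 6.314 is the Table A.1 entry for one-sided alpha = .05 with 1 degree of freedom.

```python
import math


def fit_trend(years, means):
    """Least-squares slope, intercept, standard error of slope, and Df."""
    m = len(years)
    x_bar = sum(years) / m
    y_bar = sum(means) / m
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, means))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, means))
    df = m - 2                                  # degrees of freedom for the residuals
    se_slope = math.sqrt(sse / df / sxx)
    return slope, intercept, se_slope, df


years = [1, 2, 3]
yearly_means = [90.17, 85.0, 82.0]   # last two values are hypothetical
slope, intercept, se_slope, df = fit_trend(years, yearly_means)

t_statistic = slope / se_slope
T_CRITICAL = 6.314                   # Table A.1, alpha = .05, Df = 1
significant_positive_slope = t_statistic > T_CRITICAL
print(round(slope, 3), df, significant_positive_slope)
```

With only three yearly means there is a single degree of freedom, so the critical value is very large and the test has little power; this mirrors the caution in the text about basing decisions on three observations.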
B-2
-------
APPENDIX B: EXAMPLE WORKSHEETS
Table B.1 Summary of Notation Used in Appendix B

Symbol      Definition
m           The number of years for which data were collected (usually the
            analysis will be performed with full years' worth of data)
n           The number of sample measurements per year (for monthly data, n
            = 12; for quarterly data, n = 4). This is also referred to as the
            number of "seasons" per year
N           The total number of sample measurements (if there are no missing
            observations, N = mn)
index i     Indicates the order in which the ground-water samples are collected
index k     Indicates the year in which the ground-water samples are collected
index j     Indicates the season or time within the year at which the
            ground-water samples are collected
index c     Indicates the chemical analyzed
index w     Indicates the well sampled
x_i         Contaminant measurement for the ith ground-water sample
x_jk        An alternative way of denoting a contaminant measurement, where k
            = 1, 2, ..., m denotes the year, and j = 1, 2, ..., n denotes the
            sampling period (season) within the year. The subscript for x_jk is
            related to the subscript for x_i in the following manner: i = (k-1)n + j.
x̄_k         The mean (or average) of the contaminant measurements for year k
            (see Boxes 8.5 and 9.4)
            The mean of the yearly averages for years k = 1 to m.
            The standard deviation of the yearly average contaminant
            concentrations from m years of sample collection (see Boxes 8.7
            and 9.6)
            The standard error of the mean of the yearly means (see Boxes 8.9
            and 9.8)
Cs          The designated cleanup standard
Df          The degrees of freedom associated with the standard error of an
            estimate (see Boxes 8.7 and 9.6)
d_i         The distance of the monitoring well from the treatment well
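
The relation i = (k-1)n + j between the sequential sample index and the year and season indices can be sketched as a pair of helper functions. The function names are hypothetical; indices start at 1, as in the notation table.

```python
def sample_index(k, j, n):
    """Sequential sample number i for year k and season j, with n samples per year."""
    return (k - 1) * n + j


def year_and_season(i, n):
    """Inverse mapping: return (k, j) for sequential sample number i."""
    k = (i - 1) // n + 1
    j = (i - 1) % n + 1
    return k, j


# With n = 6 samples per year, the third sample of year 2 is sample number 9.
print(sample_index(2, 3, 6))
print(year_and_season(9, 6))
```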
B-3
-------
APPENDIX B: EXAMPLE WORKSHEETS

WORKSHEET 1 Sampling Wells

See Section 3.2 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE: Site ABC

Sample
Well
Number
w
1          monitoring well d1 feet northeast of treatment well
2          monitoring well d2 feet west of treatment well
3          monitoring well d3 feet north of treatment well
4          monitoring well d4 feet southwest of treatment well
5          monitoring well d5 feet southeast of treatment well

Wells 1 and 2 will be assessed individually.
Wells 3, 4, and 5 will be assessed as a group.

Decision Criteria: Wells assessed (Checked one) Individually ED As a Group d

Use the Sampling Well Number (w) to refer on subsequent sheets to the sampling wells described
above.

Attach a map showing the sampling wells within the waste site.

Date Completed: EXAMPLE Completed by EXAMPLE

Use additional sheets if necessary. Page of

Continue to WORKSHEET 2 if wells are assessed individually.
Continue to WORKSHEET 8 if wells are assessed as a group.
B-4
-------
APPENDIX B: EXAMPLE WORKSHEETS

WORKSHEET 2 Attainment Objectives for Assessing Individual Wells
See Chapter 3 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE: Site ABC
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

(For purposes of illustration, both methods will be used.)
Sample Design (Check one): Fixed Sample Size [X] Sequential Sampling [X]
Probability of mistakenly declaring the well(s) clean = α = .1

Probability of mistakenly declaring the well(s) contaminated = β = .2
                                                        If Mean, enter   If %tile, enter critical
                          Cleanup                       alternate        proportion for null/alternate
Chemical                  Standard       Parameter      hypothesis       hypothesis
Number    Chemical        (with units)   to test        mean             null      alternate
c         Name            Cs             (check one)    μ1               P0        P1

1         Hazardous #1    100            Mean [X]       75
                                         %tile [ ]
2         Hazardous #2    60             Mean [X]       30
                                         %tile [ ]

Sample Collection Procedures to be used (attach separate sheet if necessary):
Not specified for this example
Secondary Objectives/Other purposes for which the data is to be collected
Use the Chemical Number (c) to refer on other sheets to the chemical described above.
Attach documentation describing the lab analysis procedure for each chemical.

Date Completed: EXAMPLE Completed by EXAMPLE

Use additional sheets if necessary. Page __ of __

Continue to WORKSHEET 3 if a fixed sample size test is used; or
Continue to WORKSHEET 4 if a sequential sample test is used.
B-5
-------
APPENDIX B: EXAMPLE WORKSHEETS
WORKSHEET 3 Sample Size When Using a Fixed Sample Size Test for Assessing Individual Wells
See Section 8.2 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE: Site ABC
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Probability of mistakenly declaring the site clean [2] = α = .1
    From Table A.2, Appendix A: z(1-α) = 1.282
Probability of mistakenly declaring the site contaminated [2] = β = .2
    From Table A.2, Appendix A: z(1-β) = .842
Number of samples per year = n = 6 (based on calculations described in Section 8.2)
Variance factor from Table A.5, Appendix A = F = 5.55 (see footnote 1)

For testing the mean concentration

Chemical     Cleanup                  Standard Deviation    Calculate:
Number [2]   Standard [2]   [2]       of yearly mean        A = ((Cs - μ1)/(z(1-β) + z(1-α)))^2
c            Cs             μ1        σ̂                     B = σ̂^2/(A F) + 2

                                                            A           B
1            100            75        23                    138.53      2.69
2            60             30        6                     199.50      2.03

For testing the proportion of contaminated wells or samples
Chemical Cleanup Calculate:
Number [2] Standard[2] [2] [2] B
c Cs , Pn Pi
B
• (l'?l\. "d'FCPn-Pi)2
Zi^VPo(l-Po))2 ( ° 1}
Column Maximum (maximum of the values in column B) = C = 2.69
Round C to next largest integer = Number of years of sample collection = m = 3
Total number of samples = nm = N = 18
Date Completed: EXAMPLE Completed by EXAMPLE
Use additional sheets if necessary. Page __ of __
Continue to WORKSHEET 4

1 An estimate of φ, the serial correlation, is necessary to determine the appropriate value of F. Worksheets 15 and
16 can be used to estimate φ. φ = .2 was assumed for this example.
-------
APPENDIX B: EXAMPLE WORKSHEETS

WORKSHEET 4 Data Records and Calculations When Assessing Individual Wells; by Chemical, Well,
and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [2]: 1. Hazardous #1
WELL: NUMBER(w) AND DESCRIPTION [1]: #1. d1 ft northeast of treatment well
YEAR: NUMBER(k): 1988, k = 1
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [X] Sequential Sampling [X]
For purposes of illustration, both methods are used.

Parameter to be tested [2] (Check one): Mean [X] %tile [ ]
Number of samples per year [3] = n = 6
Number of samples with nonmissing data in year = nk = 6
Cleanup standard [2] = Cs = 100
Concentration used for observations below the detection limit = 10

"Season"                                            Concentration     Is A greater    Data for
number j               Sample           Reported    corrected for     than Cs?        analysis
within this   Sample   collection       concen-     detection limit   1 = Yes         xjk = A if Mean
kth year      ID       date/time        tration     A                 0 = No          xjk = B if %tile
                                                                      B
1             11       Feb. 18, '88     88          88                                88
2             21       April 12, '88    123         123                               123
3             31       June 16, '88     98          98                                98
4             41       Aug. 15, '88     78          78                                78
5             51       Oct. 12, '88     89          89                                89
6             61       Dec. 11, '88     65          65                                65

Total of xjk for this year = C = 541
Mean of xjk for this kth year = C/nk = x̄k = 90.17

Date Completed: EXAMPLE Completed by EXAMPLE

Use additional sheets if necessary. Page __ of 3

Complete WORKSHEET 4 for other chemicals, years, and wells; otherwise,
Continue to WORKSHEET 5 if a fixed sample size test is used; or
Continue to WORKSHEET 7 if a sequential sample test is used.
-------

WORKSHEET 4 Data Records and Calculations When Assessing Individual Wells; by Chemical, Well, and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [2]: 1. Hazardous #1
WELL: NUMBER(w) AND DESCRIPTION [1]: 1. di ft. northeast of treatment well
YEAR: NUMBER(k): 1989, k = 2
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [X]  Sequential Sampling [X]
For purposes of illustration, both methods are used.

Parameter to be tested [2] (Check one): Mean [X]  %tile [ ]
Number of samples per year [3] = n =
Number of samples with nonmissing data in year = nk =
Cleanup standard [2] = Cs = 100
Concentration used for observations below the detection limit = 10

"Season"                                                    Concentration      Is A Greater     Data for analysis
Number j                  Sample                            Corrected for      than Cs?         x_jk = A if Mean
within this    Sample     Collection       Reported         Detection Limit    1 = Yes, 0 = No  x_jk = B if %tile
kth year       ID         date/time        Concentration    A                  B
1              12         Feb. 15, '89     89                                                   89
2              22         April 17, '89    72                                                   72
3              32         June 14, '89     105                                                  105
4              42         Aug. 18, '89     77                                                   77
5              52         Oct. 15, '89     63                                                   63
6              62         Dec. 13, '89     92                                                   92

Total of x_jk for this year = C = 498
Mean of x_jk for this kth year = C / nk = x̄_k = 83.00
Date Completed: EXAMPLE    Completed by:
Use additional sheets if necessary. Page __ of __
Complete WORKSHEET 4 for other chemicals, years, and wells; otherwise,
Continue to WORKSHEET 5 if a fixed sample size test is used; or
Continue to WORKSHEET 7 if a sequential sample test is used.
-------

WORKSHEET 4 Data Records and Calculations When Assessing Individual Wells; by Chemical, Well, and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [2]: 1. Hazardous #1
WELL: NUMBER(w) AND DESCRIPTION [1]: 1. di ft. northeast of treatment well
YEAR: NUMBER(k): 1990, k = 3
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [X]  Sequential Sampling [X]
For purposes of illustration, both methods are used.

Parameter to be tested [2] (Check one): Mean [X]  %tile [ ]
Number of samples per year [3] = n =
Number of samples with nonmissing data in year = nk =
Cleanup standard [2] = Cs = 100
Concentration used for observations below the detection limit = 10

"Season"                                                    Concentration      Is A Greater     Data for analysis
Number j                  Sample                            Corrected for      than Cs?         x_jk = A if Mean
within this    Sample     Collection       Reported         Detection Limit    1 = Yes, 0 = No  x_jk = B if %tile
kth year       ID         date/time        Concentration    A                  B
1              13         Feb. 16, '90     71
2              23         April 14, '90    62
3              33         June 14, '90     88
4              43         Aug. 17, '90     43
5              53         Oct. 15, '90     62
6              63         Dec. 15, '90     73

Total of x_jk for this year = C = 399
Mean of x_jk for this kth year = C / nk = x̄_k = 66.50
Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of 3
Complete WORKSHEET 4 for other chemicals, years, and wells; otherwise,
Continue to WORKSHEET 5 if a fixed sample size test is used; or
Continue to WORKSHEET 7 if a sequential sample test is used.
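The yearly totals and means recorded on the three Worksheet 4 pages can be reproduced as follows. This is a minimal sketch with hypothetical names. It assumes a below-detection result is represented as None and replaced by the worksheet's fill value of 10; none of the example values is actually below the detection limit.

```python
# Reported concentrations by year, copied from the three Worksheet 4 pages.
reported = {
    1988: [88, 123, 98, 78, 89, 65],
    1989: [89, 72, 105, 77, 63, 92],
    1990: [71, 62, 88, 43, 62, 73],
}
DETECTION_FILL = 10  # concentration used for observations below the detection limit


def yearly_summary(values, fill=DETECTION_FILL):
    """Return (total C, yearly mean) for one year of data.

    None marks a below-detection result (a convention assumed for this sketch).
    """
    corrected = [fill if v is None else v for v in values]
    total = sum(corrected)
    return total, total / len(corrected)


summary = {year: yearly_summary(values) for year, values in reported.items()}

for year, (total, mean) in summary.items():
    print(f"{year}: C = {total}, yearly mean = {mean:.2f}")
```

The three yearly means are the inputs to Worksheet 5 (fixed sample size test) and Worksheet 7 (sequential test).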
-------
WORKSHEET 5 Data Calculations for a Fixed Sample Size Test When Assessing Individual Wells; by Chemical and Well
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [2]: 1. Hazardous #1
WELL: NUMBER(w) AND DESCRIPTION [1]: 1. di ft. northeast of treatment well
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Year        Mean for the
Number      year [4]
k           x̄_k               (x̄_k)²
1           90.17             8,130.63
2           83.00             6,889.00
3           66.50             4,422.25

Total from previous page
(if more than one Worksheet 5 used)
Column Totals:  A = 239.67    B = 19,441.88

Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of __
Complete WORKSHEET 5 for other chemicals and wells or continue to WORKSHEET 6
-------
WORKSHEET 6 Inference for Fixed Sample Size Tests When Assessing Individual Wells; by Chemical
and Well
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
Site ABC
CHEMICAL:
NUMBER(c) AND DESCRIPTION [2]
1. Hazardous #1
WELL:
NUMBER(w) AND DESCRIPTION [1]
1. di ft. northeast of treatment well
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

[2]
-------
WORKSHEET 7a Data Calculations for a Sequential Sample When Assessing Wells Individually; by Chemical and Well
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [2]: 1. Hazardous #1
WELL: NUMBER(w) AND DESCRIPTION [1]: 1. di ft. northeast of treatment well
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Cleanup standard [2] = Cs = 100
Alternate mean = μ1 = 75
Probability of mistakenly declaring the well(s) clean [2] = α = .1
Probability of mistakenly declaring the well(s) contaminated [2] = β = .2

Year       Yearly     Cumulative              Cumulative                Mean (average of    Standard
Number     Average    Sum of x̄_k              Sum of x̄_k²               yearly averages)    Error of Mean
[4]        [4]        (A_0 = 0)               (B_0 = 0)
k or m     x̄_k        A_k = A_(k-1) + x̄_k     B_k = B_(k-1) + x̄_k²      x̄_m                 s_x̄m
1          90.17      90.17                   8,130.63                  90.1700             -
2          83.00      173.17                  15,019.63                 86.5950             3.4622
3          66.50      239.67                  19,441.88                 79.8900             7.0077

Carry as many significant figures as possible

Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of __
Complete WORKSHEETS 7a and 7b for other chemicals and wells
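The running columns of Worksheet 7a can be built up one year at a time, as a sequential test requires. The sketch below uses hypothetical names and assumes the standard error of the mean is computed as sqrt((B_m - A_m²/m) / (m(m-1))), the usual standard error of the average of m yearly means. That formula is an assumption: it is not legible on the worksheet itself, but for m = 3 it agrees with the printed 7.0077 to about three decimal places.

```python
import math

yearly_means = [90.17, 83.00, 66.50]   # yearly averages from Worksheet 4

rows = []
a = 0.0   # cumulative sum of yearly means, A_0 = 0
b = 0.0   # cumulative sum of squared yearly means, B_0 = 0
for m, x in enumerate(yearly_means, start=1):
    a += x
    b += x * x
    mean = a / m
    # The standard error is undefined until at least two years are available.
    se = None if m == 1 else math.sqrt((b - a * a / m) / (m * (m - 1)))
    rows.append({"m": m, "x": x, "A": a, "B": b, "mean": mean, "se": se})

last = rows[-1]
print(f"m = {last['m']}: A = {last['A']:.2f}, B = {last['B']:.2f}, "
      f"mean = {last['mean']:.4f}, se = {last['se']:.4f}")
```

Keeping A and B as running totals means a new year of data only adds one row; earlier rows never need to be recomputed.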
-------

WORKSHEET 7b Data Calculations for a Sequential Sample When Assessing Wells Individually; by
Chemical and Well
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE: Site ABC
NUMBER(c) AND DESCRIPTION [2]
CHEMICAL: 1. Hazardous #1
NUMBER(w) AND DESCRIPTION [1]
WELL: 1. di ft. northeast of treatment well
Number* in square
Year
Number „
[4] 5 = ^
m Sxm
1
2
3

*LR = ex

-3.5675

bracfceu [] refer U
t*
Sxm

-1.086

(Rm-2 /" ]
P[5 m 'Vm-l+t* J

> the Worksheet horn which the information may be obtained.
Critical Critical Decision:
value: value: clean LR > B,
Likelihood clean contaminated contaminated LR £ A,
ratio R in or no decision
LR* A--^ 8=-^ A
-------

WORKSHEET 8 Attainment Objectives When Assessing Wells as a Group
See Chapter 3 in "Methods for Evaluating the Attainment of Cleanup Standards". Volume 2
SITE:
Site ABC
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [ ]  Sequential Sampling [X]

Probability of mistakenly declaring the well(s) clean = α = .1

Probability of mistakenly declaring the well(s) contaminated = β = .2
Chemical
to be tested
number
Chemical
name
Cleanup Parameter alternate
standard to test: hypoth-
(with units) Check one esis
Cs u.
If mean, If mean,
enter the enter the
alternate
hypoth-
esis
Maxi
1
2

Hazardous #1
Hazardous #2

100
60

MeanGZI
%tileD
MeanES
%tilea
Mean U
Max a
McanU
Max D
75
30

Sample Collection Procedures to be used (attach separate sheet if necessary):
Not specified for this example
Secondary Objectives/Other purposes for which the data is to be collected:
Use the Chemical Number (c) to refer on other sheets to the chemical described above.
Attach documentation describing the lab analysis procedure for each chemical.

Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of __
Continue to WORKSHEET 9 if a fixed sample size test is used; or
Continue to WORKSHEET 10 if a sequential sample test is used.
-------
WORKSHEET 9 Sample Size When Using a Fixed Sample Size Test for Assessing Wells as a Group
See Section 8.2 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
Site ABC
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Probability of mistakenly declaring the site clean [8] = α =

Probability of mistakenly declaring the site contaminated [8] = β =
From Table A.2, Appendix A
Number of samples per year = n = 6
(based on calculations described in Section 8.2)
Variance factor from Table A.5, Appendix A = F¹ =

For testing the mean concentration

Calculate: $B = \left(\frac{Cs - \mu_1}{z_{1-\alpha} + z_{1-\beta}}\right)^2$ and $m_d = \frac{\hat{\sigma}^2}{F \cdot B} + 2$

Chemical      Cleanup                    Standard Deviation
Number [8]    Standard [8]    [8]        of mean
c             Cs              μ1         σ̂                     B          m_d
1             100             75         23                    138.53     2.69
2             60              30         6                     199.50     2.03

For testing the maximum concentration across all wells

Calculate: $B = \left(\frac{Cs - Max_1}{z_{1-\alpha} + z_{1-\beta}}\right)^2$ and $m_d = \frac{\hat{\sigma}^2}{F \cdot B} + 2$

Chemical      Cleanup                    Standard Deviation
Number [8]    Standard [8]    [8]        of yearly mean
c             Cs              Max1       σ̂                     B          m_d

Column Maximum (maximum of m_d values) = C = 2.69
Round C to next largest integer = Number of years of sample collection = m = 3
Total number of samples = nm = N = 18
Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of __
Continue to WORKSHEET 10
¹ An estimate of φ, the serial correlation, is necessary to determine the appropriate value of F. Worksheets 15 and
16 can be used to estimate φ. φ was assumed to be .20 for this example.
-------
WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards". Vol. 2
SITE:
CHEMICAL:
WELL:
YEAR:
Site ABC
NUMBER(c; AND DESCRIPTION 18]
NUMBEK^W) AND DESCRIPTION 11 1
NUMBER(K)
1988,

1. Hazardous #1
3. ds ft north of treatment well
k-1
Numbers in square brackets [J refer to the Worksheet from which the information may be obtained.
Parameter to be tested (Check one) =

Number of samples per year = n =

Concentration used for observations below the detection limit =
MeanE?
MaxD
10
"Season"
Number
J
Sample
Sample Collection
ID time
Reported Concentration
Concen- Corrected for
tration Detection Limit
1
2
3
4
5
6

31
32
33
34
35
36

Feb. 18, '88
Apr. 12, '88
June 16. '88
Aug. IS, '88
Oct. 12, '88
Dec. 11/88

88.71
89.38
74.92
80.03
89.98
91.34

Date Completed: EXAMPLE

Use additional sheets if necessary.
Completed by EXAMPLE
Page_L.of_9_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
-------

WORKSHEET 10 Data Records for an Individual Weil and Calculations When Assessing Wells as a
Group; by Chemical, Well and Year
See Chanter 8 or 9 in "Methods for Evaluating the Attainment of Cleanuo Standards". Vol. 2
SITE:
CHEMICAL:
WELL:
YEAR:
Site ABC
N-UMBER(C) AND DESCRIPTION [8]

NUMBER(w) AND DESCRIPTION [1]
NUMBER(K)
1988,

. Hazardous
4. dj
k

s
1
ft.

#1
southwest

of treatment well

Numbers in square brackets [] refer to (he Worksheet from which the information may be obtained.
Parameter to be tested (Check one)'

Number of samples per year = n:

Concentration used for observations below the detection limit :
Meanla?
MaxD
10
"Season"
Number
j
Sample
Sample Collection
ID time
Reported Concentration
Concen- Corrected for
oration Detection Limit
1
2
3
4
5
6

41
42
43
44
45
46

Feb. 18, '88
Apr. 12, '88
June 16, '88
AUK. 15, '88
OCL 12, '88
Dec. 11, '88

76.50
71.28
93.77
73.60
120.94
82.56

Date Completed: EYAMWP.

Use additional sheets if necessary.
Completed by EXAMPLE
Page_2_of_9_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
-------
WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
Group; by Chemical. Well and Year
See Chanter 8 or 9 in "Methods for Evaluating the Attainment of Cleanun Standards" Vol 2
SITE:
CHEMICAL:
WELL:
YEAR:
Site ABC
NUMBER(C) AND DESCRIn ION (8]
NUMBE1HW) AND DESCRIPTION 11]
NUMBERtK)
1988,

1.
5.
k =

Hazardous #1
ds ft. southeast
1

of treatment well

Numbers in square brackets [J refer to the Worksheet from which the information may be obtained.
Parameter to be tested (Check one) =

Number of samples per year = n =

Concentration used for observations below the detection limit =
Meanl^T
MaxD
10
"Season"
Number
j
Sample
Sample Collection
ID time
Reported Concentration
Concen- Corrected for
nation Detection Limit
1
2
3
4
5
6

51
52
53
54
55
56

Feb. 18, '88
Apr. 12, '88
June 16, '88
Aug. 15, '88
Oct. 12, '88
Dec. 11/88

62.68
92.49
80.94
103.38
95.39
99.04

Date Completed: BXAMPI.E

Use additional sheets if necessary.
Completed by F.YAMPT.F
1 Page_l_of_2_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
-------
WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
Croup; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards". Vol. 2
SITE:
CHEMICAL:
WELL:
YEAR:
Site ABC
NUMBER(C) AND DESCRIPTION [8]

NUMBER(w) AND DESCRIPTION (I J
NUMBER(K)
1989,

1.
3.
k-

Hazardous
d.
2
ft.

north

#1
of treatment

well

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Parameter to be tested (Check one) =

Number of samples per year = n =

Concentration used for observations below the detection limit =
Mean &
MaxD
10
"Season"
Number
J
Sample
Sample Collection
ID time
Reported Concentration
Concen- Corrected for
nation Detection Limit
1
2
3
4
5
6

31
32
33
34
35
36

Feb. 15, '89
Apr. 17, '89
June 14, '89
Aug. 18, '89
Oct. 15, '89
Dec. 13, '89

87.11
78.38
80.61
73.51
89.16
100.26

Date Completed: EXAMPLE

Use additional sheets if necessary.
Completed by EXAMPLE
Page_4_of_9_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
-------

WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
Group; by Chemical. Well and Year
See Charter 8 or 9 in "Methods for Evaluating the Attainment of Cleanuo Standards". Vol. 2
SITE:
CHEMICAL:
WELL:
YEAR:
Site
ABC

NUMBEXCc) AND DESCRIPTION [8]
NUMBER(W) AND DESCRIPTION [1]
NUMBHKK)

1989,

Hazardous
4. dd
k
SB
2
ft.

#1
southwest

of treatment well

Nurnben in squve brackets [] refer to the Worksheet from which the information may be obtained.
Parameter to be tested (Check one) =

Number of samples per year = n =

Concentration used for observations below the detection limit =
MeanGJ
MaxD
10
"Season"
Number
J
Sample
Sample Collection
ID time
Reported Concentration
Concen- Corrected for
tration Detection Limit
1
2
3
4
5
6

41
42
43
44
45
46

Feb. 15, '89
Apr 17, '89
June 14, '89
Aug. 18, '89
Oct, 15, '89
Dec. 13, '89

82.34
85.69
96.72
108.61
95.75
66.77

Date Completed:
Completed by EXAMPLE
Use additional sheets if necessary. Page _5_ of _9_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
-------

WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
Croup; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards". Vol. 2
SITE:
CHEMICAL:
WELL:
YEAR:
Site ABC
NUMBER(C) AND DESOUFTION (8J

NUMBEX(W) AND DESOIimON [1]
NUMBEXK)
1989,

1.
5.
k»

Hazardous
dIJ
2
ft.

#1
southeast of treatment

well

Numben in square brackets [] refer to the worksheet from which the information may be obtained.
Parameter to be tested (Check one) =

Number of samples per year = n =

Concentration used for observations below the detection limit =
Meanla?
MaxD
10
"Season-
Number
J
Sample
Sample Collection
ID time
Reported Concentration
Concen- Corrected for
tration Detection Limii
1
2
3
4
5
6

51
52
53
54
55
56

Feb. 15, '89
Apr. 17, '89
June 14, '89
Aug. 18, '89
Oct. 15, '89
Dec. 13, '89

80.05
81.44
92.89
93.87
95.82
78.39

Par* rnmpliti-H-
Completed by EXAMPLE
Use additional sheets if necessary. Page _6_ of _2_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue ID WORKSHEET II
-------
WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
Group; by Chemical. Well and Year
See Chanter 8 or 9 in "Methods for Evaluating the Attainment of Cleanuo Standards" Vol. 2
SITE:
CHEMICAL:
WELL:
YEAR:
Site ABC
NUMBER(C) AND DESCR1FTION (8]

NUMBER(W) AND DESCRIPTION [1 ]
NUMBER(K)
1990,1

1.
3.
C s

Hazardous
d^
3
ft

north

#1
of treatment

well

Numbers in square brackets [I refer to the Worksheet from which the information may be obtained.
Parameter to be tested (Check one) =
Number of samples per year = n =
Concentration used for observations below the detection limit =
MeanEj
MaxD
10
"Season-
Number
j
Sample
Sample Collection
ID time
Reported Concentration
Concen- Corrected for
tration Detection Limit
1
2
3
4
5
6

31
32
33
34
35
36

Feb. 16, '90
Apr. 14, '90
June 14, '90
Aug. 17, '90
Oct. 15, '90
Dec. 15, '90

76.86
76.38
87.46
80.84
71.65
57.28

Date Completed: PYAMPI.F.
Use additional sheets if necessary.
Completed by EXAMPLE
Page_7_of_9_

Complete WORKSHEET 10 for other chemicals, yean, and wells or continue to WORKSHEET 11
-------
WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards". Vol. 2
SITE:
CHEMICAL:
WELL:
YEAR:
Site ABC
NUMBER(C) AND DESCRIPnON [8]

NUMUR(W) AND DESCRIPTION |1J
NUMBER^*)
1990,

.Hazardous
4. dd
k

X
3
ft.

#1
southwest of treatment

well

Numben in square bracket* [] refer to the Worksheet from which the information may be obtained.
Parameter to be tested (Check one) =

Number of samples per year = n =
Concentration used for observations below the detection limit =
MeanE?
MaxD
10
"Season"
Number
j
Sample
Sample Collection
ID time
Reported Concentration
Concen- Corrected for
tration Detection Limit
1
2
3
4
5
6

41
42
43
44
45
46

Feb. 16, '90
Apr. 14, '90
June 14, '90
Aug. 17, '90
Oct. 15, '90
Dec. 15, '90

87.85
87.08
97.84
105.95
81.58
87.76

Date Completed: EXAMPLE

Use additional sheets if necessary.
Completed by.
EXAMPLE
Page_8_of_9_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
-------
WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
Croup; by Chemical. Well and Year
See Chanter 8 or 9 in "Methods for Evaluating the Attainment of Cleanuo Standards" Vol 2
SITE:
CHEMICAL:
WELL-
YEAR:
Site ABC
NUMBER(c) AND DESCRIPTION [8]

NUMBER(W) AND DESCRIPTION [1]
NUMBER(K)
1990,

1.
5.
k =

Hazardous #1
di
3
ft.

southeast of treatment well

Numben in square bracken [] refer to (he Worksheet from which the information may be obtained.
Parameter to be tested (Check one) =

Number of samples per year = n =

Concentration used for observations below the detection limit =
MeanG?
MaxG
10
"Season"
Number
j
Sample
Sample Collection
ID time
Reported Concentration
Concen- Corrected for
(ration Detection Limit
1
2
3
4
5
6

51
52
53
54
55
56

Feb. 16, '90
Apr. 14, '90
June 14, '90
Aug. 17, '90
Oct. 15, '90
Dec. 15, '90

79.70
59.32
66.64
52.48
91.63
35.08

Date Completed: EXAMPLE

Use additional sheets if necessary.
Completed by EXAMPLE
Page_9_of_9_

Complete WORKSHEET 10 for other chemicals, yean, and wells or continue to WORKSHEET 11
-------
WORKSHEET 11 Data Records and Calculations When Assessing Wells as a Group; by Chemical and Year
See Chapter 8 or 9 in "Statistical Methods for Evaluating the Attainment of Superfund Cleanup Standards", Vol. 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [8]: 1. Hazardous #1
YEAR: NUMBER(k): 1988, k = 1
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [ ]  Sequential Sampling [X]
Parameter to be tested (Check one): Mean [X]  Max [ ]
Number of samples per year [9] = n =

                                                                               Measure for analysis
"Season"       Well #3     Well #4     Well #5     Well #_     Well #_         (row maximum
Number [10]    [10]        [10]        [10]        [10]        [10]            or row mean)
j              x_jk        x_jk        x_jk        x_jk        x_jk            x_jk
1              88.71       76.50       62.68                                   75.96
2              89.38       71.28       92.49                                   84.38
3              74.92       93.77       80.94                                   83.21
4              80.03       73.60       103.38                                  85.67
5              89.98       120.94      95.39                                   102.10
6              91.34       82.56       99.04                                   90.98

Total of x_jk for this year = A = 522.30
Mean of x_jk for this year = x̄_k = A / n = 87.05
Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page 1 of __
Complete WORKSHEET 11 for other chemicals; otherwise,
Continue to WORKSHEET 12 if a fixed sample size test is used; or
Continue to WORKSHEET 14 if a sequential sample test is used.
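When wells are assessed as a group, Worksheet 11 first reduces each sampling period to a single measure (the row mean or the row maximum across wells) and then averages those measures over the year. The sketch below, with hypothetical function and variable names, reproduces the 1988 page. It rounds each row mean to two decimals before summing, which is how the printed total of 522.30 appears to have been obtained.

```python
# 1988 concentrations for wells 3, 4, and 5 (from the Worksheet 10 pages).
wells_1988 = {
    3: [88.71, 89.38, 74.92, 80.03, 89.98, 91.34],
    4: [76.50, 71.28, 93.77, 73.60, 120.94, 82.56],
    5: [62.68, 92.49, 80.94, 103.38, 95.39, 99.04],
}


def seasonal_measures(wells, use_max=False):
    """Return one measure per season: the row maximum or the row mean."""
    rows = zip(*wells.values())   # one tuple of well values per season
    if use_max:
        return [max(row) for row in rows]
    return [sum(row) / len(row) for row in rows]


# Round to two decimals, as on the worksheet, before totalling.
row_means = [round(value, 2) for value in seasonal_measures(wells_1988)]
total = round(sum(row_means), 2)                  # A on the worksheet
yearly_mean = round(total / len(row_means), 2)    # A / n

print(row_means)
print(f"A = {total:.2f}, yearly mean = {yearly_mean:.2f}")
```

Passing use_max=True gives the row maxima instead, for the case where the maximum across wells is the parameter being tested.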
-------

WORKSHEET 11 Data Records and Calculations When Assessing Wells as a Group; by Chemical and Year
See Chapter 8 or 9 in "Statistical Methods for Evaluating the Attainment of Superfund Cleanup Standards", Vol. 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [8]: 1. Hazardous #1
YEAR: NUMBER(k): 1989, k = 2
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [ ]  Sequential Sampling [X]
Parameter to be tested (Check one): Mean [X]  Max [ ]
Number of samples per year [9] = n =

                                                                               Measure for analysis
"Season"       Well #3     Well #4     Well #5     Well #_     Well #_         (row maximum
Number [10]    [10]        [10]        [10]        [10]        [10]            or row mean)
j              x_jk        x_jk        x_jk        x_jk        x_jk            x_jk
1              87.11       82.34       80.05                                   83.17
2              78.38       85.69       81.44                                   81.84
3              80.61       96.72       92.89                                   90.07
4              73.51       108.61      93.87                                   92.00
5              89.16       95.75       95.82                                   93.58
6              100.26      66.77       78.39                                   81.81

Total of x_jk for this year = A = 522.47
Mean of x_jk for this year = x̄_k = A / n = 87.08
Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page 2 of __
Complete WORKSHEET 11 for other chemicals; otherwise,
Continue to WORKSHEET 12 if a fixed sample size test is used; or
Continue to WORKSHEET 14 if a sequential sample test is used.
-------

WORKSHEET 11 Data Records and Calculations When Assessing Wells as a Group; by Chemical and Year
See Chapter 8 or 9 in "Statistical Methods for Evaluating the Attainment of Superfund Cleanup Standards", Vol. 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [8]: 1. Hazardous #1
YEAR: NUMBER(k): 1990, k = 3
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [ ]  Sequential Sampling [X]
Parameter to be tested (Check one): Mean [X]  Max [ ]
Number of samples per year [9] = n =

                                                                               Measure for analysis
"Season"       Well #3     Well #4     Well #5     Well #_     Well #_         (row maximum
Number [10]    [10]        [10]        [10]        [10]        [10]            or row mean)
j              x_jk        x_jk        x_jk        x_jk        x_jk            x_jk
1              76.86       87.85       79.70                                   81.47
2              76.38       87.08       59.32                                   74.26
3              87.46       97.84       66.64                                   83.98
4              80.84       105.95      52.48                                   79.76
5              71.65       81.58       91.63                                   81.62
6              57.28       87.76       35.08                                   60.04

Total of x_jk for this year = A = 461.13
Mean of x_jk for this year = x̄_k = A / n = 76.86
Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of __
Complete WORKSHEET 11 for other chemicals; otherwise,
Continue to WORKSHEET 12 if a fixed sample size test is used; or
Continue to WORKSHEET 14 if a sequential sample test is used.
-------
WORKSHEET 14a Data Calculations for a Sequential Sample When Assessing Wells as a Group; by Chemical
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [8]: 1. Hazardous #1
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Cleanup standard [8] = Cs = 100
Alternate mean = μ1 = 75

Probability of mistakenly declaring the well(s) clean [8] = α =
Probability of mistakenly declaring the well(s) contaminated [8] = β =

Standard Deviation of Mean: $s_{\bar{x}_m} = \sqrt{\frac{B_k - A_k^2 / k}{(k-1)k}}$

Year       Yearly     Cumulative      Cumulative       Mean (average of    Standard
Number     Average    Sum of x̄_k      Sum of x̄_k²      yearly averages)    Deviation of Mean
[11]       [11]       (A_0 = 0)       (B_0 = 0)
k          x̄_k        A_k             B_k              x̄_m                 s_x̄m
1          87.05      87.05           7,577.70         87.0500             -
2          87.08      174.13          15,160.63        87.0650             -
3          76.86      250.99          21,068.09        83.6633             3.402

Carry as many significant figures as possible

Date Completed:            Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of __
Complete WORKSHEET 14a and 14b for other chemicals and groups of wells
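The first two columns of Worksheets 7b and 14b can be derived from the mean and its standard error. The formulas in the sketch below are inferred from the printed numbers rather than read from the worksheet headers, which are not legible: δ = (μ1 - Cs) / s and t = (mean - (Cs + μ1)/2) / s, where s is the standard error (or standard deviation) of the mean. Under that assumption they reproduce the values -7.349 and -1.128 printed on Worksheet 14b and -3.5675 and -1.086 printed on Worksheet 7b. The function name is hypothetical.

```python
CS = 100.0    # cleanup standard
MU1 = 75.0    # alternate mean


def sequential_statistics(mean, std_error, cs=CS, mu1=MU1):
    """Return (delta, t) for a sequential test.

    Both formulas are assumptions inferred from the example values.
    """
    delta = (mu1 - cs) / std_error
    t = (mean - (cs + mu1) / 2) / std_error
    return delta, t


# Group of wells, year 3 (Worksheet 14a values).
delta_group, t_group = sequential_statistics(mean=83.6633, std_error=3.402)

# Individual well, year 3 (Worksheet 7a values).
delta_well, t_well = sequential_statistics(mean=79.89, std_error=7.0077)

print(f"group: delta = {delta_group:.3f}, t = {t_group:.3f}")
print(f"well:  delta = {delta_well:.4f}, t = {t_well:.3f}")
```

Both statistics are negative in the example because the running mean lies below the midpoint between the cleanup standard and the alternate mean.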
-------

WORKSHEET 14b Data Calculations for a Sequential Sample When Assessing Wells as a Group; by
Chemical
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards". Volume 2
SITE: Site ABC
NUMBER(C) AND DESCRIPTION [8]
CHEMICAL: 1. Hazardous #1
Numbers in squari
Year
Number
m l S5T
1
2
3

-7.349

• brackets [] refer
t =
Sxm

-1.128

to the Worksheet from which the information may be obtained.
Critical Critical Decision:
value: value: cleanLR>3,
Likelihood clean contaminated contaminated LR £ A
ratio n in or no decision
LR* A=-^- B=-^ A
-------

WORKSHEET 15 Removing Seasonal Patterns in the Data (Use as First Step in Computing Serial Correlations)
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE: Site DEF (data independent of five-well example)
CHEMICAL: NUMBER(c) AND DESCRIPTION [2 OR 8]: 1. Chemical #1
WELL: NUMBER(w) AND DESCRIPTION [1]: 1. di ft. south of treatment well
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                                                   Number of
"Season"    Measurements for each "season" for year k              years with    Row        Row
Number      Yr = 1    Yr = 2    Yr = _    Yr = _    Yr = _         Data          Total      Mean
j           x_jk      x_jk                                         m_j           Σ x_jk     x̄_j
1           120       133                                          2             253        126.5
2           163       117                                          2             280        140
3           128       113                                          2             241        120.5
4           150       126                                          2             276        138
5           125       114                                          2             239        119.5
6           110       145                                          2             255        127.5

Corrected measurements with seasonal patterns removed

"Season"    Corrected Measurements for each "season" for year k
Number      Yr = 1        Yr = 2        Yr = _    Yr = _    Yr = _
j           x_jk - x̄_j    x_jk - x̄_j
1           -6.5          6.5
2           23            -23
3           7.5           -7.5
4           12            -12
5           5.5           -5.5
6           -17.5         17.5

Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of __

Complete WORKSHEET 15 for other chemicals
Continue to WORKSHEET 16 if serial correlations are being computed.
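The seasonal adjustment on Worksheet 15 subtracts each season's mean (taken over the years with data) from that season's measurements. A minimal sketch of the calculation follows, using the Site DEF example values; the function and variable names are hypothetical.

```python
# Season j -> measurements for year 1 and year 2 (Site DEF example).
measurements = {
    1: [120, 133],
    2: [163, 117],
    3: [128, 113],
    4: [150, 126],
    5: [125, 114],
    6: [110, 145],
}


def remove_seasonal_pattern(by_season):
    """Subtract each season's mean from that season's measurements."""
    residuals = {}
    for season, values in by_season.items():
        season_mean = sum(values) / len(values)   # row mean over years with data
        residuals[season] = [v - season_mean for v in values]
    return residuals


residuals = remove_seasonal_pattern(measurements)
year1 = [residuals[j][0] for j in sorted(residuals)]
year2 = [residuals[j][1] for j in sorted(residuals)]

print("year 1 residuals:", year1)
print("year 2 residuals:", year2)
```

With only two years of data the residuals for a season are equal and opposite, which is why the two columns of the worksheet mirror each other.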
-------
WORKSHEET 16 Calculating Serial Correlations
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE: Site DEF (data independent of five-well example)
CHEMICAL: NUMBER(c) AND DESCRIPTION [2 OR 8]: 1. Chemical #1
WELL: NUMBER(w) AND DESCRIPTION [1]: 1. di ft. south of treatment well
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Year = k =

Period between well samples in months = t =

Data Numbers
(season within     Residual
year k)            [15]          Product       Square
11                 -6.5          -149.50       42.25
21                 23.0          172.50        529.00
31                 7.5           90.00         56.25
41                 12.0          66.00         144.00
51                 5.5           -96.25        30.25
61                 -17.5         -             306.25
Totals from previous page =
(if more than one Worksheet 16 is used)
Column Totals =                  A = 82.75     B = 1,108

Estimated Serial Correlation based on the data = A / B = φ̂_obs =
Serial Correlation between monthly observations = φ̂ = (φ̂_obs)^(1/t) =
Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of __

Complete WORKSHEET 16 for other chemicals
-------
WORKSHEET 16 Calculating Serial Correlations
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE: Site DEF (data independent of five-well example)
CHEMICAL: NUMBER(c) AND DESCRIPTION [2 OR 8]: 1. Chemical #1
WELL: NUMBER(w) AND DESCRIPTION [1]: 1. di ft. south of treatment well
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Year = k = 2
Period between well samples in months = t = 2

Data Numbers
(season within     Residual
year k)            [15]          Product       Square
12                 6.5           -149.50       42.25
22                 -23.0         172.50        529.00
32                 -7.5          90.00         56.25
42                 -12.0         66.00         144.00
52                 -5.5          -96.25        30.25
62                 17.5          -             306.25
Totals from previous page =      82.75         1,108
(if more than one Worksheet 16 is used)
Column Totals =                  A = 165.5     B = 2,216

Estimated Serial Correlation based on the data = A / B = φ̂_obs = .0747
Serial Correlation between monthly observations = φ̂ = (φ̂_obs)^(1/t) = .2733
Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary.

Complete WORKSHEET 16 for other chemicals
Page 2 of __
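The two Worksheet 16 pages accumulate products of consecutive residuals (column total A) and squared residuals (column total B), then convert the ratio to a monthly serial correlation. The sketch below follows the worksheet layout as it appears in this example: products are formed only within a year, and the monthly value is the observed ratio raised to the power 1/t. Names are hypothetical.

```python
# Seasonally adjusted residuals from Worksheet 15, one list per year.
residuals_by_year = [
    [-6.5, 23.0, 7.5, 12.0, 5.5, -17.5],
    [6.5, -23.0, -7.5, -12.0, -5.5, 17.5],
]
MONTHS_BETWEEN_SAMPLES = 2   # t on the worksheet


def serial_correlation(years, months_between):
    """Return (A, B, observed correlation, monthly correlation)."""
    # A: sum of products of consecutive residuals within each year.
    a = sum(e1 * e2 for year in years for e1, e2 in zip(year, year[1:]))
    # B: sum of squared residuals over all years.
    b = sum(e * e for year in years for e in year)
    phi_obs = a / b
    # A negative phi_obs has no real root for even t and would need
    # separate handling; the example value is positive.
    phi_monthly = phi_obs ** (1 / months_between)
    return a, b, phi_obs, phi_monthly


a, b, phi_obs, phi_monthly = serial_correlation(
    residuals_by_year, MONTHS_BETWEEN_SAMPLES
)
print(f"A = {a}, B = {b}, phi_obs = {phi_obs:.4f}, phi = {phi_monthly:.4f}")
```

The monthly value is the one used to look up the variance factor F when determining sample sizes.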
-------

WORKSHEET 1R Basic Calculations for a Simple Linear Regression
See Section 6.1 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE: Site ABC
CHEMICAL: NUMBER(c) AND DESCRIPTION [2 OR 8]: 1. Hazardous #1
WELL: NUMBER(w) AND DESCRIPTION [1]: 1. di ft. northeast of treatment well
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Concentration used when no concentration is reported =

Number of collectable samples = N =

            Concentration                    Transformed
Sample      Corrected for                    Time
Number      Detection Limit                  Variable
n           y_n              y_n²            x_n            x_n²        y_n x_n
1           90.17            8,130.63        1              1           90.17
2           83.00            6,889.00        2              4           166.00
3           66.50            4,422.25        3              9           199.50

Totals from previous page(s):

Column Totals:
A = Σ y_n = 239.67    B = Σ y_n² = 19,441.88    C = Σ x_n = 6    D = Σ x_n² = 14    E = Σ y_n x_n = 455.67

Means: ȳ = A / N = 79.89    x̄ = C / N = 2
Corrected Sum of Squares and Cross Products:
S_yy = B - A² / N = 294.64    S_xx = D - C² / N = 2    S_yx = E - A C / N = -23.67

Date Completed: EXAMPLE    Completed by: EXAMPLE
Use additional sheets if necessary. Page __ of __
Complete WORKSHEET 1R for other chemicals or continue to WORKSHEET 2R.
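The column totals and corrected sums on Worksheet 1R can be computed directly from the yearly concentrations and the time variable. The sketch below uses the usual definitions of the corrected sums of squares and cross products (S_yy = B - A²/N, S_xx = D - C²/N, S_yx = E - AC/N). These are assumed here because the worksheet formulas are not legible, but they reproduce the printed values. Variable names are hypothetical.

```python
y = [90.17, 83.00, 66.50]   # concentrations corrected for detection limit
x = [1, 2, 3]               # time variable (year number)
n = len(y)

# Column totals, lettered as on the worksheet.
A = sum(y)
B = sum(v * v for v in y)
C = sum(x)
D = sum(v * v for v in x)
E = sum(yi * xi for yi, xi in zip(y, x))

# Means and corrected sums of squares and cross products.
y_bar = A / n
x_bar = C / n
s_yy = B - A * A / n
s_xx = D - C * C / n
s_yx = E - A * C / n

print(f"A = {A:.2f}, B = {B:.2f}, C = {C}, D = {D}, E = {E:.2f}")
print(f"y_bar = {y_bar:.2f}, x_bar = {x_bar:.0f}")
print(f"S_yy = {s_yy:.2f}, S_xx = {s_xx:.0f}, S_yx = {s_yx:.2f}")
```

These three corrected sums, together with the two means and N, are all that Worksheet 2R needs.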
-------
WORKSHEET 2R Inference in a Simple Linear Regression
See Section 6.1 in "Methods for Evaluating the Attainment of Cleanup Standards". Vol. 2
SITE: Site ABC
CHEMICAL: 1. Hazardous #1
     NUMBER(c) AND DESCRIPTION [2 OR 8]
WELL: 1. di ft. northeast of treatment well
     NUMBER(w) AND DESCRIPTION [1]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Syy [1R] = 294.64      Sxx [1R] = 2      Syx [1R] = -23.67
Type I error probability α = .1

Estimating Regression Coefficients
Number of collectable samples [1R] = N =                                       3
Mean of y [1R] = ȳ =                                                       79.89
Mean of x [1R] = x̄ =                                                           2
Estimated slope [1R], b1 = Syx / Sxx =                                    -11.84
Estimated intercept [1R], b0 = ȳ - (b1 * x̄) =                             103.57
Sum of squares due to error [1R], SSE = Syy - (Syx)² / Sxx =               14.51
Degrees of freedom, Df = N - 2 =                                               1
Critical value from table of t-distribution (Appendix A.1)
   for specified values of (1 - α/2) and Df = t =                          6.314
Mean Square Error, MSE = SSE / (N - 2) =                                   14.51
Standard Error of the Slope, s(b1) = sqrt(MSE / Sxx) =                      2.69
Upper Two Sided Confidence Interval for Slope: b1 + t * s(b1) =             5.14
Lower Two Sided Confidence Interval for Slope: b1 - t * s(b1) =           -28.82

Calculating Prediction Limits
Value of x at which concentration is to be predicted =                       2.5
Predicted value, ŷ = b0 + b1 * x =                                         73.97
Standard Error of Predicted Value,
   s(ŷ) = sqrt(MSE * (1 + 1/N + (x - x̄)² / Sxx)) =                       4.6000
Upper Two Sided Confidence Interval for Prediction = ŷ + t * s(ŷ) =       103.01
Lower Two Sided Confidence Interval for Prediction = ŷ - t * s(ŷ) =        44.93

Date Completed: EXAMPLE      Completed by: EXAMPLE
Use additional sheets if necessary.      Page ___ of ___
Complete WORKSHEET 2R for other chemicals
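
The Worksheet 2R steps can be reproduced as follows. This is a minimal sketch under stated assumptions: the function name `regression_inference` and its argument names are illustrative, and the critical value t is supplied by hand from a t table (6.314 for a cumulative probability of .95 with 1 degree of freedom) rather than computed.

```python
import math


def regression_inference(n, y_mean, x_mean, syy, sxx, syx, t_crit, x_new):
    """Slope, intercept, and two-sided limits following the Worksheet 2R steps."""
    b1 = syx / sxx                       # estimated slope
    b0 = y_mean - b1 * x_mean            # estimated intercept
    sse = syy - syx ** 2 / sxx           # sum of squares due to error
    mse = sse / (n - 2)                  # mean square error, Df = N - 2
    se_b1 = math.sqrt(mse / sxx)         # standard error of the slope
    y_hat = b0 + b1 * x_new              # predicted value at x_new
    se_pred = math.sqrt(mse * (1 + 1 / n + (x_new - x_mean) ** 2 / sxx))
    return {
        "b1": b1, "b0": b0, "SSE": sse, "MSE": mse, "se_b1": se_b1,
        "slope_ci": (b1 - t_crit * se_b1, b1 + t_crit * se_b1),
        "y_hat": y_hat, "se_pred": se_pred,
        "pred_ci": (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred),
    }


result = regression_inference(3, 79.89, 2, 294.64, 2, -23.67, 6.314, 2.5)
for name, value in result.items():
    print(name, value)
```

Because the script carries full precision between steps, a few results differ slightly from the worksheet, which rounds intermediate values to two decimals (for example, the slope limits come out near 5.17 and -28.84 rather than 5.14 and -28.82). The prediction limits agree with the worksheet to within rounding.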
-------
APPENDIX C: BLANK WORKSHEETS
The worksheets in this appendix can be photocopied when needed. The copies
may then be used in their current form or modified, as appropriate. They may be employed to
document the objectives and decisions, record data, and make calculations to determine if the
ground water at the site attains the cleanup standard. These worksheets are referred to in the main
text of this document. Appendix B provides examples of how to fill out the worksheets.

The initial appearance of a "Bold" letter in a worksheet represents an intermediate
computation, the result of which will be used in a later computation and will also be signified by
the letter in "Bold" script.
To maintain adequate precision in doing the computations appearing in the
worksheets (particularly in the calculation of estimated variances, standard deviations, or standard
errors), the number of decimal places retained should be as high as possible, with a minimum of
four.
-------
Table C.1 Summary of Notation Used in Appendix C

Symbol     Definition

m          The number of years for which data were collected (usually the
           analysis will be performed with full years' worth of data)
n          The number of sample measurements per year (for monthly data,
           n = 12; for quarterly data, n = 4). This is also referred to as the
           number of "seasons" per year
N          The total number of sample measurements (if there are no missing
           observations, N = mn)
index i    Indicates the order in which the ground-water samples are collected
index k    Indicates the year in which the ground-water samples are collected
index j    Indicates the season or time within the year at which the
           ground-water samples are collected
index c    Indicates the chemical analyzed
index w    Indicates the well sampled
x_i        Contaminant measurement for the ith ground-water sample
x_jk       An alternative way of denoting a contaminant measurement, where
           k = 1, 2, ..., m denotes the year, and j = 1, 2, ..., n denotes the
           sampling period (season) within the year. The subscript for x_jk is
           related to the subscript for x_i in the following manner:
           i = (k-1)n + j
x̄_k        The mean (or average) of the contaminant measurements for year k
           (see Boxes 8.5 and 9.4)
x̄          The mean of the yearly averages for years k = 1 to m
s_x̄        The standard deviation of the yearly average contaminant
           concentrations from m years of sample collection (see Boxes 8.7
           and 9.6)
s_x̄m       The standard error of the mean of the yearly means (see Boxes 8.9
           and 9.8)
Cs         The designated cleanup standard
Df         The degrees of freedom associated with the standard error of an
           estimate (see Boxes 8.7 and 9.6)
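
The relationship between the sequential index i and the year and season indices (k, j) can be illustrated with a short sketch. The function names below are illustrative assumptions, and the example assumes a complete record with exactly n measurements in every year.

```python
def sample_index(k, j, n):
    """Sequential sample number i for season j of year k (both counted from 1)."""
    return (k - 1) * n + j


def yearly_means(x, n):
    """Yearly means for measurements x listed in time order, n per year, none missing."""
    m = len(x) // n   # number of complete years
    return [sum(x[k * n:(k + 1) * n]) / n for k in range(m)]


# Quarterly sampling (n = 4): the third quarter of year 2 is the 7th sample.
print(sample_index(2, 3, 4))
print(yearly_means([1, 2, 3, 4, 5, 6, 7, 8], 4))
```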
-------

WORKSHEET 1 Sampling Wells

See Section 3.2 in "Statistical Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:

Sampling
Well
Number      Describe each sampling well to be used to assess attainment
w

Decision Criteria: Wells assessed (Check one)    Individually [ ]    As a Group [ ]

Use the Sampling Well Number (w) to refer on subsequent sheets to the sampling wells described
above.

Attach a map showing the sampling wells within the waste site.

Date Completed: Completed by

Use additional sheets if necessary. Page of

Continue to WORKSHEET 2 if wells are assessed individually.
Continue to WORKSHEET 8 if wells are assessed as a group.
-------

WORKSHEET 2 Attainment Objectives for Assessing Individual Wells
See Chapter 3 in "Methods for Evaluating the Attainment of Cleanup Standards". Volume 2
SITE:
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one):    Fixed Sample Size [ ]    Sequential Sampling [ ]
Probability of mistakenly declaring the well(s) clean = α =

Probability of mistakenly declaring the well(s) contaminated = β =

                                                        If Mean,       If %tile, enter:
                                                        enter
                         Cleanup         Parameter      alternate      Critical    Null          Alternate
Chemical    Chemical     Standard        to test        hypothesis     %tile       hypothesis    hypothesis
Number      Name         (with units)    (Check one)    mean
c                        Cs                             μ1                         P0            P1

                                         Mean  [ ]
                                         %tile [ ]
                                         Mean  [ ]
                                         %tile [ ]
                                         Mean  [ ]
                                         %tile [ ]
                                         Mean  [ ]
                                         %tile [ ]

Sample Collection Procedures to be used (attach separate sheet if necessary):
•
Secondary Objectives/ Other purposes for which the data is to be collected:
Use the Chemical Number (c) to refer on other sheets to the chemical described above.
Attach documentation describing the lab analysis procedure for each chemical.
Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___
Continue to WORKSHEET 3 if a fixed sample size test is used; or
Continue to WORKSHEET 4 if a sequential sample test is used.
-------
WORKSHEET 3 Sample Size When Using a Fixed Sample Size Test for Assessing Individual Wells
See Section 8.2 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                                                         From Table A.2
Probability of mistakenly declaring the site clean [2] = α =                z(1-α) =
Probability of mistakenly declaring the site contaminated [2] = β =         z(1-β) =

Number of samples per year = n =

Variance factor from Table A.5, Appendix A = F (see note 1) =

For testing the mean concentration
Chemical      Cleanup         Alternate     Standard Deviation     Calculate:
Number [2]    Standard [2]    mean [2]      of yearly mean         (based on calculations
c             Cs              μ1            σ̂                      described in Section 8.2)
                                                                   B =           m_d =

For testing the proportion of contaminated wells or samples
Chemical      Cleanup                                              Calculate:
Number [2]    Standard [2]    [2]           [2]
c             Cs              P0            P1                     B =           m_d =

Column Maximum (maximum of m_d values) = C =

Round C to next largest integer = Number of years of sample collection = m =

Total number of samples = nm = N =

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___
Continue to WORKSHEET 4

1 An estimate of φ, the serial correlation, is necessary to determine the appropriate value of F. Worksheets 15 and
16 can be used to estimate φ.
-------
WORKSHEET 4 Data Records and Calculations When Assessing Individual Wells; by Chemical, Well,
and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [2]
WELL:
     NUMBER(w) AND DESCRIPTION [1]
YEAR:
     NUMBER(k)
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Sample Design (Check one):    Fixed Sample Size [ ]    Sequential Sampling [ ]
Parameter to be tested [2] (Check one) =    Mean [ ]    %tile [ ]
Number of samples per year [3] = n =
Number of samples with nonmissing data in year =
Cleanup standard [2] = Cs =

Concentration used for observations below the detection limit =

"Season"                                               Concentration      Is A Greater      Data for
Number                    Sample          Reported     Corrected for      than Cs?          analysis
j within      Sample      Collection      Concen-      Detection Limit    1 = Yes           x_jk = A if Mean
this year     ID          date/time       tration      A                  0 = No            x_jk = B if %tile
                                                                          B

Total of x_jk for this year = C =
Mean of x_jk for this kth year = x̄_k = C / (number of samples with nonmissing data) =

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___
Complete WORKSHEET 4 for other chemicals, years, and wells; otherwise,
Continue to WORKSHEET 5 if a fixed sample size test is used; or
Continue to WORKSHEET 7 if a sequential sample test is used.
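
The column logic of Worksheet 4 can be sketched in code. This is an illustration only: the function name `worksheet4_values`, the use of `None` to flag a result reported below the detection limit, and the sample data are all assumptions made for the example. Samples with missing data are assumed to have been left out of the input list.

```python
def worksheet4_values(reported, cs, below_dl_value, parameter="mean"):
    """Build the 'data for analysis' column for one chemical, well, and year.

    reported: concentrations in season order; None marks a below-detection result.
    """
    # Column A: concentration corrected for the detection limit.
    a = [below_dl_value if r is None else r for r in reported]
    # Column B: 1 if A is greater than the cleanup standard, otherwise 0.
    b = [1 if v > cs else 0 for v in a]
    x = a if parameter == "mean" else b
    total = sum(x)                      # C on the worksheet
    return x, total, total / len(x)     # data, yearly total, yearly mean


# Hypothetical quarterly data with one below-detection result.
x, total, mean = worksheet4_values([5.0, None, 12.0, 8.0], cs=10.0, below_dl_value=1.0)
print(x, total, mean)
```

Passing `parameter="percentile"` switches the data column to the 0/1 indicator, so the yearly mean becomes the proportion of samples above the cleanup standard.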
-------
WORKSHEET 5 Data Calculations for a Fixed Sample Size Test When Assessing Individual Wells; by
Chemical and Well
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [2]
WELL:
     NUMBER(w) AND DESCRIPTION [1]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Year          Mean for the
Number        year [4]
k             x̄_k               (x̄_k)²

Totals from previous page
(if more than one Worksheet 5 used)

Column Totals:      A =               B =

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___

Complete WORKSHEET 5 for other chemicals and wells or continue to WORKSHEET 6
-------
WORKSHEET 6 Inference for Fixed Sample Size Tests When Assessing Individual Wells; by Chemical
and Well
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards". Volume 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [2]
WELL:
     NUMBER(w) AND DESCRIPTION [1]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

[2] α =
[2] Cs =
Number of Years [3] = m =
Sum of the yearly means [5] = A =
Sum of the squared yearly means [5] = B =
Overall mean concentration = x̄ = A / m =
Standard Deviation of the yearly means = s_x̄ = sqrt((B - A²/m) / (m - 1)) =
Degrees of Freedom for s_x̄ = Df = m - 1 =
Critical value from table of the t-distribution
(Table A.1) for specified values of (1 - α) and Df = t =
Standard Error for the overall mean = s_x̄m = s_x̄ / sqrt(m) =
Upper One Sided Confidence Interval = μ_Uα = x̄ + t * s_x̄m =

If μ_Uα < Cs then circle Clean, otherwise circle Contaminated:
Based on the mean concentration, the sampling well is:      Clean      Contaminated

Date Completed:                    Completed by:
Complete WORKSHEET 6 for other chemicals and wells
Page ___ of ___
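
The fixed sample size test on Worksheet 6 reduces to comparing an upper one-sided confidence limit on the mean with the cleanup standard. The sketch below assumes the yearly means are already available; the function name `fixed_sample_test`, the example yearly means, and the cleanup standards are hypothetical, and the critical value (2.920 for a cumulative probability of .95 with 2 degrees of freedom) is looked up by hand from a t table.

```python
import math


def fixed_sample_test(yearly_means, cs, t_crit):
    """Upper one-sided confidence limit on the mean and the resulting decision."""
    m = len(yearly_means)
    a = sum(yearly_means)                   # sum of the yearly means
    b = sum(v * v for v in yearly_means)    # sum of the squared yearly means
    mean = a / m
    # Guard against a tiny negative value caused by floating-point rounding.
    sd = math.sqrt(max(0.0, (b - a * a / m) / (m - 1)))
    se = sd / math.sqrt(m)                  # standard error of the overall mean
    upper = mean + t_crit * se
    decision = "Clean" if upper < cs else "Contaminated"
    return mean, sd, se, upper, decision


# Hypothetical yearly means for three years; Df = m - 1 = 2.
print(fixed_sample_test([4.0, 5.0, 6.0], cs=10.0, t_crit=2.920))
```

The well is judged clean only when the whole confidence limit, not just the sample mean, falls below the cleanup standard.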
-------
WORKSHEET 7a Data Calculations for a Sequential Sample When Assessing Wells Individually; by
Chemical and Well
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [2]
WELL:
     NUMBER(w) AND DESCRIPTION [1]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Cleanup standard [2] = Cs =
Alternate mean = μ1 =

Probability of mistakenly declaring the well(s) clean [2] = α =
Probability of mistakenly declaring the well(s) contaminated [2] = β =

Year        Yearly       Cumulative      Cumulative        Mean                 Standard
Number      Average      Sum of x̄_k     Sum of (x̄_k)²    (average of          Error of Mean
[4]         [4]          (A_0 = 0)       (B_0 = 0)         yearly averages)
k or m      x̄_k         A_k             B_k               x̄_m = A_m / m       s_x̄m = sqrt((B_k - A_k²/k) / ((k-1)k))

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___

Complete WORKSHEETS 7a and 7b for other chemicals and wells
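
The running totals on Worksheet 7a can be built up one year at a time. The following is a sketch with an assumed function name (`sequential_running_stats`) and hypothetical yearly averages; the standard error is written in terms of the cumulative sums, which is algebraically the usual standard error of a mean of k values.

```python
import math


def sequential_running_stats(yearly_means):
    """One row per year: (k, yearly mean, cumulative sum, cumulative sum of squares,
    mean of yearly means, standard error of that mean)."""
    rows = []
    a = 0.0   # cumulative sum of yearly means, A_0 = 0
    b = 0.0   # cumulative sum of squared yearly means, B_0 = 0
    for k, xk in enumerate(yearly_means, start=1):
        a += xk
        b += xk * xk
        mean = a / k
        if k > 1:
            se = math.sqrt(max(0.0, b - a * a / k) / ((k - 1) * k))
        else:
            se = None   # a standard error cannot be computed from a single year
        rows.append((k, xk, a, b, mean, se))
    return rows


for row in sequential_running_stats([4.0, 5.0, 6.0]):
    print(row)
```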
-------
WORKSHEET 7b Data Calculations for a Sequential Sample When Assessing Wells Individually; by
Chemical and Well
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [2]
WELL:
     NUMBER(w) AND DESCRIPTION [1]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Year                                    Likelihood    Critical values:                   Decision:
Number                                  ratio                                            clean if LR > B,
[4]       δ = (μ1 - Cs) / s_x̄m    t    LR*           A = β / (1-α)    B = (1-β) / α     contaminated if LR ≤ A,
m                                                                                        or no decision

* LR = exp( δ * ((m - 2) / m) * t * sqrt( m / (m - 1 + t²) ) ),  where t = (x̄_m - (Cs + μ1)/2) / s_x̄m
If "no decision", collect another year's allotment of samples and test the hypothesis again.

Date Completed:                    Completed by:

Use additional sheets if necessary.      Page ___ of ___

Complete WORKSHEETS 7a and 7b for other chemicals and wells
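
The decision step on Worksheet 7b compares the likelihood ratio with two critical values. The sketch below takes the likelihood ratio as an input and assumes the usual sequential probability ratio test boundaries, A = β/(1-α) and B = (1-β)/α; the function name `sequential_decision` and the example error probabilities are illustrative.

```python
def sequential_decision(lr, alpha, beta):
    """Compare a likelihood ratio with the two sequential-test critical values."""
    a = beta / (1 - alpha)    # at or below A: declare contaminated
    b = (1 - beta) / alpha    # above B: declare clean
    if lr > b:
        return "clean"
    if lr <= a:
        return "contaminated"
    return "no decision"      # collect another year of samples and test again


# With alpha = 0.10 and beta = 0.20, A is about 0.22 and B is 8.
for lr in (10.0, 1.0, 0.1):
    print(lr, sequential_decision(lr, alpha=0.10, beta=0.20))
```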
-------
WORKSHEET 8 Attainment Objectives When Assessing Wells as a Group
See Chapter 3 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one):    Fixed Sample Size [ ]    Sequential Sampling [ ]

Probability of mistakenly declaring the well(s) clean = α =

Probability of mistakenly declaring the well(s) contaminated = β =

Chemical to                 Cleanup         Parameter      If mean, enter     If max, enter
be tested     Chemical      standard        to test        the alternate      the alternate
number        name          (with units)    (Check one)    hypothesis         hypothesis
c                           Cs                             μ1                 Max1

                                            Mean [ ]
                                            Max  [ ]
                                            Mean [ ]
                                            Max  [ ]
                                            Mean [ ]
                                            Max  [ ]
                                            Mean [ ]
                                            Max  [ ]

Sample Collection Procedures to be used (attach separate sheet if necessary):

Secondary Objectives/Other purposes for which the data is to be collected:

Use the Chemical Number (c) to refer on other sheets to the chemical described above.
Attach documentation describing the lab analysis procedure for each chemical.
Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___
Continue to WORKSHEET 9 if a fixed sample size test is used; or
Continue to WORKSHEET 10 if a sequential sample test is used.
-------
WORKSHEET 9 Sample Size When Using a Fixed Sample Size Test for Assessing Wells as a Group
See Section 8.2 in "Methods for Evaluating the Attainment of Cleanup Standards". Volume 2
SITE:
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                                                         From Table A.2,
                                                                         Appendix A
Probability of mistakenly declaring the site clean [8] = α =                z(1-α) =

Probability of mistakenly declaring the site contaminated [8] = β =         z(1-β) =

Number of samples per year = n =

Variance factor from Table A.5, Appendix A = F (see note 1) =

For testing the mean concentration
Chemical      Cleanup         Alternate     Standard Deviation     Calculate:
Number [8]    Standard [8]    mean [8]      of mean                (calculations described in Section 8.2)
c             Cs              μ1            σ̂                      B =           m_d =

For testing the maximum concentration across all wells
Chemical      Cleanup         Alternate     Standard Deviation     Calculate:
Number [8]    Standard [8]    max [8]       of yearly mean
c             Cs              Max1          σ̂                      B = ((Cs - Max1) / (z(1-α) + z(1-β)))²
                                                                   m_d = σ̂² / B + 2

Column Maximum (maximum of m_d values) = C =

Round C to next largest integer = Number of years of sample collection = m =

Total number of samples = nm = N =

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___
Continue to WORKSHEET 10

1 An estimate of φ, the serial correlation, is necessary to determine the appropriate value of F. Worksheets 15 and
16 can be used to estimate φ.
-------
WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:
CHEMICAL:
WELL:
YEAR:
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Parameter to be tested (Check one) =    Mean [ ]    Max [ ]
Number of samples per year = n =
Concentration used for observations below the detection limit =

                          Sample          Reported      Concentration
"Season"      Sample      Collection      Concen-       Corrected for
Number        ID          time            tration       Detection Limit

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___
Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
-------

WORKSHEET 11 Data Records and Calculations When Assessing Wells as a Group; by Chemical and
Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [8]
YEAR:
     NUMBER(k)
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one):    Fixed Sample Size [ ]    Sequential Sampling [ ]
Parameter to be tested (Check one):    Mean [ ]    Max [ ]
Number of samples per year [9] = n =

                                                                            Measure for
                                                                            analysis
"Season"        Well #__    Well #__    Well #__    Well #__    Well #__    (row maximum
Number [10]     [10]        [10]        [10]        [10]        [10]        or row mean)
j                                                                           x_jk

Total of x_jk for this year = A =
Mean of x_jk for this year = x̄_k = A / n =

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___
Complete WORKSHEET 11 for other chemicals; otherwise,
Continue to WORKSHEET 12 if a fixed sample size test is used; or
Continue to WORKSHEET 14 if a sequential sample test is used.
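
When wells are assessed as a group, Worksheet 11 collapses each season's measurements across wells into a single value, either the row mean or the row maximum. The sketch below uses an assumed function name (`group_measure`) and hypothetical concentrations for two wells over two seasons.

```python
def group_measure(rows, parameter="mean"):
    """rows[j] holds the corrected concentrations from every well for season j."""
    if parameter == "max":
        x = [max(r) for r in rows]              # row maximum across wells
    else:
        x = [sum(r) / len(r) for r in rows]     # row mean across wells
    total = sum(x)                              # A on the worksheet
    return x, total, total / len(x)             # data, yearly total, yearly mean


season_rows = [[1.0, 3.0], [2.0, 6.0]]   # hypothetical: 2 seasons, 2 wells
print(group_measure(season_rows, "mean"))
print(group_measure(season_rows, "max"))
```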
-------
WORKSHEET 12 Data Calculations for a Fixed Sample Size Test When Assessing Wells as a Group;
by Chemical
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
CHEMICAL:
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Year          Mean for the
Number        year [11]
k             x̄_k               (x̄_k)²

Total from previous page
(if more than one copy of
Worksheet 12 is necessary)

Column Totals:      A =               B =

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___

Complete WORKSHEET 12 for other chemicals or continue to WORKSHEET 13
-------
WORKSHEET 13 Inference for Fixed Sample Size Tests When Assessing Wells as a Group; by
Chemical
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [8]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

[8] α =
[8] Cs =
Number of Years [9] = m =
Sum of the yearly means [12] = A =
Sum of the squared yearly means [12] = B =
Overall mean concentration = x̄ = A / m =

Standard Deviation of the yearly means = s_x̄ = sqrt((B - A²/m) / (m - 1)) =
Degrees of Freedom for s_x̄ = Df = m - 1 =
Value from table of t-distribution (Appendix A.1)
for specified values of (1 - α) and Df = t =
Standard Error for the overall mean = s_x̄m = s_x̄ / sqrt(m) =
Upper One Sided Confidence Interval = μ_Uα = x̄ + t * s_x̄m =

If μ_Uα < Cs then circle Clean, otherwise circle Contaminated:
Based on the mean concentration, the sampling well is:      Clean      Contaminated

Date Completed:                    Completed by:
Complete WORKSHEET 13 for other chemicals
Page ___ of ___
-------
WORKSHEET 14a Data Calculations for a Sequential Sample When Assessing Wells as a Group; by
Chemical
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [8]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Cleanup standard [8] = Cs =
Alternate mean = μ1 =

Probability of mistakenly declaring the well(s) clean [8] = α =
Probability of mistakenly declaring the well(s) contaminated [8] = β =

Year        Yearly       Cumulative      Cumulative        Mean                 Standard
Number      Average      Sum of x̄_k     Sum of (x̄_k)²    (average of          Error of Mean
[11]        [11]         (A_0 = 0)       (B_0 = 0)         yearly averages)
k or m      x̄_k         A_k             B_k               x̄_m = A_m / m       s_x̄m = sqrt((B_k - A_k²/k) / ((k-1)k))

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___

Complete WORKSHEET 14a and 14b for other chemicals and groups of wells
-------
WORKSHEET 14b Data Calculations for a Sequential Sample When Assessing Wells as a Group; by
Chemical
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards". Volume 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [8]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Year                                    Likelihood    Critical values:                   Decision:
Number                                  ratio                                            clean if LR > B,
[4]       δ = (μ1 - Cs) / s_x̄m    t    LR*           A = β / (1-α)    B = (1-β) / α     contaminated if LR ≤ A,
m                                                                                        or no decision
-------
WORKSHEET 15 Removing Seasonal Patterns in the Data (Use as First Step in Computing Serial
Correlations)
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards". Vol. 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [2 OR 8]
WELL:
     NUMBER(w) AND DESCRIPTION [1]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                                                         Number of
"Season"     Measurements for each "season" for year k                   years with    Row      Row
Number       Yr=__     Yr=__     Yr=__     Yr=__     Yr=__               Data          Total    Mean
j            x_jk      x_jk      x_jk      x_jk      x_jk                                       x̄_j

Corrected measurements with seasonal patterns removed
"Season"     Corrected measurement for each "season" for year k
Number       Yr=__          Yr=__          Yr=__          Yr=__          Yr=__
j            x_jk - x̄_j    x_jk - x̄_j    x_jk - x̄_j    x_jk - x̄_j    x_jk - x̄_j

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___
Complete WORKSHEET 15 for other chemicals
Continue to WORKSHEET 16 if serial correlations are being computed.
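
Removing the seasonal pattern, as laid out on Worksheet 15, amounts to subtracting each season's mean over the years from the measurements taken in that season. The sketch below assumes data arranged by year and season, with `None` marking a missing measurement; the function name `remove_seasonal` and the sample values are hypothetical.

```python
def remove_seasonal(data):
    """data[k][j] is the measurement for year k, season j (None if missing).

    Returns the seasonal (row) means and the residuals x_jk minus the season mean.
    """
    n = len(data[0])   # number of seasons per year
    season_means = []
    for j in range(n):
        values = [year[j] for year in data if year[j] is not None]
        season_means.append(sum(values) / len(values))   # row total / years with data
    residuals = [
        [None if year[j] is None else year[j] - season_means[j] for j in range(n)]
        for year in data
    ]
    return season_means, residuals


# Hypothetical data: 2 years, 2 seasons per year.
means, resid = remove_seasonal([[1.0, 4.0], [3.0, 8.0]])
print(means)
print(resid)
```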
-------
WORKSHEET 16 Calculating Serial Correlations
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [2 OR 8]
WELL:
     NUMBER(w) AND DESCRIPTION [1]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Year = k

Period between well samples in months = t =

Data          jk
Number        (season within       Residual
              year k)              [15]              Product

Totals from previous page =
(if more than one
Worksheet 16 is used)

Column Totals =                                      A =            B =

Estimated Serial Correlation based on the data = φ̂_obs = A / B =

Serial Correlation between monthly observations = φ̂ = (φ̂_obs)^(1/t) =

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___

Complete WORKSHEET 16 for other chemicals
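
The serial correlation calculation can be sketched as follows. This example assumes the common lag-1 estimator, in which the sum of products of successive residuals is divided by the sum of squared residuals; the exact column definitions on the worksheet are not legible in this copy, so that choice, along with the function name `serial_correlation` and the sample residuals, is an assumption. The conversion to a monthly correlation uses the relationship that the correlation at a separation of t months is the monthly correlation raised to the power t.

```python
def serial_correlation(residuals, months_between):
    """Lag-1 serial correlation of time-ordered residuals, and its monthly equivalent."""
    # A: sum of products of each residual with the one before it (assumed estimator).
    a = sum(e1 * e0 for e0, e1 in zip(residuals, residuals[1:]))
    # B: sum of squared residuals.
    b = sum(e * e for e in residuals)
    phi_obs = a / b
    if phi_obs <= 0:
        raise ValueError("monthly conversion needs a positive observed correlation")
    phi_month = phi_obs ** (1 / months_between)
    return phi_obs, phi_month


# Hypothetical residuals from quarterly sampling (3 months between samples).
phi_obs, phi_month = serial_correlation([1.0, 2.0, 1.0, -1.0, -2.0, -1.0], 3)
print(phi_obs, phi_month)
```

The residuals must be in time order for the result to be meaningful.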
-------
WORKSHEET 1R Basic Calculations for a Simple Linear Regression
See Section 6.1 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [2 OR 8]
WELL:
     NUMBER(w) AND DESCRIPTION [1]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Concentration used when no concentration is reported =

Number of collectable samples = N =

             Concentration                     Transformed
Sample       Corrected for                     Time
Number       Detection Limit                   Variable
n            y_n             y_n²              x_n            x_n²        y_n x_n

Totals from previous page(s):

Column Totals:   A =         B =          C =          D =          E =

A = Σy_n    B = Σy_n²    C = Σx_n    D = Σx_n²    E = Σy_n x_n

Corrected Sum of Squares and Cross Products:
ȳ = A/N =                 x̄ = C/N =
Syy = B - A²/N =          Sxx = D - C²/N =          Syx = E - AC/N =

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___

Complete WORKSHEET 1R for other chemicals or continue to WORKSHEET 2R.
-------
WORKSHEET 2R Inference in a Simple Linear Regression
See Section 6.1 in "Methods for Evaluating the Attainment of Cleanup Standards". Vol. 2
SITE:
CHEMICAL:
     NUMBER(c) AND DESCRIPTION [2 OR 8]
WELL:
     NUMBER(w) AND DESCRIPTION [1]
Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Syy [1R] =            Sxx [1R] =            Syx [1R] =
Type I error probability α =

Estimating Regression Coefficients
Number of collectable samples [1R] = N =
Mean of y [1R] = ȳ =
Mean of x [1R] = x̄ =
Estimated slope [1R], b1 = Syx / Sxx =
Estimated intercept [1R], b0 = ȳ - (b1 * x̄) =
Sum of squares due to error [1R], SSE = Syy - (Syx)² / Sxx =
Degrees of freedom, Df = N - 2 =
Critical value from table of t-distribution (Appendix A.1)
   for specified values of (1 - α/2) and Df = t =
Mean Square Error, MSE = SSE / (N - 2) =
Standard Error of the Slope, s(b1) = sqrt(MSE / Sxx) =
Upper Two Sided Confidence Interval for Slope: b1 + t * s(b1) =
Lower Two Sided Confidence Interval for Slope: b1 - t * s(b1) =

Calculating Prediction Limits
Value of x at which concentration is to be predicted =
Predicted value, ŷ = b0 + b1 * x =
Standard Error of Predicted Value,
   s(ŷ) = sqrt(MSE * (1 + 1/N + (x - x̄)² / Sxx)) =
Upper Two Sided Confidence Interval for Prediction = ŷ + t * s(ŷ) =
Lower Two Sided Confidence Interval for Prediction = ŷ - t * s(ŷ) =

Date Completed:                    Completed by:
Use additional sheets if necessary.      Page ___ of ___
Complete WORKSHEET 2R for other chemicals
-------
APPENDIX D: MODELING THE DATA
A model is a mathematical description of the process or phenomenon from
which the data are collected. A model provides a framework for extrapolating from the
measurements obtained during the data collection period to other periods of time and
describing the important characteristics of the data. Perhaps most importantly, a model
serves as a formal description of the assumptions which are being made about the data.
The choice of statistical method used to analyze the data depends on the nature of these
assumptions.

The results of the statistical analysis may be sensitive to the degree to which
the data adhere to the assumptions of the analysis. If the statistical results are quite
insensitive to the validity of a particular assumption, the statistical methods are said to be
"robust" to departures from that assumption. On the other hand, if the results are sensitive
to an assumption so that the results may be substantially incorrect if the assumption does
not hold, the validity of that assumption should be checked before the results of the
analysis are used or given credence.

After steady state conditions have been reached, the model assumed to
describe the ground water data is the equation in Box D.1.

The laboratory measurement, $x_{tcw}$, will be expressed in measurement units
selected by either the lab or the management of the cleanup effort. All terms in the model
equation must have the same units. The samples on which the measurements are made can
be identified by the time and location of collection. In the model above, the location is
indicated by the well identifier w. For wells in which samples are collected at different
depths or by different sampling equipment, a more extensive set of identifiers and
subscripts will be required. If the parameter being tested represents a group of wells (e.g.,
an average concentration in several wells), $x_{tcw}$ represents the combined measure and w
refers to the group of wells.
-------
Box D.1
Modeling the Data

The model assumed to describe ground water data after steady-state
conditions have been reached is:

$$x_{tcw} = \mu_{cw} + S_{u(t)cw} + z_{tcw} + \varepsilon_{tcw} \qquad \text{(D.1)}$$

where
$x_{tcw}$ = lab measurement of chemical c for the sample collected at
time t for well w.

$\mu_{cw}$ = long-term (or short-term) average concentration for chemical
c in well w.

$S_{u(t)cw}$ = a seasonal pattern in the data for concentration of chemical c
in well w, assumed to repeat on a regular cycle. The
subscript u(t) designates the point in time within the cycle
when the sample was collected. In most situations the term
$S_{u(t)cw}$ will correspond to a yearly cycle associated with
yearly patterns in temperature and precipitation.

$z_{tcw}$ = serially correlated normal error following an auto-regressive
model of order one (Box and Jenkins, 1970). (Note:
seasonal auto-correlations are assumed to be negligible after
the seasonal cycles ($S_{u(t)cw}$) have been removed.) The
correlation, $\rho$, between two measurements separated by time
t (in months) is assumed to be $\rho = R^t$, where R is the
correlation for measurements separated by one month.

$\varepsilon_{tcw}$ = independent normal errors.
This model for the data assumes that the average level of contamination is
constant over the period of concern (either a short or very long period). However, the
actual measurements may fluctuate around that level due to seasonal differences, lab
measurement errors, or serially correlated fluctuations (described below). The purpose of
the statistical test is to decide if there is sufficient evidence to conclude that $\mu_{cw}$ is less than
the cleanup standard in the presence of this variability.
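
To see how the components of the model combine, the series can be simulated. The following is a rough sketch, not a procedure from this guidance: the function name `simulate_model`, its parameters, and all numeric values are hypothetical, and the serially correlated term is generated as a first-order autoregressive process whose innovation variance is scaled to keep the variance of z constant over time.

```python
import math
import random


def simulate_model(mu, seasonal, r_month, sd_z, sd_e, n_obs, months_between=1, seed=0):
    """Simulate x = mu + S + z + e for one chemical and well.

    seasonal: list of seasonal effects, repeated cyclically.
    r_month:  correlation R between values of z one month apart.
    """
    rng = random.Random(seed)
    rho = r_month ** months_between          # correlation between successive samples
    z = rng.gauss(0.0, sd_z)                 # starting value of the correlated error
    series = []
    for t in range(n_obs):
        if t > 0:
            z = rho * z + rng.gauss(0.0, sd_z * math.sqrt(1.0 - rho * rho))
        s = seasonal[t % len(seasonal)]      # seasonal effect for this point in the cycle
        e = rng.gauss(0.0, sd_e)             # independent measurement error
        series.append(mu + s + z + e)
    return series


# Quarterly sampling for three years with hypothetical parameter values.
print(simulate_model(10.0, [1.0, 0.5, -1.0, -0.5], 0.8, 1.0, 0.5, 12, months_between=3))
```

Setting both standard deviations to zero leaves only the mean and the seasonal pattern, which is a convenient way to check the bookkeeping.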
-------

Because the primary cyclical force affecting the ground water system is
climatic, in most situations the seasonal term will have a period of one year. In some
climates there are two rainy seasons and two dry seasons, possibly resulting in a seasonal
pattern of a half year. The connection between the seasonal pattern in the ground water
concentrations and the climatic changes may be complex such that both patterns may have
the same period; however, the shape of the patterns, the relative times of maximum rainfall
or the maximum or minimum concentration may differ.

Ground water concentrations at points close together in time or space are
likely to be more similar than observations taken far apart in time or space. There are
several physical reasons why this may be the case. In statistical terms, observations taken
close together are said to be more correlated than observations taken far apart.

The serial correlation of observations separated by a time difference of t can
be denoted by $\rho(t)$, where $\rho$ is the Greek letter rho. A plot of the serial correlation
between two observations versus the time separating the two observations is called an
autocorrelation function. The model above assumes that the autocorrelation function has the
shape shown in Figure D.1, which is described by the equation in Box D.2.
Box D.2
Autocorrelation Function
$$\rho(t) = R^t \qquad \text{(D.2)}$$
where R is the serial correlation for measurements separated by a month,
and t is the time between observations in months.
If the serial correlation of the measurements is zero, the data behave as if
they were collected randomly. As the correlation increases, the similarity of measurements
taken close together relative to all other measurements becomes more pronounced. Figure
D.2 shows simulated data with serial correlations of 0.0, 0.4 and 0.8. Serial correlations
are always between -1 and 1. However, for most environmental data, serial correlations are
usually between 0 and 1, indicating that measurements taken close together in time will be
more alike than measurements taken far apart.
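
A few values of the assumed autocorrelation function show how quickly the correlation decays with separation. This is a minimal illustration; the function name `autocorrelation` and the chosen values of R are assumptions made for the example.

```python
def autocorrelation(r_month, t_months):
    """Correlation between observations t_months apart when the monthly correlation is r_month."""
    return r_month ** t_months


for r in (0.4, 0.8):
    print(r, [round(autocorrelation(r, t), 3) for t in (1, 3, 6, 12)])
```

With a monthly correlation of 0.8, quarterly samples (t = 3) still have a correlation of about 0.51, while samples a year apart are nearly uncorrelated.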
-------
Figure D.1 Theoretical Autocorrelation Function Assumed in the Model of the Ground
Water Data
[Figure: autocorrelation plotted against the time between observations]
Many common statistical procedures will provide incorrect conclusions if an
existing correlation in the data is not properly accounted for. For example, the variability in
the data may be inappropriately estimated. Proper selection of a simple random sample for
estimating the mean guarantees that the errors are uncorrelated. However, when using a
systematic sample (such as for ground water samples collected at regular intervals), the
formulae based on a random sample provide a good estimate of the standard error of the
mean only if there is no serial correlation. With serial correlation, a correction term is
required. For the autocorrelation function assumed above, the correction term increases the
standard error of the long-term mean and decreases it for the short-term mean.

The autocorrelation function can have many different shapes; however, in
general, correlations will decrease as the time between observations increases. If the
samples are taken farther apart in time, the correction becomes less important.

The error term, $\varepsilon_{tcw}$, represents errors resulting from lab measurement
error and other factors associated with the environment being sampled and the sample
handling procedures.
-------
Figure D.2
Examples of Data with Serial Correlations of 0, 0.4, and 0.8. The higher
the serial correlation the more the distribution dampens out
[Figure: three panels of simulated data plotted against time, titled Serial Correlation = 0,
Serial Correlation = 0.4, and Serial Correlation = 0.8]
-------
Different models may be used to describe the data collected during the
treatment phase and the post-treatment assessment phase because either (1) the
characteristics of the data will be different, or (2) different information about the measured
concentrations is of interest. The statistical procedures discussed in Chapter 6 to be used
during treatment are therefore different from those discussed in Chapters 8 and 9 for
assessing attainment of the cleanup standards.

There are two terms which have been excluded from the model above and
could be used to model ground water concentrations in some situations. These are a slope
(or trend) term and a spatial correlation term.

In many situations it is reasonable to assume that the general level of
contamination is either gradually decreasing or gradually increasing. It may be desirable to
assume a functional form for this change in concentration. For example, the concentration
may be considered to be decreasing linearly or exponentially. A revised model with a linear
trend term is presented in Box D.3.

If the slope is not zero, as in the model in Box D.3, then the ground water is
not at steady state. If the slope is positive, the concentrations are increasing over time. If
the slope is negative, the concentrations are decreasing over time. If concentrations are
below the cleanup standard and are increasing over time, the ground water may be judged
to attain the cleanup standard; however the cleanup standard may not be attained in the
future as concentrations increase. Therefore, the ground water in the sampled wells will be
judged to attain the cleanup standard only if (1) the selected parameter is significantly less
than the cleanup standard, and (2) the concentrations are not increasing. These decision
criteria are presented in Table D.1.

The model in Box D.3 does not include spatial correlation. In this
guidance, it is assumed that the results from different wells (or different depths in the same
well) are combined using criteria developed based on expert knowledge of the site rather
than by fitting statistical models. For this reason a spatial correlation has not been
included.
-------
Box D.3
Revised Model for Ground Water Data

A revised model with a linear trend term would be:

$$x_{tcw} = \alpha_{cw} + \beta_{cw} t + S_{u(t)cw} + z_{tcw} + \varepsilon_{tcw} \qquad \text{(D.3)}$$

where
$\beta_{cw}$ = the change in concentration over time for measurements of
chemical c in well w.

$\alpha_{cw}$ = the concentration of chemical c in well w at time zero, usually at
the beginning of sampling. Note that $\alpha_{cw} = \mu_{cw}$ if $\beta_{cw} = 0$.
Table D.1 Decision criteria for determining whether the ground water concentrations
attain the cleanup standard

                                      Test for significant slope $\beta_{cw}$ (Equation D.3)
Test for parameter (mean or
percentile) less than the cleanup     $\beta_{cw}$ significantly      $\beta_{cw}$ not significantly
standard (Equation D.2)               greater than zero               greater than zero
------------------------------------  ------------------------------  ------------------------------
Parameter is significantly less       Ground water is                 Ground water from the
than the cleanup standard             contaminated                    tested wells attains the
                                                                      cleanup standard
Parameter is not significantly less   Ground water is                 Ground water is
than the cleanup standard             contaminated                    contaminated
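
The two-test rule in Table D.1 reduces to a single logical condition. The following Python sketch is illustrative only and is not part of the guidance: the function and argument names are hypothetical, and the two significance tests are assumed to have been carried out separately.

```python
def attains_cleanup_standard(parameter_significantly_below: bool,
                             slope_significantly_positive: bool) -> bool:
    """Apply the Table D.1 decision rule (hypothetical helper).

    parameter_significantly_below: result of the test that the mean or
        percentile is significantly less than the cleanup standard.
    slope_significantly_positive: result of the test that the trend
        slope is significantly greater than zero.
    """
    # Attainment requires BOTH a parameter below the standard
    # AND no evidence that concentrations are increasing.
    return parameter_significantly_below and not slope_significantly_positive


# Enumerate the four cells of the table.
for below in (True, False):
    for rising in (True, False):
        verdict = "attains" if attains_cleanup_standard(below, rising) else "contaminated"
        print(f"below={below!s:5} rising={rising!s:5} -> {verdict}")
```

Only one of the four combinations leads to an attainment decision, which mirrors the conservative intent of the table.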
APPENDIX E: CALCULATING RESIDUALS AND SERIAL CORRELATIONS
USING SAS¹
Several statistical programs can be used to make the calculations outlined in this
guidance document. Although these programs can be used to perform the required calculations,
they were not specifically designed for the application addressed in this document. Therefore,
they can only be used as a partial aid for the procedures presented here. Only one of the many
available statistical packages, SAS, will be discussed below in the example. This example makes
no attempt to thoroughly introduce the SAS system, and no endorsement of SAS is implied. Help
from a statistician or programmer familiar with any software being used is strongly recommended.

The basic quantities discussed in Sections 5.2.3 and 5.2.4 can be calculated
using one of several statistical procedures available in SAS. Among them are PROC GLM, PROC
ANOVA, and PROC REG (see SAS User's Guide: Statistics, SAS Institute, 1985). All of these
procedures require specifying a linear model and requesting certain options in the MODEL
statement. A SAS data set containing the data to be used in the analysis should first be created (see
SAS User's Guide: Basics, SAS Institute, 1985). In the data set, the observations should be listed
or sorted in time order; otherwise the calculated serial correlations will be meaningless.

Given below is an example of a SAS program using PROC REG that will subtract
seasonal means from the observed concentration measurements and calculate the required first
order serial correlation of the residuals.

PROC REG DATA = CHEM1;
   MODEL CONC = SEAS1 SEAS2 SEAS3 SEAS4 / NOINT DW;
RUN;

In the program, CHEM1 is the SAS data set containing the following variables:
CONC, the concentration measurement of the ground water sample; TIME, a sequence number
indicating the time at which the sample was drawn; YEAR, the year the sample was drawn; and
PER, the period within the year in which the sample was drawn. For this illustration, data were
collected quarterly so that PER = 1, 2, 3, or 4. The variables SEAS1 through SEAS4 are indicator
variables defined at a previous DATA step. For each observation, these indicator variables are
defined as follows: SEAS1 = 1 if PER = 1, and is 0 otherwise; SEAS2 = 1 if PER = 2, and is 0
otherwise; SEAS3 = 1 if PER = 3, and is 0 otherwise; and SEAS4 = 1 if PER = 4, and is 0
otherwise. Creation of these indicator or "dummy" variables is required if PROC REG is used.
On the other hand, dummy variables are not required for PROC ANOVA or PROC GLM. Note
that in this example, the variable TIME is not included as an independent variable in the model.

¹ Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

The MODEL statement specifies the form of the linear model to be fitted. In the
example, CONC is the dependent variable and SEAS1 through SEAS4 are the independent
variables. The reason for specifying this particular model is to have the seasonal means subtracted
from the observed concentrations. NOINT is an option that specifies that a "no-intercept model" is
to be estimated. Other models can also be used to produce the required residuals, but they will not
be discussed here. Finally, DW is the "Durbin-Watson" option, which requests that the Durbin-
Watson test (see Section 5.6.1) and the serial correlation of the residuals be calculated. The output
from the above computer run will look like:
DEP VARIABLE: CONC

                       SUM OF        MEAN
SOURCE       DF       SQUARES      SQUARE      F VALUE     PROB>F
MODEL         4       580.455     145.114     1051.355      0.000
ERROR        12         1.656       0.138

ROOT MSE     0.3715      R-SQUARE    0.997
DEP MEAN     5.995       ADJ R-SQ    0.996
C.V.         6.197

                PARAMETER    STANDARD      T FOR H0:
VARIABLE   DF    ESTIMATE       ERROR    PARAMETER=0    PROB>|T|
SEAS1       1       6.778       0.186         36.490       0.000
SEAS2       1       6.025       0.186         36.490       0.000
SEAS3       1       5.134       0.186         36.490       0.000
SEAS4       1       6.042       0.186         36.490       0.000

DURBIN-WATSON D               2.280
1ST ORDER AUTOCORRELATION     -.184
The first part of the output (identified by the headings SOURCE, DF, SUM OF
SQUARES, etc.) is referred to as the "analysis of variance table." In the "MEAN SQUARE"
column of the table, in the row titled "ERROR," is the mean square error, $s_e^2$. In the
example output, $s_e^2 = 0.138$.

The second part of the output gives the "PARAMETER ESTIMATES" for each of
the four indicator variables, SEAS1 to SEAS4. Because of the way these variables were defined,
the parameter estimates are actually the seasonal means, $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$, and $\bar{x}_4$, respectively. These
seasonal means are used to calculate the residuals, $e_t$, as defined in equation (5.8). The last line of
the output shows the serial correlation of the residuals as computed from equation (5.14), viz.,
$\hat{\phi}_{obs} = -.184$. From Neter, Wasserman, and Kutner (1985), $d_U = 1.73$ for $N = 16$ (16
observations) and $p - 1 = 3$ (where p is the number of variables in the model). Since $D = 2.28 > 1.73$,
it can be assumed that there is no autocorrelation in the error terms of the model.
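
The same quantities can be cross-checked outside SAS. The following Python sketch is not part of the original guidance. It assumes that the serial correlation of equation (5.14) is the usual lag 1 estimator (the sum of products of successive residuals divided by the sum of squared residuals), and it uses made-up quarterly concentrations; all function names are hypothetical.

```python
def seasonal_means(conc, period):
    """Mean concentration for each period (season) label."""
    totals, counts = {}, {}
    for x, p in zip(conc, period):
        totals[p] = totals.get(p, 0.0) + x
        counts[p] = counts.get(p, 0) + 1
    return {p: totals[p] / counts[p] for p in totals}


def seasonal_residuals(conc, period):
    """Observed value minus its seasonal mean, kept in time order."""
    means = seasonal_means(conc, period)
    return [x - means[p] for x, p in zip(conc, period)]


def durbin_watson(resid):
    """Sum of squared successive differences over sum of squares."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)


def lag1_autocorrelation(resid):
    """Assumed form of the lag 1 serial correlation estimator."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)


# Hypothetical data: two years of quarterly samples, already in time order.
conc = [6.9, 6.1, 5.0, 6.2, 6.7, 5.9, 5.3, 5.9]
period = [1, 2, 3, 4, 1, 2, 3, 4]

resid = seasonal_residuals(conc, period)
print("seasonal means:", seasonal_means(conc, period))
print("Durbin-Watson D:", round(durbin_watson(resid), 3))
print("lag 1 serial correlation:", round(lag1_autocorrelation(resid), 3))
```

As in the SAS run, the observations must be in time order before the residuals are passed to the two statistics; otherwise the results are meaningless.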

As mentioned earlier, PROC GLM or PROC ANOVA can also be used to compute
the required statistical quantities. The interested reader should refer to the SAS users manual for
more information.
APPENDIX F: DERIVATIONS AND EQUATIONS
This appendix provides background for several equations presented in the
document. This background is provided only for equations which cannot be easily verified
in a standard statistical text. A simulation study provides the background for the sequential
tests presented in Chapter 9. The simulation study was supported by Westat. The last
section of this appendix incorporates a technical paper prepared for publication which
summarizes the simulations.
F.1 Derivation of Tables A.4 and A.5

This section outlines the derivation of Table A.4 for determining a
recommended number of samples to take per year and Table A.5 for obtaining variance
factors for use in determining sample size. Table A.4 is based on the assumption that the
number of samples per year will be chosen to minimize the total sampling costs while still
achieving the desired precision. The assumptions on which the derivation is based are
explained below. The values in Table A.5 follow directly from the calculations used to
obtain Table A.4.

For a fixed sample size test, the cost of the sampling program can be
approximated by:
C = E + (Y + nS)m (F.1)
where
C = the total cost of the sampling program;
E = the cost to establish the sampling program;
Y = the yearly cost to maintain the program;
S = the incremental cost to collect each sample;
n = the number of samples per year; and
m = the number of years of sampling.

This can also be written as:

C = E + S(R+n)m (F.2)
where R = Y/S. Since E and S are constants, the total sampling cost can be minimized by
minimizing (R + n)m subject to the constraint that the choices of n and m achieve the
desired precision. The total number of samples collected is:

N = nm (F.3)
Consider the hypothesis test where a mean is being compared to a standard
and assume that 1) the measurements are independent and 2) a normal approximation can
be used. Then the following equation can be used to determine the required sample size:

$$N_{\text{eff}} = \sigma^2 \left[ \frac{z_{1-\alpha} + z_{1-\beta}}{Cs - \mu_1} \right]^2 \qquad \text{(F.4)}$$

where:

$\sigma^2$ = variance of the individual measurements;

$Cs$ = the cleanup standard to which the mean is being compared;

$\mu_1$ = the concentration on which the alternate hypothesis and $\beta$ are based;

$\alpha$ = the probability of a false positive decision if the true mean is $Cs$;

$\beta$ = the probability of a false negative decision if the true mean is $\mu_1$;

$z_{1-\alpha}$ = the $1-\alpha$ percentile point of the normal distribution; and

$N_{\text{eff}}$ = the required number of independent observations.

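Noting that $\sqrt{\sigma^2 / N_{\text{eff}}}$ is the standard error of the mean based on independent
measurements, equation (F.4) can be rewritten as:

$$\sigma_{\bar{x}_{nm}} = \frac{Cs - \mu_1}{z_{1-\alpha} + z_{1-\beta}} \qquad \text{(F.5)}$$

where $\sigma_{\bar{x}_{nm}}$ is the standard error of the mean of the $nm$ measurements.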

The problem is to select the combination of n and m such that equation (F.5)
is satisfied and the sampling costs are minimized.

The values of n and m which satisfy equation (F.5) depend only slightly on
the values of $\alpha$, $\beta$, $Cs$, $\mu_1$, and $\sigma^2$. For the purposes of estimating the values in Tables A.4
and A.5, the following assumptions were used: $\alpha = .10$, $\beta = .10$, $Cs = 1$, $\mu_1 = .5$, and
$\sigma^2 = 1.0$, resulting in $N_{\text{eff}} = 26.3$.
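
The value of 26.3 can be reproduced from the normal-approximation sample size formula, in which the variance is multiplied by the squared ratio of the sum of the two normal percentile points to the difference between the cleanup standard and the alternate mean. The Python sketch below is an illustrative check, not part of the original derivation; the function name is hypothetical.

```python
from statistics import NormalDist


def required_independent_samples(alpha, beta, cs, mu1, sigma2):
    """Normal-approximation sample size for comparing a mean to a standard.

    alpha, beta : false positive and false negative rates
    cs          : cleanup standard
    mu1         : mean under the alternate hypothesis
    sigma2      : variance of the individual measurements
    """
    z = NormalDist()  # standard normal distribution
    z_sum = z.inv_cdf(1.0 - alpha) + z.inv_cdf(1.0 - beta)
    return sigma2 * (z_sum / (cs - mu1)) ** 2


n_eff = required_independent_samples(alpha=0.10, beta=0.10, cs=1.0, mu1=0.5, sigma2=1.0)
print(round(n_eff, 1))  # prints 26.3
```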

The following equation (derived in Section F.2) can be used for $N_{\text{eff}}$ for the
mean of n observations per year collected over m years with a lag 1 serial correlation of $\phi$:

$$N_{\text{eff}} = \frac{N}{\dfrac{1+\phi}{1-\phi}\left[1 - \dfrac{2\phi\,(1-\phi^{N})}{N(1-\phi^{2})}\right]} \qquad \text{(F.6)}$$

where $N = nm$.

Note that the serial correlation in equation (F.6) is the serial correlation
between successive observations. As the number of observations per year changes, $\phi$ will
also change. If $\Phi$ is the serial correlation between monthly observations, then $\phi = \Phi^{12/n}$.

The values in Tables A.4 and A.5 were calculated using the following
procedures:
(1) For selected values of $\Phi$ and n, calculate $\phi$ and use a successive
approximation procedure to determine m such that the criteria in
equation (F.6) are met.
(2) The values in Table A.5 are $N_{\text{eff}}/m$, or the effective number of samples
per year.
(3) For each calculation in step (1) and for selected values of R,
calculate the sampling cost using equation (F.2).

(4) Using all the sampling costs calculated for the selected values of $\Phi$,
n, and R, determine the value of n which has the minimum sampling
cost. Show this value in Table A.4.
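
The four steps above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the program used to produce Tables A.4 and A.5: it assumes the closed form for the effective sample size of an AR(1) series derived in Section F.2, assumes the correlation between successive samples is the monthly correlation raised to the power 12/n, restricts m to whole years, and searches a hypothetical list of candidate sampling frequencies. All function names are hypothetical.

```python
def effective_sample_size(n_total, phi):
    """Effective number of independent samples among n_total AR(1) samples."""
    lead = (1.0 + phi) / (1.0 - phi)
    correction = 1.0 - 2.0 * phi * (1.0 - phi ** n_total) / (n_total * (1.0 - phi ** 2))
    return n_total / (lead * correction)


def years_needed(n_per_year, monthly_phi, target_neff=26.3, max_years=500):
    """Smallest whole number of years m giving at least target_neff."""
    phi = monthly_phi ** (12.0 / n_per_year)  # correlation between successive samples
    for m in range(1, max_years + 1):
        if effective_sample_size(n_per_year * m, phi) >= target_neff:
            return m
    return None  # precision not reachable within max_years


def cheapest_design(monthly_phi, r, candidates=(1, 2, 3, 4, 6, 12)):
    """Return (n, m, relative cost) minimizing (R + n) * m over the candidates."""
    best = None
    for n in candidates:
        m = years_needed(n, monthly_phi)
        if m is None:
            continue
        relative_cost = (r + n) * m  # total cost is E + S * relative_cost
        if best is None or relative_cost < best[2]:
            best = (n, m, relative_cost)
    return best


# Hypothetical inputs: monthly serial correlation 0.3 and cost ratio R = 5.
print(cheapest_design(monthly_phi=0.3, r=5.0))
```

Because E and S are constants, only the relative cost (R + n)m needs to be compared across designs. A larger cost ratio R favors more samples per year over more years of sampling.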

F.2 Derivation of Equation (F.6)

A series of periodic ground water measurements following an autoregressive
(AR(1)) process can be described by the following equation (see Box and
Jenkins (1970) for details):

$$x_t = \mu + \phi z_{t-1} + a_t = \mu + z_t \qquad \text{(F.7)}$$

where:
$x_t$ = the measurement at time t;

$\mu$ = the long-term (attainment) mean concentration;

$\phi$ = the serial correlation between successive measurements;
$a_t$ = a random change from the measurement at time t-1 to time t such that
$z_t - \phi z_{t-1} = a_t$. The $a_t$ are assumed to be independent and have a
mean of zero and a variance of $\epsilon^2$; and
$z_t$ = the difference between the mean being estimated and the measurement
at time t. The values $z_t$ will have a mean of zero.

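The mean of N successive observations is

$$\bar{x} = \frac{1}{N}\sum_{k=0}^{N-1} x_{t-k} = \mu + \frac{1}{N}\sum_{k=0}^{N-1} z_{t-k} \qquad \text{(F.8)}$$

The variances of $z_t$ and $\bar{z}$ are derived below. Note that the variances of $x_t$ and
$z_t$ are the same, written $V(x_t) = V(z_t)$; also, $V(\bar{x}) = V(\bar{z})$.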

The following relationships are used in the derivation of the variance:

$$\frac{1}{1-\phi} = 1 + \phi + \phi^2 + \phi^3 + \dots \qquad \text{(F.9)}$$

and

$$\frac{1-\phi^{N}}{1-\phi} = 1 + \phi + \phi^2 + \dots + \phi^{N-1} \qquad \text{(F.10)}$$

F.2.1 Variance of zt

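The variance of $z_t$ is:

$$V(z_t) = E\left[z_t^2 - E[z_t]^2\right] \qquad \text{(F.11)}$$

Here E[ ] indicates the expected value of the term inside the brackets.

Since $E[z_t]$ is zero, the variance can be written as:

$$V(z_t) = E[z_t^2] \qquad \text{(F.12)}$$

$$= E\left[\left(\sum_{i=0}^{\infty} \phi^{i} a_{t-i}\right)^2\right] \qquad \text{(F.13)}$$

$$= E\left[\sum_{i=0}^{\infty} \phi^{2i} a_{t-i}^2\right]$$

Since the expected values of all the cross product terms are zero (i.e.,
$E[a_t a_{t-i}] = 0$ for $i \neq 0$), they have been dropped from the summation.
Since $E[a_{t-i}^2] = \epsilon^2$,

$$V(z_t) = \epsilon^2\left(1 + \phi^2 + \phi^4 + \dots\right) \qquad \text{(F.17)}$$

Using equation (F.9):

$$V(z_t) = \frac{\epsilon^2}{1-\phi^2} \qquad \text{(F.18)}$$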
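F.2.2 Variance of $\bar{z}$

Note that $\bar{z}$ can be expressed as

$$\bar{z} = \frac{1}{N}\sum_{k=0}^{N-1} z_{t-k} = \frac{1}{N}\sum_{k=0}^{N-1}\sum_{i=0}^{\infty} \phi^{i} a_{t-k-i} \qquad \text{(F.19)}$$

$$= \sum_{j=0}^{N-1} \frac{1-\phi^{j+1}}{N(1-\phi)}\,a_{t-j} + \sum_{j=N}^{\infty} \frac{(1-\phi^{N})\,\phi^{j-N+1}}{N(1-\phi)}\,a_{t-j} \qquad \text{(F.20)}$$

This last relationship is illustrated in Table F.1 for the case where N = 3.
The variance of $\bar{z}$ is:

$$V(\bar{z}) = E\left[\bar{z}^2 - E[\bar{z}]^2\right] \qquad \text{(F.21)}$$

Since $E[\bar{z}]$ is zero, the variance can be written as:

$$V(\bar{z}) = E[\bar{z}^2] \qquad \text{(F.22)}$$

$$= E\left[\left(\sum_{j=0}^{N-1} \frac{1-\phi^{j+1}}{N(1-\phi)}\,a_{t-j} + \sum_{j=N}^{\infty} \frac{(1-\phi^{N})\,\phi^{j-N+1}}{N(1-\phi)}\,a_{t-j}\right)^2\right] \qquad \text{(F.23)}$$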
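Table F.1 Coefficients for the terms $a_t$, $a_{t-1}$, etc., in the sum of three successive
correlated observations

                                                Term
Observation   a_t    a_{t-1}            a_{t-2}            a_{t-3}            a_{t-4}            a_{t-5}
-----------   ----   ----------------   ----------------   ----------------   ----------------   ----------------
z_t           1      φ                  φ²                 φ³                 φ⁴                 φ⁵
z_{t-1}              1                  φ                  φ²                 φ³                 φ⁴
z_{t-2}                                 1                  φ                  φ²                 φ³
z̄             1/3    (1-φ²)/(3(1-φ))    (1-φ³)/(3(1-φ))    (1-φ³)φ/(3(1-φ))   (1-φ³)φ²/(3(1-φ))  (1-φ³)φ³/(3(1-φ))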
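Since the expected values of all the cross product terms are zero (i.e.,
$E[a_t a_{t-i}] = 0$ for $i \neq 0$), they have been dropped from the summation.
Since $E[a_{t-j}^2] = \epsilon^2$,

$$V(\bar{z}) = \frac{\epsilon^2}{N^2(1-\phi)^2}\left[\sum_{j=0}^{N-1}\left(1-\phi^{j+1}\right)^2 + \left(1-\phi^{N}\right)^2 \sum_{j=1}^{\infty} \phi^{2j}\right] \qquad \text{(F.27)}$$

Using equations (F.9) and (F.10):

$$V(\bar{z}) = \frac{\epsilon^2}{N^2(1-\phi)^2}\left[N - \frac{2\phi\,(1-\phi^{N})}{1-\phi} + \frac{\phi^2(1-\phi^{2N})}{1-\phi^2} + \frac{\phi^2(1-\phi^{N})^2}{1-\phi^2}\right] \qquad \text{(F.28)}$$

This can be simplified to:

$$V(\bar{z}) = \frac{\epsilon^2}{N(1-\phi)^2}\left[1 - \frac{2\phi\,(1-\phi^{N})}{N(1-\phi^2)}\right] \qquad \text{(F.29)}$$

Combining equations (F.5), (F.18), and (F.29):

$$N_{\text{eff}} = \frac{V(z_t)}{V(\bar{z})} = \frac{N}{\dfrac{1+\phi}{1-\phi}\left[1 - \dfrac{2\phi\,(1-\phi^{N})}{N(1-\phi^2)}\right]} \qquad \text{(F.30)}$$

Note that the denominator in equation (F.30) has the term $\frac{1+\phi}{1-\phi}$ multiplied
by a "correction term" which is usually close to 1.0 and approaches 1.0 as the sample size
increases.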
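
A quick numerical check of this result is possible because, for an AR(1) series, the correlation between observations k steps apart is the lag 1 correlation raised to the power k. The Python sketch below is illustrative and not part of the original derivation. It assumes the closed form is N divided by the product of (1+phi)/(1-phi) and the correction term 1 - 2*phi*(1-phi^N)/(N*(1-phi^2)), and compares it with a direct sum over the correlation matrix; the function names are hypothetical.

```python
def neff_closed_form(n, phi):
    """Effective sample size from the assumed closed-form expression."""
    lead = (1.0 + phi) / (1.0 - phi)
    correction = 1.0 - 2.0 * phi * (1.0 - phi ** n) / (n * (1.0 - phi ** 2))
    return n / (lead * correction)


def neff_direct(n, phi):
    """Effective sample size from the full AR(1) correlation matrix.

    V(mean) / V(single observation) = (1 / n^2) * sum over i, j of phi^|i - j|,
    so the effective sample size is n^2 divided by that double sum.
    """
    total = sum(phi ** abs(i - j) for i in range(n) for j in range(n))
    return n * n / total


for n, phi in [(3, 0.5), (12, 0.3), (48, 0.8)]:
    closed, direct = neff_closed_form(n, phi), neff_direct(n, phi)
    correction = 1.0 - 2.0 * phi * (1.0 - phi ** n) / (n * (1.0 - phi ** 2))
    print(n, phi, round(closed, 4), round(direct, 4), round(correction, 4))
```

The last printed column is the correction term, which moves toward 1.0 as the number of observations grows.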
F.3 Derivation of the Sample Size Equation

When the variance is known, the sample size for a hypothesis test of the
mean is shown in equation (F.4). When the variance, $\sigma^2$, is to be estimated from the data,
use of the t statistic is recommended, as shown below, where $\hat{\sigma}^2$ is the estimate of $\sigma^2$:

$$N = \hat{\sigma}^2 \left[\frac{t_{N-1;1-\beta} + t_{N-1;1-\alpha}}{Cs - \mu_1}\right]^2 \qquad \text{(F.31)}$$

To use this equation, the recommended procedure is to substitute the normal
statistic for the t statistic (e.g., $z_{1-\beta}$ for $t_{N-1;1-\beta}$), calculate a preliminary sample size from
which the degrees of freedom can be estimated, and use this to determine t and a new
estimate of the sample size. For small sample sizes, a third or fourth estimate of the sample
size may be required.

Using equation (F.31), the exact sample size satisfies the following equation:

$$\text{Sample size}(t) = N_t = \hat{\sigma}^2 \left[\frac{t_{N_t-1;1-\beta} + t_{N_t-1;1-\alpha}}{Cs - \mu_1}\right]^2 \qquad \text{(F.32)}$$

Using the conditions which satisfy equation (F.32), the calculated sample
size using (F.4) would be:

$$\text{Sample size}(z) = N_z = \hat{\sigma}^2 \left[\frac{z_{1-\beta} + z_{1-\alpha}}{Cs - \mu_1}\right]^2 \qquad \text{(F.33)}$$

The difference between these two sample size estimates where $\alpha = .10$ and
$\beta = .10$ is shown in Figure F.1.
Figure F.1 Differences in Sample Size Using Equations Based on a Normal Distribution
(Known Variance) or a t Statistic, Assuming $\alpha = .10$ and $\beta = .10$
[Figure: Sample size (t) minus Sample size (z), plotted against Sample size (t).]
Note that the difference in the sample sizes using equations (F.4) and
(F.31) is fairly constant over a wide range of possible sample sizes. This property can be
used to estimate the sample size based on equation (F.31) from equation (F.4). Thus:

$$N = \hat{\sigma}^2 \left[\frac{z_{1-\beta} + z_{1-\alpha}}{Cs - \mu_1}\right]^2 + K \qquad \text{(F.34)}$$

where K is a constant which will depend on $\alpha$ and $\beta$. Table F.2 tabulates K at a sample
size of 20, for selected values of $\alpha$ and $\beta$.

The equations for sample size in the text use equation (F.34) with K = 2.
Table F.2 Differences between the calculated sample sizes using a t distribution and a
normal distribution when the sample size based on the t distribution is 20,
for selected values of $\alpha$ (Alpha) and $\beta$ (Beta)

                      Alpha
Beta     .25     .10     .05    .025     .01
.25      0.8     1.2     1.6     2.1     2.7
.10      1.2     1.4     1.7     2.0     2.6
.05      1.6     1.7     1.9     2.2     2.7
.025     2.1     2.0     2.2     2.5     2.9
.01      2.7     2.6     2.7     2.9     3.2
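
The approximation with K = 2 avoids the iterative lookup of t percentiles. The following Python sketch is illustrative only. It assumes that the approximation is the normal-based sample size plus the constant K, and the function name and example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def approximate_t_sample_size(alpha, beta, cs, mu1, sigma2_hat, k=2.0):
    """Normal-based sample size plus the additive correction K.

    sigma2_hat is the estimated variance of the measurements; K = 2 is
    the value the text says is used in the sample size equations.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1.0 - beta) + z.inv_cdf(1.0 - alpha)
    return sigma2_hat * (z_sum / (cs - mu1)) ** 2 + k


# Hypothetical example: alpha = beta = .10, Cs = 1, mu1 = .5, variance estimate 1.0.
n = approximate_t_sample_size(0.10, 0.10, 1.0, 0.5, 1.0)
print(math.ceil(n))  # round up to a whole number of samples
```

For $\alpha = \beta = .10$, Table F.2 gives a difference of 1.4 at a sample size of 20, so K = 2 errs slightly on the side of collecting more samples.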
F.4 Effective Df for the Mean from an AR1 Process
The following formula is appropriate for estimating the variance of the mean
of n observations from an AR1 series, assuming a large sample size:

$$s^2_{mean} = \frac{s^2}{n}\cdot\frac{1+\hat{\phi}}{1-\hat{\phi}} \qquad \text{(F.35)}$$

If the serial correlation is assumed to be zero, then $s^2$, the estimated variance
of the data, has a scaled chi-square distribution with n-1 degrees of freedom. The mean of
a chi-square distribution is $\nu$, the degrees of freedom, with a variance of $2\nu$. Thus, the
coefficient of variation squared is $cv^2 = \frac{2\nu}{\nu^2} = \frac{2}{\nu}$.

With zero serial correlation, $\hat{\phi}$ will have a mean of zero and variance of $\frac{1}{n}$
(Box and Jenkins, 1970). The term $\frac{1+\hat{\phi}}{1-\hat{\phi}} \approx 1 + 2\hat{\phi}$ (for small $\hat{\phi}$) has a mean of roughly
1 and a variance of approximately $\frac{4}{n}$. The $cv^2$ is also approximately $\frac{4}{n}$ since the mean $\approx 1$.

Assuming a large sample size, the cv of the product of two estimates is
equal to the square root of the sum of the squares of the cv's for each term if the terms are
independent (which will be true if the serial correlation is zero). Thus, the $cv^2$ of $s^2_{mean}$ is
roughly the sum of two $cv^2$'s: 1) the chi-square distribution, and 2) the correction term
based on $\hat{\phi}$. Thus

$$cv^2\left(s^2_{mean}\right) \approx \frac{2}{n-1} + \frac{4}{n} \approx \frac{6}{n} \qquad \text{(F.36)}$$

Assuming that the distribution of $s^2_{mean}$ is roughly chi-square, then the
effective number of degrees of freedom for $s^2_{mean}$ is $\nu'$ where $\frac{2}{\nu'} \approx \frac{6}{n}$, or $\nu' \approx \frac{n}{3}$.
Simulations appear to be consistent with this result when $\phi = 0$, and suggest
that the number of degrees of freedom drops further when $\phi > 0$.
F.5 Sequential Tests for Assessing Attainment

The following paper, prepared by Westat, has been included in this
appendix as it was submitted for publication.
Assessing Attainment of Ground Water Cleanup Standards Using
Modified Sequential t-Tests

By John Rogers, Westat, Rockville Maryland

Assessing the attainment of Superfund cleanup standards in ground water can be complex
due to measurements with skewed distributions, seasonal or periodic patterns, high
variability, serial correlations, and censoring of observations below the laboratory detection
limit. The attainment decision is further complicated by trends and transient changes in the
concentrations as a result of the cleanup effort. EPA contracted Westat to prepare a
guidance document recommending statistical procedures for assessing the attainment of
ground water cleanup standards. The recommended statistical procedures were to require a
minimum of statistical training. The recommended procedures included a sequential t-test
based on yearly average concentrations.

Further research and simulations by Westat indicate that modifications of the sequential t-
test have better performance and are easier to use than the originally proposed sequential t-
test, particularly with highly skewed data. This paper presents three modified sequential
tests with simulation results showing how the sequential t-test and the modifications
perform under a variety of situations similar to those found in the field. The modified tests
use an easy-to-calculate approximation for the log likelihood ratio and an adjustment to
improve the power of the test for small sample sizes. Using the log transformed yearly
averages improves the test performance with skewed data. Expected sample sizes and
practical considerations for application of these tests are also discussed.

Key words: Sequential t-test, Simulations, Ground water, Superfund.

1. Introduction

EPA contracted Westat to prepare a draft guidance document recommending sampling and
statistical methods for evaluating the attainment of ground-water cleanup standards at Superfund
sites. The recommended statistical methods were to be applicable to a variety of site conditions and
be able to be implemented by technical staff with a minimum of statistical training.

The draft document included an introduction to basic statistical procedures and recommended a
variety of statistical methods including a sequential t-test. Although the sequential t-test has several
advantages for testing ground water, one significant disadvantage is the relative complexity of the
calculations, requiring use of the noncentral t distribution. Additional research was undertaken by
Westat to find an alternative to the standard sequential t-test which is easier to implement. As part
of this research, simulations have been used to evaluate the performance of the sequential t-test and
several modifications of it.

This paper presents these simulation results showing how the sequential t-test and the modified
tests perform under a variety of situations similar to those found in the field.

The Problem of Assessing Ground Water at a Superfund Site

The history of contamination and cleanup at a Superfund site will result in ground water
contaminant concentrations which generally (1) increase during periods of contamination, (2)
decrease during remediation, and (3) settle into dynamic equilibrium with the surrounding
environment after remediation, at which point the success of the remediation can be determined.

¹ This research was supported by Westat.
² EPA contract 68-01-7359.

Specifying the attainment objectives and assessing attainment of cleanup standards can be
complicated by many site specific factors, including: multiple wells, multiple contaminants, and
data which have seasonal patterns, serial correlations, significant lab measurement variation, non-
constant variance, skewed distributions, long-term trends, and censored values below the detection
limits. The general characteristics of ground water quality data have been discussed by Loftis et al.
(1986). All of these factors complicate the specification of an appropriate statistical test. Figure 1
illustrates the variation which might be found in monthly ground water measurements, using
simulated observations.

The Statistical Problem to be Discussed

The following statistical problem is addressed in this paper. Suppose remediation is complete and
any transient effects of the remediation on the ground water levels and flows have dissipated. We
then wish to determine if the mean concentration of a contaminant, $\mu$, is less than the relevant
cleanup standard, $\mu_0$. The ground water will be judged to attain the cleanup standard if the null
hypothesis, $H_0$: $\mu \geq \mu_0$, can be rejected based on a statistical test. The power of the test, the
probability of rejecting the null hypothesis, is to be $\alpha$ when $\mu = \mu_0$. For a specified alternate
hypothesis, $H_1$: $\mu = \mu_1$ ($0 < \mu_1 < \mu_0$), the power is to be $1-\beta$, where $\beta$ is the probability of a false
negative decision (the probability of incorrectly accepting the null hypothesis).

The statistical tests considered in this paper are the sequential t-test for comparing means and
modifications of this test. Using a sequential procedure, a test of hypothesis is performed after
each sample, or set of samples, is collected. The test of hypothesis results in three possible
outcomes: (1) accept the null hypothesis, (2) reject the null hypothesis, or (3) continue sampling.
The hypothesis is tested based on the n ground water samples, $x_1$ to $x_n$, collected prior to the test
of hypothesis. The sample size at the termination of the test is a random variable. The power and
sample size distribution of the sequential tests were evaluated using Monte Carlo simulations. For
the simulations the following parameters were varied: the mean, standard deviation, detection
limit, proportion of the variation which is serially correlated versus independent, lag 1 serial
correlation, alpha and beta, distribution (normal or lognormal), and $\mu_1$. For all simulations $\mu_0$ is
set at 1.0. One thousand simulations were made for each set of parameters tested, unless otherwise noted.
Simulations were performed using SAS version 6.

Section 2 reviews and compares the fixed sample size and sequential t-tests. Sections 3 and 4
discuss the performance of the t-test and several modifications when applied to normally
distributed and independent observations. The performance of the sequential tests when applied to
simulated ground water data is evaluated in Section 5. Section 6 discusses the results and presents
the conclusions.

2. Fixed Versus Sequential Tests

The fixed sample size test and sequential t-test are reviewed briefly below, emphasizing factors
which are relevant to the development of a modified test and for selecting a test for assessing
ground water.

Fixed Sample Size t-Test

The fixed sample size t-test, familiar to many users of statistics, requires the following steps:

(1) Estimate the variance of the future measurements, $\hat{\sigma}^2$, based on available data;
(2) Determine sample size n, such that

$$n = \hat{\sigma}^2 \left[\frac{t_{\alpha,n-1} + t_{\beta,n-1}}{\mu_0 - \mu_1}\right]^2 \qquad \text{eq. (1)}$$

where $t_{\alpha,n-1}$ is the $\alpha$ percentile of the t distribution with n-1 degrees of freedom;
(3) Collect n samples and measure the contaminant concentrations;
(4) Calculate the test statistic t, with n-1 degrees of freedom, where

$$t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}, \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad \text{and} \quad s_{\bar{x}} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n(n-1)}};$$

(5) Conclude that the ground water attains the cleanup standard if $t < t_{\alpha,n-1}$; otherwise, accept
the null hypothesis that the ground water does not attain the cleanup standard.
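
Steps (4) and (5) can be sketched in Python as follows. This is an illustrative example, not code from the paper: the function names and the five concentrations are hypothetical, and the critical value is assumed to be read from a standard t table (for 4 degrees of freedom the lower 10th percentile is -1.533).

```python
import math


def t_statistic(xs, mu0):
    """t statistic for comparing the sample mean to the cleanup standard mu0."""
    n = len(xs)
    mean = sum(xs) / n
    # Standard error of the mean: sqrt(sum of squared deviations / (n (n - 1))).
    se = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n * (n - 1)))
    return (mean - mu0) / se


def attains_standard(xs, mu0, t_critical):
    """Reject the null hypothesis (conclude attainment) if t < t_critical.

    t_critical is the lower alpha percentile of the t distribution with
    n - 1 degrees of freedom, supplied by the user from tables.
    """
    return t_statistic(xs, mu0) < t_critical


# Hypothetical concentrations, cleanup standard 1.0, alpha = .10 with 4 df.
samples = [0.4, 0.6, 0.5, 0.7, 0.3]
print(round(t_statistic(samples, 1.0), 3))
print(attains_standard(samples, 1.0, t_critical=-1.533))
```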

The t-test does well to preserve the power of the test at the null hypothesis when the data have a
roughly normal distribution. However, the power at the alternate hypothesis depends on the
accuracy of the initial variance estimate, $\hat{\sigma}^2$. Thus the fixed sample size test fixes $\alpha$ and n, leaving
$\beta$ variable.

Standard Sequential t-Test

With normally distributed independent observations and known $\sigma^2$, an optimal sequential test is the
sequential probability ratio test (SPRT) (Wald 1947). When $\sigma^2$ is unknown, as here, one
approach is provided by the sequential t-test which states the null hypothesis in terms of the
unknown standard deviation (Rushton 1950, Ghosh 1970, and others). For testing hypotheses
about means, an alternative heuristic solution replaces the unknown variance by the sample
estimate at each step in the sequential test (Hall 1962, Hayre 1983). This second version of the
sequential t-test can be used to compare the mean to an established cleanup standard. Liebetrau
(1979) discussed the application of this test to water quality sampling.

The steps in implementing the sequential t-test for comparing the mean to a standard are:

(1) Collect k-1 samples without testing the hypothesis.
(2) Collect one additional sample for a total of n samples collected so far and calculate:

$$t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}, \quad \delta_0 = 0, \quad \delta_1 = \frac{\mu_1 - \mu_0}{s_{\bar{x}}} \qquad \text{eq. (2)}$$

(3) Calculate the likelihood ratio:

$$L = \frac{f_{n-1}(t \mid \delta = \delta_1)}{f_{n-1}(t \mid \delta = \delta_0)} \qquad \text{eq. (3)}$$

where $f_{n-1}(t \mid \delta)$ is the density of the noncentral t distribution with n-1 degrees of freedom
and noncentrality parameter $\delta$.
(4) If $L > \frac{1-\beta}{\alpha}$, then reject the null hypothesis and conclude that the ground water attains the
cleanup standard;
if $L < \frac{\beta}{1-\alpha}$, then accept the null hypothesis that the ground water does not attain the cleanup
standard;
otherwise, return to step (2) and collect additional samples until a decision is reached.
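
The stopping rule in step (4) can be sketched in Python as follows. The sketch assumes the usual Wald boundaries, (1 - beta)/alpha for rejecting and beta/(1 - alpha) for accepting the null hypothesis, and takes the likelihood ratio L as an input, since computing it requires the noncentral t density. The function name and return strings are hypothetical.

```python
def sequential_decision(likelihood_ratio, alpha, beta):
    """Compare the likelihood ratio L with the two decision boundaries."""
    upper = (1.0 - beta) / alpha   # reject H0 above this value
    lower = beta / (1.0 - alpha)   # accept H0 below this value
    if likelihood_ratio > upper:
        return "reject H0: ground water attains the cleanup standard"
    if likelihood_ratio < lower:
        return "accept H0: ground water does not attain the cleanup standard"
    return "continue sampling"


# With alpha = beta = .05 the boundaries are 19 and 1/19.
for L in (25.0, 1.0, 0.01):
    print(L, "->", sequential_decision(L, alpha=0.05, beta=0.05))
```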

Unlike the fixed sample size test, for the sequential test, $\alpha$ and $\beta$ are fixed and n is variable.

Comparison of the Sequential and Fixed Sample Size Test

Table 1 compares the sequential and fixed sample size tests based on several characteristics. The
choice of which test to use depends on the circumstances in which the test is to be applied.
Table 1
Comparison of the fixed sample size and sequential t-test.

Characteristic    Sequential t-Test                        Fixed Sample Size t-Test
----------------  ---------------------------------------  ---------------------------------------
Power             Fixed at the null and alternate          Fixed at the null hypothesis. Power at
                  hypothesis                               the alternate hypothesis depends on the
                                                           estimate of measurement variance used
                                                           for calculating sample size.
Sample size       Subject to variation, often less than    Fixed
                  for a fixed sample size test with the
                  same power
Sampling          Works well if the time between           Works well if the sample collection
                  collection of samples is long relative   period is short relative to the
                  to the analysis time.                    analysis time.
Estimate of the   Biased                                   Unbiased
mean
Ease of           Standard test requires tables of the     Uses widely available tables
calculation       non-central t distribution which are
                  not generally available. Modified test
                  reported here can be easily calculated.

Application of the Sequential Test to Ground Water Data

For testing contaminant concentrations against a cleanup standard, the sequential t-test has some
distinct advantages: (1) ground water sample collection is sequential with sample analysis time
often short compared to the sample collection period, (2) a good estimate of measurement variance
for calculating the sample size for the fixed test may not be available, (3) for assessing attainment,
the objective is to test a hypothesis rather than to obtain an unbiased estimate of the mean or
construct a confidence interval, (4) reducing sample size can be important when the cost of
laboratory sample analysis is high, and (5) if the concentrations at the site are indeed below the
cleanup standard, maintaining the power at the alternate hypothesis can protect against incorrectly
concluding that additional costly cleanup is required. For many users, the main disadvantage of
using the standard sequential t-test is the relative complexity of the calculations.
3. Power and Sample Sizes for the Sequential t-Test with Normally
Distributed Data

For the purpose of describing the simulation results used to determine the power of the sequential
t-test, define the scale factor as the ratio of the standard deviation of the measurements to the
difference between the means for the null and alternate hypotheses:

Scale factor = $\frac{\sigma}{\mu_0 - \mu_1}$.
Also let $n_{fixed}$ designate the sample size for a fixed sample size test with the same nominal power as
the sequential test being discussed, where $n_{fixed}$

$H_0$: $\mu = \mu_0$ against $H_1$: $\mu = \mu_1$, power at $\mu_0 = \alpha$, $\mu_1 = 1-\beta$ (i.e., $h_0 = \mu_0$);
$H_0$: $\mu = \mu_1$ against $H_1$: $\mu = \mu_0$, power at $\mu_0 = \alpha$, $\mu_1 = 1-\beta$ (i.e., $h_0 = \mu_1$).

Based on this symmetry, the nominal power of the sequential t-test is the same whether $h_0 = \mu_0$ or
$h_0 = \mu_1$. In practice, $h_0$ serves as the zero point around which the parameters for the non-central
t distribution are calculated rather than the mean value at which the power is maintained, as in the
fixed sample size test. If the equations for the sequential test are modified to put the zero point
mid-way between $\mu_0$ and $\mu_1$, then (1) $\delta_1 = -\delta_0$, (2) only one non-central t distribution needs to be
evaluated, and (3) the power of the test is symmetric around $h_0$ when $\alpha = \beta$, i.e., the false positive
and false negative rates are equal. Although Rushton (1950) considered null hypotheses other than
zero and $h_0 = \mu_0$, in this paper $h_0$ is called the zero point rather than the null hypothesis. To avoid
confusion, the terms null and alternate hypothesis will be used as defined in Section 1, reflecting
the intentions of those performing the test.

Define the centered sequential t-test by replacing equations (2) by equations (4) and setting
the zero point for the calculations mid-way between $\mu_0$ and $\mu_1$, i.e.:

$$h_0 = \frac{\mu_0 + \mu_1}{2} \qquad \text{eq. (5)}$$
This centered test is used in the following simulations to determine the relationship between power
and sample size.

Changes in Power with Increasing Sample Size

Figure 3 shows the false decision rate (false positive or false negative rate) and average sample size
for the centered sequential t-test with $\alpha$ and $\beta$ set at .05 and the scale factor ranging from 0.4 to
3.6. For this symmetric test, the false positive and false negative rates are equal. The false
decision rate at very low sample sizes is smaller than the nominal level of .05. As the scale factor
increases, resulting in increasing sample sizes, the false decision rate increases to a maximum of
roughly three times the nominal level and then decreases slowly. The average sample size is
roughly half of that for the corresponding fixed sample size test except at very low sample sizes.
Similar patterns were seen in the false negative rates when the zero point was set at the null
hypothesis.

The good performance of the test at low sample sizes is in part due to the discrete nature of the
sampling. From the sample just before the termination of the test to the sample which terminates
the test, the likelihood ratio jumps from inside the decision limits to outside. With small sample
sizes, the likelihood ratio may be considerably beyond the decision limits on the last sample. This
is equivalent to having more information than is necessary to make the decision, resulting in
improved performance.

Distribution of Sample Sizes

Simulations were used to look at the distribution of sample sizes at the termination of the test, for
selected values of $\mu$ and scale factors of 1.0 and 3.0. Figure 4 shows the distribution of sample
sizes, using a log scale, when $\mu = \mu_1$ and the scale factor equals 1.0. The sample sizes are
displayed separately for simulations which rejected the null hypothesis (correct decision) and those
which did not. For both decisions a relatively large proportion of the simulations terminate at a
sample size of two. The false decision rate is greater than the nominal value by roughly the
proportion of simulations terminating with only two samples. The modified sequential test, for
which the distribution of sample sizes is also shown in Figure 4, is discussed in the next section.

The general characteristics of the sample size distributions are the same regardless of the conditions
simulated. Sample sizes for the sequential t-test are highly skewed. For many simulations, the
test terminated with two samples. For those simulations not terminating with two or three
samples, the distribution of sample sizes was roughly log-normal.

4. Modifications to Simplify the Calculations and Improve the Power

The poor performance of the centered sequential t-test at the alternate and null hypotheses and the
observation that many of the simulations which terminate at two samples contribute to the large
false decision rates suggest that a modification to the test might improve the performance. Other
authors have noted this problem and suggested alternate procedures. In particular, Hayre (1983)
suggested changing the test boundaries. Hayre's suggestion is equivalent to multiplying the
log likelihood ratio by the adjustment factor (n-d)/(n+c) where d < k and c > -d. Based on
heuristic arguments, Hayre concluded that k, the minimum number of samples, should be at least 5
if a large sample size is expected.

When small sample sizes are expected, requiring as many as 5 samples before the first test of
hypothesis can result in an overly conservative test. In this research, decision rules requiring a
minimum of 2, 3, or 4 samples were considered. In addition, the performance of the centered
sequential t-test was simulated using adjustment factors of 1, (n-1)/n, (n-2)/n, and (n-3)/n. The
simulations used α and β set at 0.10, 0.05, and 0.01.

The false decision rates for the four adjustment factors, with (α, β) = (0.05, 0.05), are shown in
Figure 5. All of the adjustment factors improved the performance of the test by reducing the
maximum probability of a false decision to values closer to the nominal value. The selection of an
optimal adjustment factor requires specification of the conditions under which the test is to be used.
One adjustment factor might be chosen if small sample sizes are expected, another if large sample
sizes are expected. In all cases, the test is conservative for low sample sizes, possibly liberal for
intermediate sample sizes, and approaches the nominal values for large sample sizes. Over the
range of the scale factor considered in the simulations, the average false decision rate for the
adjustment factor (n-2)/n was closest to the nominal value. Therefore, this adjustment factor,
(n-2)/n, with k = 3 was chosen for evaluation in subsequent simulations.

Approximation for Non-central t

Calculation of the likelihood ratio using the noncentral t-distribution is difficult because the tables
are not generally available and are difficult to use. The use of the sequential t-test can therefore be
simplified by using an approximation to the log likelihood ratio of the two non-central t-
distributions. Rushton (1950) published three approximations for the log of the likelihood ratio.
Westat's analysis showed that the approximations performed well, particularly when the zero point
for the test was set mid-way between the null and alternate hypotheses. Using Rushton's simplest
approximation and the adjustment factor selected above, the equations for the modified
sequential t-test become:

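$$\hat{\mu}_0 = \frac{\mu_0 + \mu_1}{2}, \quad t = \frac{\bar{x} - \hat{\mu}_0}{s_{\bar{x}}}, \quad \delta = \frac{\mu_1 - \mu_0}{s_{\bar{x}}}, \quad \text{and} \qquad \text{eq. (5)}$$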
$$L = \exp\left(\delta \, \frac{n-2}{n} \, t \, \sqrt{\frac{n}{n - 1 + t^2}}\right)$$

Figure 4 shows the distribution of sample sizes for the modified test compared to that for the
standard sequential t-test. Figure 6 shows the power curve and average sample sizes for the
modified test with α = β and scale factor = 1.6. Figure 6 can be compared directly with Figure 2
for the standard sequential t-test.
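
As a rough illustration of how the modified test might be computed after each new observation, the following Python sketch evaluates the centered statistic and compares it with decision limits. It assumes that Rushton's simplest approximation combined with the (n-2)/n adjustment takes the form L = exp(δ · ((n-2)/n) · t · sqrt(n/(n-1+t²))), and that the decision limits are Wald's usual approximations β/(1-α) and (1-β)/α. Neither is spelled out in this section, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev

def modified_lr(x, mu0, mu1):
    """Approximate likelihood ratio for the centered, adjusted sequential t-test.

    Assumed form (not confirmed by this section):
    L = exp(delta * (n - 2) / n * t * sqrt(n / (n - 1 + t^2))).
    Requires at least 2 observations with nonzero spread.
    """
    n = len(x)
    se = stdev(x) / math.sqrt(n)      # standard error of the mean
    mid = (mu0 + mu1) / 2.0           # zero point mid-way between hypotheses
    t = (mean(x) - mid) / se
    delta = (mu1 - mu0) / se
    return math.exp(delta * (n - 2) / n * t * math.sqrt(n / (n - 1 + t * t)))

def sequential_decision(x, mu0, mu1, alpha=0.05, beta=0.05, k=3):
    """Apply the test after each observation, starting at the k-th."""
    upper = (1 - beta) / alpha        # assumed Wald limit for rejecting the null
    lower = beta / (1 - alpha)        # assumed Wald limit for accepting the null
    for n in range(k, len(x) + 1):
        lr = modified_lr(x[:n], mu0, mu1)
        if lr >= upper:
            return ("reject null", n)
        if lr <= lower:
            return ("accept null", n)
    return ("continue sampling", len(x))

# Hypothetical measurements scattered around mu1 = 1.0 and around mu0 = 0.5.
x_alt = [1.0, 1.1, 0.9, 1.05, 0.95]
x_null = [0.5, 0.6, 0.4, 0.55, 0.45]
print(sequential_decision(x_alt, 0.5, 1.0))
print(sequential_decision(x_null, 0.5, 1.0))
```

Because the zero point is centered, data mirrored about the mid-point give reciprocal likelihood ratios, which is why the false positive and false negative rates of this test are equal.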

Termination of the Test Before a Decision Has Been Reached

Figure 7 shows the distribution of sample sizes for selected values of μ, the mean of the simulated
measurements, using the modified test with scale factor of 1.6. As noted before, the distribution
of the sample sizes is roughly log-normal. The minimum sample size is 3 because a minimum of
three samples is required before the first test of hypothesis. The mean sample size is generally
similar to or less than nfixed. The 95th percentile of the sample sizes is less than three times nfixed
and, for values of μ close to the null and alternate hypothesis, is generally similar to or less than
Several authors, including Wald, have suggested that, for practical purposes, the sequential test
can be terminated after some fixed large number of samples if the test has not otherwise terminated,
with the decision going to whichever hypothesis is more favored at termination. Figure 7
suggests that a decision rule terminating the test with a maximum sample size of three times nfixed
is reasonable because very few tests would be terminated early when the true mean is close to the
null or alternate hypothesis. When the mean is mid-way between the null and alternate hypothesis,
acceptance of the null hypothesis is essentially random, and early termination will not affect the
power of the test.

Simulations were performed to evaluate different termination rules. One hundred simulations were
run for all combinations of: termination at 1, 2, 3, 4, and 5 times nfixed; four scale factors from 0.4
to 3.6; α = β = 0.1, 0.05, 0.01; and μ = 0.5. In addition, 100 simulations were run for all
combinations of: 11 values of μ from 0.35 to 1.15; termination at 1, 2, 3, and 4 times the fixed
sample size; scale factor = 1.6; and α = β = 0.05. The differences in the power due to early
termination were not statistically significant. Early termination resulted in a decrease in the average
sample size with μ mid-way between the null and alternate hypotheses; however, with μ at the null
or alternate hypothesis, changes in the average sample size were, practically speaking,
insignificant.

These results indicate that early termination of the sequential test will have little effect on the power
of the test. Because the fixed sample size is estimated from σ̂² based on data available before
sampling and is therefore subject to error, it is recommended that sequential tests not be terminated
until the sample size is at least twice the estimated sample size for an equivalent fixed sample size
test. For the simulations reported in other sections of this paper, the sequential tests were
terminated if the sample size exceeded 5 times nfixed.
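
The termination rule discussed above can be sketched as a small decision function. This is only an illustration under stated assumptions: the "more favored" hypothesis at forced termination is taken to be the alternative when the likelihood ratio exceeds 1 and the null otherwise, and the names `truncated_decision` and `max_multiple` are hypothetical.

```python
def truncated_decision(lr, n, n_fixed, lower, upper, max_multiple=3):
    """Sequential decision with forced termination at max_multiple * n_fixed.

    lr      : current likelihood ratio
    n       : current number of samples
    n_fixed : sample size of the equivalent fixed sample size test
    lower, upper : decision limits for accepting / rejecting the null
    """
    if lr >= upper:
        return "reject null"
    if lr <= lower:
        return "accept null"
    if n >= max_multiple * n_fixed:
        # Assumption: the decision goes to the alternative when lr > 1,
        # and to the null otherwise (including the tie lr == 1).
        return "reject null" if lr > 1.0 else "accept null"
    return "continue sampling"

# Hypothetical limits for alpha = beta = 0.05 and n_fixed = 10.
lower, upper = 0.05 / 0.95, 0.95 / 0.05
print(truncated_decision(2.0, 12, 10, lower, upper))  # limits not reached, n < 30
print(truncated_decision(2.0, 30, 10, lower, upper))  # forced termination
```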

5. Application to Ground Water Data from Superfund Sites

The modified sequential t-test performs well with normally distributed data, having average sample
sizes below those for equivalent fixed sample size tests and power close to the nominal power.
However, ground water measurements may be skewed, serially correlated, censored, and have
seasonal patterns. How well does the modified test perform with ground water data? Simulations
were used to determine how four sequential tests performed when assessing ground-water data.

For all statistical tests, the following sequential sample design is assumed: m ground water samples
are collected at periodic intervals throughout the year, with at least 4 samples per year. The
samples are analyzed and the test of hypothesis is performed once per year starting after three years
of data are collected. The number of years of data collection is n.

The four statistical tests evaluated using the simulations are:

1) Standard sequential t-test described in section 2 using the yearly averages;

2) Modified sequential t-test using the yearly averages;

3) Modified sequential t-test with adjustments for seasonal variation and serial correlation:

Remove seasonal patterns from the data using one-way analysis of variance. Calculate
the standard error, Se, and the lag 1 serial correlation of the residuals, r. Estimate the
standard error of the mean as:
9 1+r • UT^
Se2 J7 withDf=
The effective sample size is assumed to be one more than the number of degrees of
freedom. Therefore:

L = exp

4) Modified sequential t-test with an adjustment for skewness:

Calculate y = ln(yearly average). Estimate the log transformed mean and its standard
error using the following equations:
t-1
The test statistic for the sequential t-test uses:

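$$\hat{\mu}_0 = \frac{\ln(\mu_0) + \ln(\mu_1)}{2}, \quad t = \frac{\ln(\bar{x}) - \hat{\mu}_0}{s_{\ln(\bar{x})}}, \quad \text{and} \quad \delta = \frac{\ln(\mu_1) - \ln(\mu_0)}{s_{\ln(\bar{x})}}$$
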
The first, second and fourth tests use the yearly average concentrations, averaging across the
within year seasonal patterns. The serial correlation between the yearly averages is less than
between individual observations, reducing the influence of correlation on the test results. The third
test removes the seasonal patterns. The standard error of the mean is adjusted by a factor which
accounts for the serial correlation, assuming an AR(1) model and many observations per year.
Although this assumption may not be correct, the lag 1 correlation is expected to dominate the
correlations for higher lags, making the AR(1) model a reasonable approximation to the data. The
effective degrees of freedom for the standard error is based on asymptotic approximations. The
fourth test is based on the assumption that the yearly averages have a log normal distribution. For
highly skewed data this assumption is more reasonable than assuming a normal distribution. The
mean and standard error of the mean are first order approximations based on a lognormal
distribution.
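
The adjustment used by the third test can be illustrated with a short Python sketch. It is not the document's exact formula: it assumes the data are arranged as n years of m seasonal measurements, removes seasonal means, estimates the residual variance on nm - m degrees of freedom, and inflates the variance of the mean by the large-sample AR(1) factor (1 + r)/(1 - r). The effective degrees of freedom are not computed, and all function names are hypothetical.

```python
import math

def deseasonalize(data):
    """Subtract each season's mean; data is a list of years, each a list of m values."""
    n, m = len(data), len(data[0])
    season_means = [sum(year[j] for year in data) / n for j in range(m)]
    # Residuals are returned in time order (year by year, season by season).
    return [year[j] - season_means[j] for year in data for j in range(m)]

def lag1_correlation(res):
    """Lag 1 serial correlation of a residual series."""
    res_mean = sum(res) / len(res)
    num = sum((res[i] - res_mean) * (res[i + 1] - res_mean) for i in range(len(res) - 1))
    den = sum((e - res_mean) ** 2 for e in res)
    return num / den

def adjusted_se_of_mean(data):
    """Standard error of the overall mean, inflated for AR(1) serial correlation."""
    n, m = len(data), len(data[0])
    res = deseasonalize(data)
    r = lag1_correlation(res)
    s2 = sum(e * e for e in res) / (n * m - m)   # residual variance, m seasonal means removed
    se = math.sqrt(s2 / (n * m) * (1 + r) / (1 - r))
    return se, r

# Hypothetical data: 3 years, 2 samples per year.
se, r = adjusted_se_of_mean([[1, 2], [2, 3], [3, 4]])
print(se, r)
```

With positive serial correlation the factor (1 + r)/(1 - r) exceeds 1, so the adjusted standard error is larger than the value obtained by treating the observations as independent.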
The second test was expected to perform well with data which have an approximately normal
distribution. The third test was expected to perform best with data with significant correlation
and little skewness. The fourth test was expected to perform best with highly skewed data.
Simulations were performed to test these assumptions.
-------
Figure 8 Range of False Positive Rates for Scale Factors from
1.6 to 4.8 for Four Sequential Tests, by Data Type
[Figure 8 plot not reproduced. Data types on the horizontal axis include: Normal (cv = 0.5); 12 samp/yr; 4 samp/yr; Skewed (cv = 1.5); Censored (30%).]
-------
Figure 9 Range of False Negative Rates for Scale Factors from
1.6 to 4.8 for Four Sequential Tests, by Data Type
[Figure 9 plot not reproduced. Vertical axis: 0 to 0.35. Data types on the horizontal axis include: Normal (cv = 0.5); 12 samp/yr; Skewed (cv = 1.5); Censored (30%); Correlated; Skewed & ...]
-------
APPENDIX G: GLOSSARY

Alpha (α) In the context of a statistical test, α is the probability of a Type I error.

Alternative Hypothesis See hypothesis.

Analysis Plan The plan that specifies how the data are to be analyzed once they have
been collected, including what estimates are to be made from the data, how the
estimates are to be calculated, and how the results of the analysis will be
reported.

Autocorrelation See serial correlation.

Attainment This term by itself refers to the successful achievement of the attainment
objectives. In brief, attainment means that site contamination has been reduced
to or below the level of the cleanup standard.

Attainment Objectives The attainment objectives refer to a set of site descriptors and
parameters together with standards as to what the desired level should be for the
parameters. These are usually decided upon by the courts and the responsible
parties. For example, these objectives usually include the chemicals to be
tested, the cleanup standards to be attained, the measures or parameters to be
compared to the cleanup standard, and the level of confidence required if the
environment and human health are to be protected (Chapter 3).

Beta (β) In the context of a statistical test, β is the probability of a Type II error.

Binomial Distribution A probability distribution used to describe the number of
occurrences of a specified event in n independent trials. In this manual, the
binomial distribution is used to develop statistical tests concerned with testing
the proportion of ground water samples that have excessive concentrations of a
contaminant (see Chapters 8 and 9). For example, suppose the parameter of
interest is the portion (or percent) of the ground water wells that exceed a level
specified by the cleanup standard, Cs. Then one might estimate that portion by
taking a sample of 10 wells and counting the number of wells that exceed the
Cs. Such a sampling process results in a binomial distribution. For additional
details about the binomial distribution, consult Conover (1980).
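
As a small worked illustration of the 10-well example, the following Python sketch computes binomial probabilities directly from the definition. The true exceedance proportion of 0.10 is a hypothetical value chosen only for the example, and the function names are not from this manual.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k exceedances among n independent wells."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """Probability of k or fewer exceedances among n independent wells."""
    return sum(binomial_pmf(j, n, p) for j in range(k + 1))

# Hypothetical case: 10 wells sampled, true proportion exceeding Cs is 0.10.
n_wells, p_exceed = 10, 0.10
prob_none = binomial_pmf(0, n_wells, p_exceed)
prob_at_most_one = binomial_cdf(1, n_wells, p_exceed)
print(prob_none, prob_at_most_one)
```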
Central Limit Theorem If X has a distribution with mean μ and variance σ², then
the sample mean X̄, based on a random sample of size n, has an approximately
normal distribution with mean μ and variance σ²/n. The approximation becomes
increasingly good as n increases. In other words, no matter what the original
distribution of X (so long as it has a finite mean and variance), the distribution
of X̄ from a large sample can be approximated by a normal distribution. This
fact is very important since knowing the approximate distribution of X̄ allows
us to make corresponding approximate probabilistic estimates. For example,
reasonably good estimates for confidence intervals based on X̄ can frequently be given
even though the underlying probabilistic structure of X is unknown.
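
The theorem can be checked informally by simulation. The Python sketch below is an illustration only: it draws repeated samples from a skewed exponential distribution with mean 1 and variance 1 (a choice made for this example, not taken from the manual) and compares the mean and variance of the sample means with μ and σ²/n.

```python
import random
import statistics

def simulate_sample_means(n, replications, seed=1):
    """Return sample means of `replications` samples of size n from Exp(1)."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replications)
    ]

n = 30
means = simulate_sample_means(n, 4000)
print(statistics.fmean(means))      # should be close to mu = 1
print(statistics.variance(means))   # should be close to sigma^2 / n = 1/30
```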

Chain of Custody Procedures Procedures for documenting who has custody of and
the condition of samples from the point of collection to the analysis at the
laboratory. Chain of custody procedures are used to ensure that the samples are
not lost, tampered with, or improperly stored or handled.

Clean Attains the cleanup standard. That is, a judgment has been made that the site has
been cleaned or processed to the point that the attainment objectives, as
defined above, have been met.

Cleanup Standard (Cs) The criterion set by EPA against which the measured
concentrations are compared to determine whether the ground water at the
Superfund site is acceptable or not (Sections 2.2.4 and 3.4). For example, the
Cs might be set at 5 parts per million (5 ppm) for a site chemical. Hence, any
water that tests out at greater than 5 ppm is not acceptable.

Coefficient of Determination (R²) A descriptive statistic, R² = 1 - SSE/Syy with 0 ≤ R² ≤ 1,
that provides a rough measure of the overall fit of the model. A perfect fit,
i.e., all of the observed data points fall on the fitted regression line, would be
indicated by an R² equal to 1. Low values of R² can indicate either a relatively
poor fit of the model or no relationship between the concentration levels and
time. R² is just the square of the well-known correlation coefficient. For more
information, see any standard textbook.
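
A minimal Python sketch of this calculation for a simple linear regression follows. It assumes the usual least squares estimates of the intercept and slope, with SSE the sum of squared residuals and Syy the sum of squared deviations of y about its mean; the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least squares intercept and slope for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def r_squared(x, y):
    """Coefficient of determination, 1 - SSE / Syy."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / syy

# Hypothetical concentrations at three sampling times.
print(r_squared([1, 2, 3], [1, 3, 2]))
print(r_squared([1, 2, 3], [2, 4, 6]))  # points fall exactly on a line
```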
Coefficient of Variation (cv) The ratio of the standard deviation to the mean (μ) for a
set of data or distribution. For data which can only have positive values, such
as concentration measurements, the coefficient of variation provides a crude
measure of skewness. Data with larger cv's usually are more skewed to the
right. The cv provides a relative measure of variation (i.e., relative with respect
to the mean). As such, it can be used as a rough measure of precision. It is
useful to know if the cv is relatively constant over the range of the variable of
interest.
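
For a set of measurements, the cv might be computed as in the following Python sketch. It assumes the sample standard deviation (divisor n - 1) is the intended estimate, and the data values are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical concentration measurements (all positive).
concentrations = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(concentrations)
print(cv)
```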

Comparison-wise Alpha For an individual statistical decision on one compound or
well, the maximum probability of a false positive decision.

Compositing Physically mixing several samples into one larger sample, called a
composite sample. Then either the entire composite is measured or one or more
random subsamples from the composite are measured. Generally the individual
samples which are composited must be the same size or volume, and the
composite sample must be completely mixed. Composite samples can be useful
for estimating the mean concentration. If appropriate, compositing can result in
substantial savings where the cost of analyzing individual samples is high.

Confidence Interval A sample-based estimate of a population parameter which is
expressed as a range or interval of values which will include the true parameter
value with a known probability or confidence. For example, instead of giving
an estimate of the population mean, say x = 15.3, we can give a 95 percent
confidence interval, say [x-3, x+3] or [12.3 to 18.3] that we are 95 percent
confident contains the population mean.

Confidence Level The degree of confidence associated with an interval estimate. For
example, with a 95 percent confidence interval, we would be 95 percent certain
that the interval contains the true value being estimated. By this, we mean that
95 percent of independent 95 percent confidence intervals will contain the
population mean. In the context of a statistical test, the confidence level is equal
to 1 minus the Type I error (false positive rate). In this case, the confidence
level represents the probability of correctly concluding that the null hypothesis
is true.
Conservative Test A statistical test for which the Type I error rate (false positive rate) is
actually less than that specified for the test. For a conservative test there will be
a greater tendency to accept the null hypothesis when it is not true than for a
non-conservative test. In the context of this volume, a conservative test errs on
the side of protecting the public health. That is to say, the probability of the mistake (i.e., error)
of wrongly deciding that the site is clean will be less than the stated Type I error
rate.

Contaminated A site is called contaminated if it does not attain the cleanup standards. In
other words, the contamination level on the site is higher than that allowed by
the cleanup standard.

Degrees of Freedom (Df) The degrees of freedom of an estimate of variance, standard
deviation, or standard error is a measure of the amount of information on which
the estimate is based or the precision of the estimate. Usually, high degrees of
freedom are associated with a large sample size and a corresponding increase in
accuracy of an estimation.

Dependent Variable ($y_i$) An outcome whose variation is explained by the influence of
independent variables. For example, the contamination level in ground water
(i.e., the dependent variable y) may depend on the distance (i.e., the
independent variable x) from the site incinerator.

Detection Limit The level below which concentration measurements cannot be reliably
determined (see Section 2.3.7). Technically, the lowest concentration of a
specified contaminant which is unlikely to be obtained when analyzing a sample
with none of the contaminant.

Distribution The frequencies (either relative or absolute) with which measurements in a
data set fall within specified classes. A graphical display of a distribution is
referred to as a histogram. Formally, a distribution is defined in terms of the
underlying probability function. For example, the distribution of x, say Fx(t),
may be defined as the probability that x is less than t (i.e., P(x
-------
test is "statistically" large then the decision rule is to declare that we do not
believe that serial correlation is present If
-------
Independent Variable ($x_i$) The characteristic being observed or measured that is
hypothesized to influence an event (the dependent variable) within the defined
area of relationships under study. The independent variable is not influenced by
the event but may cause it or contribute to its variation.

Inference The process of generalizing (extrapolating) results from a sample to a larger
population. More generally, statistical inference is the art of evaluating
information (such as samples) in order to draw reliable conclusions about the
phenomena under study. This usually means drawing conclusions about the
distribution of some variable.

Interquartile Range The difference between the 75th and 25th percentiles of the
distribution.

Judgment Sample A sample of data selected according to non-probabilistic methods;
usually based on expert judgment.

Kriging Kriging is the name given to the least squares prediction of spatial processes. It
is a form of curve fitting using a variety of techniques from regression and time
series. Statistically, kriging is best linear unbiased estimation using generalized
least squares. This statistical technique can be used to model the contours of
water and contaminant levels across wells at given points in time (see Chapter 7
of this guidance and Volume I, Chapter 10). Kriging is not appropriate for
assessing attainment in ground water.

Laboratory Error See measurement error.

Lag 1 Serial Correlation See serial correlation.

Least Squares Estimates This is a common estimation technique. In regression, the
purpose is to find estimates for the regression curve fit. The estimates are
chosen so that the regression curve is "close" to the plotted sample data in the
sense that the square of their distances is minimized (i.e., the least). For
example, the estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ of the y-intercept $\beta_0$ and the slope $\beta_1$ are least
squares estimates (see Section 6.1.2).

Less-than-Detection Limit A concentration value that is reported to be below the
detection limit with no measured concentration provided by the lab. It is
generally recommended that these values be included in the analysis as values at
the detection limit.

Lognormal Distribution A family of positive-valued, skewed distributions commonly
used in environmental work. See Gilbert (1987) for a detailed discussion of
lognormal distributions.

Mean The arithmetic average of a set of data values. Specifically, the mean of a data set,
$x_1, x_2, \ldots, x_n$, is defined by $\bar{x} = \sum_{i=1}^{n} \frac{x_i}{n}$.

Mean Square Error (MSE) The sum of squares due to error divided by the
appropriate degrees of freedom which provides an estimate of the variance
about the regression.

Measurement Error Error or variation in laboratory measurements resulting from
unknown factors in the handling and laboratory analysis procedures.

Median The value which separates the lowest 50 percent of the observations from the
upper 50 percent of the observations. Equivalently, the "middle" value of a set
of data, after the values have been arranged in ascending order. If the number
of data points is even, the median is defined to be the average of the two middle
values.

Mode The value with the greatest probability, i.e., the value which occurs more often
than any other.

Model A mathematical description of the process or phenomenon by which the data are
generated and collected.

Non-Central t-Distribution Similar to the t-distribution with the exception that the
numerator is a normal variate with mean equal to something other than zero (see
also t-distribution).

Nonparametric Test A test based on relatively few assumptions about the underlying
process generating the data. In particular, no assumptions are made about the
exact form of the underlying probability distribution. As a consequence,
nonparametric tests are valid for a fairly broad class of distributions.
Normal Distribution A family of "bell-shaped" distributions described by the mean and
variance, μ and σ². Refer to a statistical text (e.g., Sokal and Rohlf, 1973) for a
formal definition. The standard normal distribution has μ = 0 and σ² = 1.

Normal Probability Plot A plot of the ordered residuals against their expected values
under normality (see Section 5.6.2).

Normality See normal distribution (see also Section 5.6).

Null Hypothesis See hypothesis.

Outlier Measurements that are (1) very large or small relative to the rest of the data, or (2)
suspected of being unrepresentative of the true concentration at the sample
location.

Overall Alpha When multiple chemicals or wells are being assessed, the probability that
all chemicals in all wells are judged to attain the cleanup standard when in
reality, the concentrations for at least one well or chemical do not attain the
cleanup standard.

Parameter A statistical property or characteristic of a population of values. Statistical
quantities such as means, standard deviations, percentiles, etc. are parameters if
they refer to a population of values, rather than to a sample of values.

Parameters of the Model See regression coefficients.

Parametric Test A test based on assumptions about the underlying process generating
the data. For example, most parametric tests assume that the underlying data
are normally distributed. Although parametric tests are strictly not valid unless
the underlying assumptions are met, in many cases parametric tests perform
well over a range of conditions found in the field. In particular, with
reasonably large sample sizes the distribution of the mean will be approximately
normal. See robust test, and Central Limit Theorem.

Percentile The specific value of a distribution that divides the set of measurements in
such a way that P percent of the measurements fall below (or equal) this value,
and 100-P percent of the measurements exceed this value. For specificity, a
percentile is described by the value of P (expressed as a percentage). For
example, the 95th percentile (P = 95) is that value X such that 95 percent of the
data have values less than X, and 5 percent have values exceeding X. By
definition, the median is the 50th percentile.

Physical Sample A portion of ground water collected from a well at the waste site and
used to make measurements. This may also be called a water sample. A
water sample may be mixed, subsampled, or otherwise handled to obtain the lab
sample of ground water which is sent for laboratory analysis.

Point Estimate See estimate.

Population The totality of ground water samples in a well for which inferences
regarding attainment of cleanup standards are to be made.

Population Mean Concentration The concentration which is the arithmetic average
for the totality of ground water units (see also mean and population).

Population Parameters See parameter.

Power The probability that a statistical test will result in rejecting the null hypothesis
when the null hypothesis is false. Power = 1 - β, where β is the Type II error
rate associated with the test. The term "power function" is more accurate
because it reflects the fact that power is a function of a particular value of the
parameter of interest under the alternative hypothesis.

Precision Precision refers to the degree to which repeated measurements are similar to
one another. It measures the agreement (reproducibility) among individual
measurements, obtained under prescribed similar conditions. Measurements
which are precise are in close agreement. To use an analogy from archery,
precise archers have all of their arrows land very close together. However, the
arrows of a precise archer may or may not land on (or even near) the bull's-eye.

Predicted Value In regression analysis, the calculated value of $y_i$, under the estimated
regression line, for a particular value of $x_i$.

Proportion The number of ground water samples in a set of ground water samples that
have a specified characteristic, divided by the total number of ground water
samples in the set.
Random Error ($\varepsilon_i$) Represents "random" fluctuations of the observed chemical
measurements around the hypothesized mean or regression model.

Random Sample A sample of ground water units selected using the simple random
sampling procedures described in Section 4.1.

Range The difference between the maximum and minimum values of measurements in a
data set.

Regression Analysis The process of finding the "best" mathematical model (within
some restricted class of models) to describe the dependent variable, $y_i$, as a
function of the independent variable, $x_i$, or to predict $y_i$ from $x_i$. The most
common form is the linear model.

Regression Coefficients The constants $\beta_0$ and $\beta_1$ in the simple linear regression
model which represent the y-intercept and slope of the model.

Residual In regression analysis, the difference between the observed value of the
concentration measurement $y_i$ and the corresponding fitted (predicted) value, $\hat{y}_i$,
from the estimated regression line.

Response Variable See dependent variable.

Robust Test A statistical test which is approximately valid under a wide range of
conditions.

Sample Any collection of ground water samples taken from a well.

Sample Design The procedures used to select the ground water samples.

Sample Mean See mean.

Sample Residual See residual.

Sample Size The number of lab samples (i.e., the size of the statistical sample). Thus, a
sample of size 10 consists of the measurements taken on 10 ground water
samples or composite samples.

Sample Standard Deviation See standard deviation.

Sample Statistics Numerical quantities which summarize the properties of a data set.

Sampling Error Variability in sample statistics between different samples that is used to
characterize the precision of sample-based estimates.

Sampling Frequency (n) The number of samples to be taken per year or seasonal
period.

Sampling Plan See sample design.

Sampling Variability See sampling error.

Sequential Test A statistical test in which the decision to accept or reject the null
hypothesis is made in a sequential fashion. Sequential tests are described in
Chapters 4, 8, and 9 of this manual.

Serial Correlation A measure of the extent to which successive observations are
related.

Significance Level The probability of a Type I error associated with a statistical test.
In the context of the statistical tests presented in this manual, it is the probability
that the ground water from a well or group of wells is declared to be clean when
it is contaminated. The significance level is often denoted by the symbol α
(Greek letter alpha).

Simple Linear Regression A regression analysis where there is only one independent
variable and the equation for the model is of the form $y_i = \beta_0 + \beta_1 x_i$, where $\beta_0$
is the intercept and $\beta_1$ is the slope of the regression (see Section 6.1).

Simple Linear Regression Model A linear model relating the concentration
measurements (or some other parameter) to time (see Section 6.1).

Size of the Physical Sample The volume of a physical ground water sample.

Skewness A measure of the extent to which a distribution is symmetric or asymmetric.

Skewed Distribution Any asymmetric distribution.
Standard Deviation A measure of dispersion of a set of data. Specifically, given a set
of measurements, $x_1, x_2, \ldots, x_n$, the standard deviation is defined to be the
quantity $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$, where $\bar{x}$ is the sample mean.

Standard Error A measure of the variability (or precision) of a sample estimate.
Standard errors are often used to construct confidence intervals.

Statistical Sample A collection of chemical concentration measurements reported by the
lab for one or more lab samples where the lab samples were collected using
statistical sampling methods. Collection of a statistical sample allows estimation
of precision and confidence intervals.

Statistical Test A formal statistical procedure and decision rule for deciding whether the
ground water in a well attains the specified cleanup standard.

Steady State A state at which the residual effects of the treatment process (or any other
temporary intervention) on general ground water characteristics appear to be
negligible (see Section 7.1).

Sum of Squares Due to Error (SSE) A measure of how well the model fits the data,
needed for assessing the adequacy of the model. If the SSE is small, the fit
is good; if it is large, the fit is poor.
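
The SSE is conventionally computed as the sum of squared differences between the observed values and the values predicted by the model. The sketch below compares two candidate fits to the same data; the function name, the data, and both candidate models are hypothetical.

```python
def sum_of_squares_due_to_error(observed, fitted):
    """Sum of squared residuals (observed minus fitted values)."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))


# Hypothetical concentrations at sampling times 1 through 5.
times = [1, 2, 3, 4, 5]
observed = [10.0, 8.5, 6.0, 3.5, 2.0]

# Candidate 1: a declining line, y = 12 - 2t.
line_fit = [12.0 - 2.0 * t for t in times]
# Candidate 2: a constant equal to the mean of the observations.
constant_fit = [6.0] * len(times)

sse_line = sum_of_squares_due_to_error(observed, line_fit)
sse_constant = sum_of_squares_due_to_error(observed, constant_fit)
print(sse_line, sse_constant)
```

The much smaller SSE for the declining line indicates that it describes these made-up data better than the constant does.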

Symmetric Distribution A distribution of measurements for which the two sides of its
overall shape are mirror images of each other about a center line.

Systematic Sample Ground water samples that are collected at equally-spaced intervals
of time.

t-Distribution The distribution of a quotient of independent random variables, the
numerator of which is a standardized normal variate with mean equal to zero
and variance equal to one, and the denominator of which is the positive square
root of the quotient of a chi-square distributed variate and its number of degrees
of freedom. For additional details about the t-distribution, consult Resnikoff
and Lieberman (1957) and Locks, Alexander, and Byars (1963).
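
The construction in this definition can be illustrated by simulation. The sketch below builds each draw as a standard normal variate divided by the square root of a chi-square variate over its degrees of freedom, with the chi-square variate formed as a sum of squared independent standard normals. The function name, the seed, the degrees of freedom, and the number of draws are arbitrary choices for illustration.

```python
import math
import random
import statistics


def draw_t_variate(df, rng):
    """One draw from a t-distribution with df degrees of freedom."""
    z = rng.gauss(0.0, 1.0)  # standardized normal numerator
    # Chi-square variate with df degrees of freedom, independent of z.
    chi_square = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / math.sqrt(chi_square / df)


rng = random.Random(12345)  # fixed seed so the run is repeatable
df = 10
draws = [draw_t_variate(df, rng) for _ in range(20000)]

sample_mean = statistics.mean(draws)
sample_variance = statistics.variance(draws)
# Theory: mean 0 and variance df / (df - 2) when df > 2.
print(sample_mean, sample_variance, df / (df - 2))
```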
Tolerance Interval A confidence interval around a percentile of a distribution of
concentrations.

Transformation A manipulation of either the dependent or independent variable, or
both, to normalize a distribution or linearize a model. Useful transformations
include logarithmic, inverse, square root, etc.

Trends A general increase or decrease in concentrations over time that is persistent and
unlikely to be due to random variation.

True Population Mean The actual, unknown arithmetic average contaminant level for
all ground water samples in the population (see also mean and population).

Type I Error The error made when the ground water in a well is declared to be clean
based on a statistical test when it is actually contaminated. This is also referred
to as a false positive.

Type II Error The error made when the ground water in a well is declared to be
contaminated when it is actually clean. This is also referred to as a false
negative.

Variance The square of the standard deviation.

Waste Site The entire area being investigated for contamination.

Z Value Percentage point of a standard normal distribution. Z values are tabulated in
Table A.2 of Appendix A.
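
Where a table is not at hand, percentage points of the standard normal distribution can be computed directly. The sketch below uses the Python standard library and assumes the upper-tail convention, in which the Z value for a probability p is the point exceeded with probability p; whether Table A.2 follows this convention should be confirmed against the table itself. The function name is hypothetical.

```python
from statistics import NormalDist


def z_value(upper_tail_probability):
    """Point of the standard normal exceeded with the given probability."""
    if not 0.0 < upper_tail_probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - upper_tail_probability)


z_05 = z_value(0.05)    # roughly 1.645
z_025 = z_value(0.025)  # roughly 1.960
print(round(z_05, 3), round(z_025, 3))
```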