United States               Policy, Planning,        EPA 230-R-92-014
Environmental Protection    And Evaluation           July 1992
Agency                      (PM-222)

Methods For Evaluating The
Attainment Of Cleanup
Standards

Volume 2: Ground  Water

-------
    Methods for Evaluating the
Attainment of Cleanup Standards
     Volume 2: Ground Water
   Environmental Statistics and Information Division (PM-222)
         Office of Policy, Planning, and Evaluation
         U.S. Environmental Protection Agency
               401 M Street, S.W.
              Washington, DC 20460

                  July, 1992

-------
                                   DISCLAIMER
This report was prepared under contract to an agency of the United States Government.
Neither  the United States Government nor  any of  its employees,  contractors,
subcontractors, or their employees makes any warranty, expressed or implied, or assumes
any legal liability or responsibility for any third party's  use or the results  of such use of any
information, apparatus, product, model, formula, or process disclosed in this report, or
represents that its use by such third party would not infringe on privately owned rights.


Publication of the data in this document does not signify that the contents necessarily reflect
the joint or separate views and policies of each co-sponsoring agency. Mention of trade
names or commercial products does not constitute endorsement or recommendation for  use.

-------
                           TABLE OF  CONTENTS

                                                                     Page

EXECUTIVE SUMMARY	xxi

1.     INTRODUCTION	1-1

       1.1     General Scope and Features of the Guidance Document	 1-1
              1.1.1    Purpose	  1-1
              1.1.2    Intended Audience and Use	  1-3
              1.1.3    Bibliography, Glossary, Boxes, Worksheets,
                      Examples, and References to "Consult a
                      Statistician"	  1-4
       1.2     Use of this  Guidance  in Ground-Water Remediation
              Activities	  1-5
              1.2.1    Pump-and-Treat   Technology	  1-5
              1.2.2    Barrier  Methods  to Protect Ground Water	 1-6
              1.2.3    Biological  Treatment	  1-6
       1.3     Organization of this Document	  1-7
       1.4     Summary	 1-8

2.     INTRODUCTION TO  STATISTICAL CONCEPTS AND
       DECISIONS	2-1

       2.1     A Note on Terminology	  2-2
       2.2     Background for the Attainment Decision	2-2
              2.2.1    A  Generic Model of Ground-Water Cleanup
                      Progress 	  2-3
              2.2.2    The Contaminants to be Tested	  2-5
              2.2.3    The Ground-Water System to be Tested	2-6
              2.2.4  The  Cleanup Standard	2-6
              2.2.5    The Definition of Attainment	  2-7
       2.3     Introduction  to Statistical  Issues  For  Assessing Attainment	2-8
              2.3.1    Specification of the Parameter to be Compared to
                      the Cleanup Standard	2-8
              2.3.2    Short-term Versus Long-term Tests	2-13
              2.3.3    The Role of Statistical  Sampling and Inference in
                      Assessing  Attainment	 2-15

             2.3.4    Specification of Precision and  Confidence Levels
                      for Protection Against Adverse Health and
                      Environmental Risks	2-17
             2.3.5    Attainment Decisions Based on Multiple Wells	2-20
             2.3.6    Statistical Versus Predictive Modeling	2-24
             2.3.7    Practical Problems with the Data Collection and
                      Their Resolution	2-25
       2.4    Limitations and Assumptions of the Procedures Addressed
             in this Document	2-28
       2.5    Summary	2-28

3.      SPECIFICATION OF ATTAINMENT OBJECTIVES	3-1

       3.1    Data Quality  Objectives	  3-3
       3.2    Specification of the Wells to be Sampled	  3-3
       3.3    Specification of Sample  Collection and Handling
             Procedures	  3-3
       3.4    Specification of the Chemicals to  be  Tested and Applicable
             Cleanup Standards	  3-4
       3.5    Specification of the Parameters to Test	  3-4
             3.5.1    Selecting the  Parameters  to Investigate	3-5
             3.5.2    Multiple Attainment Criteria	  3-8
       3.6    Specification of Confidence Levels for Protection  Against
             Adverse Health  and  Environmental Risks	  3-8
       3.7    Specification of the Precision to be Achieved	3-9
       3.8    Secondary Objectives	  3-10
       3.9    Summary	3-10

4.      DESIGN OF THE SAMPLING AND ANALYSIS PLAN	4-1

       4.1    The Sample Design	4-1
             4.1.1    Random Sampling	4-2
             4.1.2    Systematic Sampling	4-2
             4.1.3    Fixed versus Sequential Sampling	4-4
       4.2    The Analysis Plan	4-5
       4.3    Other Considerations  for  Ground Water Sampling and
             Analysis Plans	  4-6
       4.4    Summary	 4-7

5.      DESCRIPTIVE  STATISTICS AND HYPOTHESIS  TESTING	5-1

       5.1    Calculating the Mean, Variance, and Standard Deviation of
             the  Data	  5-6
       5.2    Calculating the  Standard  Error of the Mean	5-7
             5.2.1    Treating the Systematic Observations as  a Random
                      Sample	  5-8
              5.2.2    Estimates From Differences Between Adjacent
                       Observations	5-9
              5.2.3    Calculating the Standard Error After Correcting for
                       Seasonal Effects	5-10
              5.2.4    Calculating the Standard Error After Correcting for
                       Serial Correlation	5-13
       5.3     Calculating Lag 1 Serial Correlation	5-14
       5.4     Statistical Inferences: What can be Concluded from Sample
               Data	5-16
       5.5     The Construction and Interpretation of Confidence Intervals
               about Means	5-18
       5.6     Procedures for Testing for Significant Serial Correlation	5-21
              5.6.1    Durbin-Watson Test	5-21
              5.6.2    An Approximate Large-Sample Test	5-23
       5.7     Procedures for Testing the Assumption of Normality	5-23
              5.7.1    Formal Tests for Normality	5-24
              5.7.2    Normal Probability Plots	5-24
       5.8     Procedures for Testing Percentiles Using Tolerance
               Intervals	5-25
              5.8.1    Calculating a Tolerance Interval	5-25
              5.8.2    Inference: Deciding if the True Percentile is Less
                       than the Cleanup Standard	5-26
       5.9     Procedures for Testing Proportions	5-27
              5.9.1    Calculating Confidence Intervals for Proportions	5-28
              5.9.2    Inference: Deciding Whether the Observed
                       Proportion Meets the Cleanup Standard	5-29
              5.9.3    Nonparametric Confidence Intervals Around a Median	5-30
       5.10    Determining Sample Size for Short-Term Analysis and Other
               Data Collection Issues	5-33
              5.10.1   Sample Sizes for Estimating a Mean	5-34
              5.10.2   Sample Sizes for Estimating a Percentile Using
                       Tolerance Intervals	5-38
              5.10.3   Sample Sizes for Estimating Proportions	5-39
              5.10.4   Collecting the Data	5-40
              5.10.5   Making Adjustments for Values Below the
                       Detection Limit	5-41
       5.11    Summary	5-41

6.      DECIDING TO TERMINATE TREATMENT USING
       REGRESSION ANALYSIS	6-1


             6.1.3    Assessing the Fit of the Model	6-11
             6.1.4    Inferences in Regression	6-13
       6.2    Using Regression to Model the Progress of Ground Water
             Remediation 	  6-26
             6.2.1    Choosing a Linear or  Nonlinear Regression	6-29
             6.2.2  Fitting  the  Model	 6-32
             6.2.3    Regression  in  the Presence  of Nonconstant
                      Variances	  6-32
             6.2.4    Correcting for Serial Correlation	6-33
       6.3    Combining  Statistical Information with Other Inputs to  the
             Decision Process	  6-38
       6.4    Summary	6-39

7.      ISSUES TO BE CONSIDERED BEFORE STARTING
       ATTAINMENT SAMPLING	  7-1

       7.1    The Notion of "Steady State"	7-2
       7.2    Decisions to be Made in Determining When a Steady State is
             Reached	  7-3
       7.3    Determining When  a Steady State Has Been Achieved	7-3
             7.3.1    Rough Adjustment  of Data for Seasonal Effects	7-5
       7.4    Charting  the  Data	  7-6
             7.4.1    A Test for  Change of Levels Based on Charts	7-7
             7.4.2    A Test  for Trends Based  on Charts	7-7
             7.4.3    Illustrations  and  Interpretation	 7-8
             7.4.4    Assessing  Trends via Statistical Tests	7-13
             7.4.5    Considering the Location of Wells	7-14
       7.5    Summary	7-14
8.      ASSESSING ATTAINMENT USING FIXED SAMPLE SIZE
       TESTS 	    8-1

       8.1    Fixed Sample Size Tests	   8-4
       8.2    Determining  Sample Size  and  Sampling Frequency	8-4
             8.2.1     Sample Size for  Testing Means	  8-5
             8.2.2    Sample Size for  Testing Proportions	8-9
             8.2.3    An Alternative Method  for Determining Maximum
                      Sampling Frequency	  8-11
       8.3    Assessing Attainment of the Mean Using Yearly Averages	8-12
       8.4    Assessing Attainment of the Mean After Adjusting for
             Seasonal  Variation	   8-20
       8.5    Fixed Sample Size Tests for Proportions	  8-25
       8.6    Checking for Trends in Contaminant Levels  After Attaining
             the Cleanup Standard	   8-26
       8.7    Summary	8-26
9.     ASSESSING ATTAINMENT USING SEQUENTIAL TESTS	9-1
      9.1   Determining Sampling Frequency for Sequential Tests	9-5
      9.2   Sequential Procedures for Sample Collection and Data
            Handling	9-7
      9.3   Assessing Attainment of the Mean Using Yearly Averages	9-7
      9.4   Assessing Attainment of the Mean After Adjusting for
            Seasonal Variation	 9-18
      9.5   Sequential Tests for Proportions	 9-22
      9.6   A Further Note on Sequential Testing	9-23
      9.7   Checking for Trends  in Contaminant Levels After Attaining
            the Cleanup Standard	9-23
      9.8   Summary	9-24
BIBLIOGRAPHY	BIB-1

APPENDIX A: STATISTICAL TABLES	A-1

APPENDIX B: EXAMPLE WORKSHEETS	B-1

APPENDIX C: BLANK WORKSHEETS	C-1

APPENDIX D: MODELING THE DATA	D-1

APPENDIX E: CALCULATING RESIDUALS AND SERIAL
      CORRELATIONS USING SAS	E-1

APPENDIX F: DERIVATIONS AND EQUATIONS	F-1
      F.1   Derivation of Tables A.4 and A.5	F-1
      F.2   Derivation of Equation (F.6)	F-4
            F.2.1   Variance of zt	F-5
            F.2.2   Variance of z	F-6
      F.3   Derivation of the Sample Size Equation	F-8
      F.4   Effective Df for the Mean from an AR1 Process	F-10
      F.5   Sequential Tests for Assessing Attainment	F-11
      Assessing Attainment of Ground Water Cleanup Standards Using
            Modified  Sequential t-Tests	F-12
            1.      Introduction	F-12
            2.      Fixed Versus Sequential Tests	F-13
            3.      Power and Sample Sizes for the Sequential t-Test
                    with Normally Distributed Data	F-16
             4.      Modifications to Simplify the Calculations and
                    Improve the Power	  F-18
             5.      Application to Ground Water Data from Superfund
                     Sites	F-19
             6.      Conclusions and Discussion	  F-22
             Bibliography	   F-23

APPENDIX G: GLOSSARY	G-1

-------
                              LIST OF FIGURES
                                                                     Page
Figure 1.1     Steps in Evaluating Whether a Ground Water Well Has
              Attained the Cleanup Standard	1-2
Figure 2.1     Example scenario for contaminant measurements in one well
              during successful remediation action	2-3
Figure 2.2     Measures of location: Mean, median, 25th percentile, 75th
              percentile, and 95th percentile for three hypothetical
              distributions	2-10
Figure 2.3     Illustration of the difference between a short- and long-term
              mean concentration	2-14
Figure 2.4     Hypothetical power curve	2-20
Figure 3.1     Steps in defining the attainment objectives	3-2
Figure 5.1     Example scenario for contaminant measurements during
              successful remedial action	5-2
Figure 5.2     Example of data from a monitoring well exhibiting a
              seasonal pattern	5-11
Figure 6.1     Example Scenario for Contaminant Measurements During
              Successful Remedial Action	6-1
Figure 6.2     Example of a Linear Relationship Between Chemical
              Concentration Measurements and Time	6-3
Figure 6.3     Plot of data from Table 6.1	6-10
Figure 6.4     Plot of data and predicted values from Table 6.1	6-10
Figure 6.5     Examples of Residual Plots (source: adapted from figures in
              Draper and Smith, 1966, page 89)	6-12
Figure 6.6     Plot of residuals from Table 6.1	6-13
Figure 6.7     Examples of R-Square for Selected Data Sets	6-15
Figure 6.8     Plot of Mercury Measurements as a Function of Time (See
              Box 6.16)	6-21
Figure 6.9     Comparison of Observed Mercury Measurements and
              Predicted Values under the Fitted Model (See Box 6.16)	6-22
Figure 6.10   Plot of Residuals Against Time for Mercury Example  (see
              Box 6.17)	6-24

Figure 6.11   Plot of Mercury Concentrations Against x = 1/√T, and
              Alternative Fitted Model (see Box 6.17)	6-24

Figure 6.12   Plot of Residuals Based on Alternative  Model (see Box
              6.17)	6-25

Figure 6.13   Plot of Ordered Residuals Versus Expected Values for
              Alternative Model (see Box 6.17)	6-25

Figure 6.14   Examples  of Contaminant Concentrations that Could Be
              Observed  During  Cleanup	6-27

Figure 6.15   Steps for Implementing Regression Analysis at Superfund
              Sites	6-28

Figure 6.16   Example  of a  Nonlinear Relationship Between Chemical
              Concentration  Measurements and Time	6-30

Figure 6.17   Examples  of  Nonlinear  Relationships	6-30

Figure 6.18   Plot of Benzene Data and Fitted Model (see  Box 6.22)	6-37

Figure 7.1     Example  Scenario for Contaminant Measurements During
              Successful Remedial Action	7-1

Figure 7.2    Example of Time Chart for Use in Assessing Stability	7-6

Figure 7.3    Example  of Apparent Outliers	 7-10

Figure 7.4    Example  of a Six-point Upward Trend in the  Data	7-10

Figure 7.5    Example  of a Pattern in the Data that May Indicate an
              Upward  Trend	7-11

Figure 7.6    Example  of a Pattern in the Data that May Indicate a
              Downward Trend	7-11

Figure 7.7    Example of Changing Variability in the Data Over Time	7-12

Figure 7.8    Example  of a  Stable Situation with Constant Average and
              Variation	7-12

Figure 8.1     Example  Scenario for Contaminant Measurements During
              Successful Remedial Action	8-1

Figure 8.2    Steps in  the Cleanup Process  When Using a  Fixed Sample
              Size Test	8-3

Figure 8.3     Plot of Arsenic Measurements for 16 Ground Water Samples
              (see Box 8.21)	8-24

Figure 9.1     Example Scenario for  Contaminant Measurements During
              Successful  Remedial Action	9-1

Figure 9.2     Steps in the Cleanup Process  When Using a Sequential
              Statistical  Test	9-3

Figure D.1     Theoretical Autocorrelation Function Assumed in the Model
              of the Ground  Water  Data	D-4

Figure D.2    Examples of Data with Serial Correlations of 0, 0.4, and
              0.8. The higher the serial correlation, the more the
              distribution dampens out	D-5

Figure F.1     Differences in Sample Size Using Equations Based on a
              Normal Distribution (Known Variance) or a t Statistic,
              Assuming α = .10 and β = .10	F-9

-------
                              LIST OF TABLES
Table 2.1     False  positive and negative decisions	2-18

Table 3.1     Points to consider when trying to choose among the mean,
             upper proportion/percentile, or median	3-6

Table 3.2     Recommended parameters to test when comparing the
             cleanup standard to the concentration of a chemical with
             chronic effects	3-7

Table 4.1     Locations in this document of discussions of sample designs
             and analysis for ground water sampling	4-6

Table 5.1     Summary of notation used in Chapters 5  through 9	5-4

Table 5.2     Alternative formulas for the standard error of the mean	5-20

Table 5.3     Values of M and N+l-M and confidence  coefficients for
             small  samples	5-32

Table 5.4     Example contamination data used in Box 5.19 to generate
             nonparametric confidence interval	5-33

Table 6.1     Hypothetical Data for the Regression Example in Figure 6.3	6-9

Table 6.2     Hypothetical concentration measurement for mercury (Hg) in
             ppm for 20 ground water samples taken at monthly intervals	6-21

Table 6.3     Benzene concentrations in 15 quarterly samples (see Box
             6.22)	6-37

Table 8.1     Arsenic measurements (ppb) for 16 ground water samples
             (see Box 8.21)	8-25

Table A.1     Tables of t for selected alpha and degrees of freedom	A-1

Table A.2    Tables of z for selected alpha	A-2

Table A.3    Tables of k for selected alpha, P0, and sample size for use in
             a tolerance interval test	A-3

Table A.4    Recommended number of samples per seasonal period (np)
             to minimize total cost for assessing attainment	A-6

Table A.5    Variance factors F for determining sample size	A-7

Table D.1     Decision criteria for determining whether the ground water
             concentrations attain the cleanup standard	D-7
Table F.1     Coefficients for the terms a_t, a_{t-1}, etc., in the sum of three
             successive correlated observations	F-7

Table F.2     Differences between the calculated sample sizes using a t
             distribution and a normal distribution when the sample size
             based on the t distribution is 20, for selected values of α
             (Alpha) and β (Beta)	F-10

-------
                              LIST  OF BOXES


                                                                    Page

Box 2.1      Construction of Confidence Intervals Under Assumptions of
             Normality	2-16

Box 4.1      Example of Procedure for Specifying a Systematic Sample
             Design	4-4

Box 5.1      Calculating Sample Mean, Variance, and Standard Deviation	5-6

Box 5.2      Calculating the Standard Error Treating the Sample 	5-9

Box 5.3      Calculating the Standard Error Using Estimates Between
             Adjacent Observations	5-10

Box 5.4      Calculating Seasonal Averages and Sample Residuals	5-12

Box 5.5      Calculating the Standard Error After Removing Seasonal
             Averages	5-13

Box 5.6      Calculating the Standard Error After Removing Seasonal
             Averages	5-14

Box 5.7      Calculating the Serial Correlation from the Residuals After
             Removing Seasonal Averages	5-15

Box 5.8      Estimating the Serial Correlation Between Monthly
             Observations	5-16

Box 5.9      General Construction of Two-sided Confidence Intervals	5-18

Box 5.10     General Construction of One-sided Confidence Intervals	5-19

Box 5.11     Construction of Two-sided Confidence Intervals	5-19

Box 5.12     Comparing the Short Term Mean to the Cleanup Standard
             Using Confidence Intervals	5-21

Box 5.13     Example: Calculation of Confidence Intervals	5-22

Box 5.14     Calculation of the Durbin-Watson Statistic	5-22

Box 5.15     Large Sample Confidence Interval for the Serial Correlation	5-23

Box 5.16     Tolerance Intervals: Testing for the 95th Percentile with
             Lognormal Data	5-27

Box 5.17     Calculation of Confidence Intervals	5-30
Box 5.18     Calculation of M	5-32
Box 5.19     Example of Constructing Nonparametric Confidence
             Intervals	5-34
Box 5.20     Estimating σ from Data Collected Prior to Remedial Action	5-36
Box 5.21     Example of Sample Size Calculations	5-38
Box 5.22     Calculating Sample Size for Tolerance Intervals	5-39
Box 5.23     Sample Size Determination for Estimating Proportions	5-40
Box 6.1      Simple Linear Regression Model	6-4
Box 6.2      Calculating Least Square Estimates	6-6
Box 6.3      Estimated Regression Line	6-6
Box 6.4      Calculation  of Residuals 	6-7
Box 6.5      Sum of Squares Due to Error and the Mean Square Error 	6-7
Box 6.6      Five Basic Quantities for Use in Simple Linear Regression
             Analysis	6-8
Box 6.7      Calculation of the Estimated Model Parameters and SSE	6-8
Box 6.8      Example of Basic Calculations for Linear Regression	6-9
Box 6.9      Coefficient of Determination	6-14
Box 6.10     Calculating the Standard Error of the Estimated Slope	6-16
Box 6.11     Calculating a Confidence Interval Around the Slope	6-16
Box 6.12     Using the Confidence Interval for the Slope to Identify a
             Significant Trend	6-18
Box 6.13     Calculating the Standard Error and Confidence Intervals for
             Predicted Values	6-18
Box 6.14     Using the Simple Regression Model to Predict Future
             Values	6-19
Box 6.15     Calculating the Standard Error and Confidence Interval for a
             Predicted Mean	6-20
Box 6.16     Example of Basic Regression Calculations	6-22
Box 6.17     Analysis of Residuals for Mercury Example	6-23
Box 6.18      Suggested  Transformations	 6-31
Box 6.19      Transformation to "New" Model	 6-34
Box 6.20      "New" Fitted Model  for  Transformed Variables	6-34
Box 6.21      Slope and Intercept of Fitted Regression Line in Terms of
              Original Variables	 6-35
Box 6.22      Correcting  for  Serial  Correlation	 6-36
Box 6.23      Constructing  Confidence Limits  around an Expected
              Transformed Value	 6-38
Box 7.1       Adjusting for Seasonal Effects	 7-5
Box 8.1       Steps for Determining Sample Size for Testing the Mean	8-6
Box 8.2       Example of Sample Size Calculations for Testing the Mean	8-9
Box 8.3       Determining Sample Size for Testing Proportions	8-10
Box 8.4       Choosing a Sampling Interval Using the Darcy Equation	8-11
Box 8.5       Steps  for Assessing Attainment  Using Yearly  Averages	8-13
Box 8.6       Calculation of the Yearly  Averages	 8-14
Box 8.7       Calculation of the Mean and Variance of the Yearly
              Averages	 8-14
Box 8.8       Calculation of Seasonal Averages and the Mean of the
              Seasonal  Averages	 8-15
Box 8.9       Calculation of Upper  One-sided Confidence Limit for the
              Mean	 8-16
Box 8.10      Deciding if the Tested Ground Water Attains the  Cleanup
              Standard	8-16
Box 8.11      Example of Assessing Attainment of the Mean  Using Yearly
              Averages	 8-17
Box 8.12      Steps for Assessing Attainment Using the Log Transformed
              Yearly  Averages	8-18
Box 8.13      Calculation of the Natural  Logs  of the Yearly  Averages	8-18
Box 8.14      Calculation of the Mean and Variance of the Natural Logs of
              the Yearly Averages	 8-19
Box 8.15     Calculation of the Upper Confidence Limit for the Mean
             Based on Log Transformed Yearly Averages	8-20
Box 8.16     Steps for Assessing Attainment Using the Mean After
             Adjusting for Seasonal Variation	8-21
Box 8.17     Calculation of the Residuals	8-22
Box 8.18     Calculation of the Variance  of the Residuals	8-22
Box 8.19     Calculating the Serial Correlation from the Residuals After
             Removing Seasonal Averages	8-22
Box 8.20     Calculation of the Upper Confidence Limit for the Mean
             After Adjusting for Seasonal Variation	8-23
Box 8.21     Example Calculation of Confidence Intervals	8-24
Box 9.1      Steps for Determining Sample Frequency for Testing the
             Mean	9-5
Box 9.2      Steps for Determining Sample Frequency for Testing a
             Proportion	9-6
Box 9.3      Example of Sample Frequency Calculations	9-6
Box 9.4      Steps for Assessing Attainment Using Yearly Averages	9-9
Box 9.5      Calculation of the  Yearly Averages	 9-10
Box 9.6      Calculation of the Mean and Variance of the Yearly
             Averages	9-10
Box 9.7      Calculation of Seasonal Averages and the Mean of the
             Seasonal Averages	9-11
Box 9.8      Calculation of t and δ When Using the Untransformed
             Yearly  Averages	9-12
Box 9.9      Calculation of the Likelihood Ratio for the Sequential Test	9-12
Box 9.10     Deciding if the Tested Ground Water Attains the Cleanup
             Standard	9-13
Box 9.11     Example Attainment Decision Based on a Sequential Test	9-14
Box 9.12     Steps for Assessing Attainment Using the Log Transformed
             Yearly  Averages	9-15
Box 9.13     Calculation of the Natural Logs of the Yearly Averages	9-16
Box 9.14     Calculation of the Mean and Variance of the Natural Logs of
             the Yearly Averages	9-16
Box 9.15     Calculation of t and δ When Using the Log Transformed
             Yearly  Averages	9-17
Box 9.16     Steps for Assessing Attainment Using the Mean After
             Adjusted for Seasonal Variation	9-19
Box 9.17     Calculation of the Residuals	9-19
Box 9.18     Calculation of the Variance  of the Residuals	9-20
Box 9.19     Calculating the Serial Correlation from the Residuals After
             Removing Seasonal Averages	9-20
Box 9.20     Calculation of t and δ When Using the Mean Corrected for
             Seasonal Variation	9-21
Box 9.21     Calculation of the Likelihood Ratio for the Sequential Test
             When Adjusting for Serial Correlation	9-21
Box 9.22     Example Calculation of Sequential Test Statistics after
             Adjustments for Seasonal Effects and Serial Correlation	9-22
Box D.1      Modeling the Data	D-2
Box D.2      Autocorrelation Function	D-3
Box D.3      Revised Model for Ground Water Data	D-7

-------
                       AUTHORS  AND  CONTRIBUTORS
          This manual represents the combined effort of several organizations and many
individuals. The names  of the primary contributors, along with the role of each
organization, are summarized below.
Westat, Inc., 1650 Research Boulevard, Rockville, MD 20850, Contract No.
68-01-7359, Task 11 -- research, statistical procedures, draft and final draft report. Key Westat
staff included:
                    John Rogers            Adam  Chu
                    Ralph  DiGaetano        Ed Bryant
                    Contract Coordinator: Robert Clickner
Dynamac Corporation, 11140 Rockville Pike, Rockville, MD 20852 (subcontractor to
Westat) -- consultation on the sampling of ground water, treatment alternatives, and
chemical analysis. Key Dynamac staff included:
                    David  Lipsky            Richard Dorrler
                                  Wayne Tusa
EPA, OPPE, Statistical Policy Branch -- project management, technical input, peer
review. Key EPA staff included:
                    Barnes Johnson         Herbert Lacayo
SRA Technologies, Inc., 4700 King Street, Suite  300,  Alexandria,  VA 22302,
Contract No. 68-01-7379, Delivery Order 16 -- editorial and graphics support, technical
review, and preliminary draft preparation. Key SRA staff included:
                    Marcia Gardner         Karl Held
                    Lori Hidinger          Alex Polymenopoulos
                    Mark  Ernstmann         Jocelyn Smith
-------
                           EXECUTIVE SUMMARY

          This document provides regional project managers, on-site coordinators, and
their contractors with sampling and analysis methods for evaluating whether ground water
remediation has met pre-established cleanup standards for one or more chemical
contaminants at a hazardous waste site. The verification of cleanup by evaluating a site
relative to a cleanup standard or an applicable or relevant and appropriate requirement
(ARAR) is mandated in Section 121 of the Superfund Amendments and Reauthorization
Act (SARA). This document, the second in a series, provides sampling and data analysis
methods for the purpose of verifying attainment of a cleanup standard in ground water.
The first volume addresses evaluating attainment in soils and solid media.

          This document presents statistical methods that can be used to address the
uncertainty of whether a site has met a cleanup standard. Superfund managers face the
uncertainty of having to make a decision about the entire site based only on samples of the
ground water at the site, often collected for only a limited time period.

          The methods in this document approach cleanup standards as having three
components that influence the overall stringency of the standard: first, the magnitude,
level, or concentration deemed to be protective of public health and the  environment;
second, the sampling performed to evaluate whether a site is above or below the standard;
and third, the method of comparing sample data to the standard to  decide whether the
remedial action was successful.  All three of these  components are important. Failure to
address any one of these components can result in insufficient levels of cleanup. Managers
must look beyond the cleanup level and explore the sampling and analysis methods which
will allow confident assessment of the site relative to the cleanup standard.

          A site manager is likely to confront two major questions in evaluating the
attainment of the cleanup standard: (1) is the site really  contaminated because a few
samples are above the cleanup standard?   and (2) is the site really "clean" because the
sampling shows the majority of samples to be below the cleanup standard?  The statistical
methods demonstrated  in this guidance  document allow for decision making under
uncertainty and permit valid extrapolation of information that can be defended and used
with confidence to determine whether the site meets  the  cleanup standard.

          The presentation of concepts and solutions to potential problems in assessing
ground water attainment begins with an introduction to the statistical reasoning required to
implement these methods.   Next, the planning activities, requiring input from both
statisticians and nonstatisticians, are described. Finally, a series of methodological
chapters is presented to address statistical procedures applicable to successive stages in the
remediation effort. Each chapter will now be considered in detail.

          Chapter 1 provides a brief introduction to the  document,  including its
organization, intended use, and applications for a variety of treatment technologies. A
model for the sequence of ground water remediation activities at the site is described.
Many areas  of expertise must be involved in any remedial action process. This  document
attempts to  address only statistical procedures relevant to evaluating the attainment of
cleanup goals.

          The cleanup  activities at the site  will include site investigation, ground water
remediation, a post-treatment period allowing the ground water to reach steady state,
sampling and analysis to assess attainment, and possible post-cleanup monitoring.
Different statistical procedures  are applicable  at different stages  in the cleanup process. The
statistical procedures used must account for the changes in the ground water system over
time due to natural or man-induced causes. As a result, the discussion makes a distinction
between short-term estimates which might be used during remediation and long-term
estimates which are used to assess attainment. Also, a slack period of time after treatment
and before assessing attainment is strongly recommended to allow any transient effects of
treatment to dissipate.

           Chapter 2 addresses statistical concepts as they might relate to the evaluation of
attainment. The chapter discusses the form of the null and alternative hypotheses, types of
errors, statistical power curves, the handling  of outliers and values below detection limits,
short- versus long-term tests, and assessing wells individually or as a group. Due to the
cost of developing new wells, the assessment decision is assumed to be based on
established wells. As a result, the statistical conclusions strictly apply only  to the water in
the sampling  wells rather than the ground  water  in  general.  The expertise  of  a
hydrogeologist can be useful for making conclusions about the ground water at the site
based on the statistical results from the  sampled wells.

          The procedures in this document favor protection of the environment and
human health. If uncertainty is large or the sampling inadequate, these methods conclude
that the sample area does not attain the cleanup standard. Therefore, the null hypothesis, in
statistical terminology, is that the site does not attain the cleanup standard until sufficient
data are acquired to prove otherwise.
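
          As a minimal sketch of this conservative decision rule (not the procedure
defined in the later chapters), the Python fragment below declares attainment only when a
one-sided upper confidence limit on the mean falls below the cleanup standard. The function
name attains_standard, the example measurements, and the 5.0 ppm standard are hypothetical.
The critical value is assumed to be read from a Student's t table, and the data are assumed
to be independent and roughly normal, with no adjustment for seasonal variation or serial
correlation.

```python
import math
import statistics


def attains_standard(concentrations, cleanup_standard, t_critical):
    """Return (attained, upper_limit) for a one-sided test of the mean.

    Null hypothesis: the mean concentration does NOT attain the standard.
    The null is rejected only if the one-sided upper confidence limit on
    the mean falls below the cleanup standard.
    """
    n = len(concentrations)
    if n < 2:
        # Too little data to estimate variability: stay with the null hypothesis.
        return False, math.inf
    mean = statistics.mean(concentrations)
    std_err = statistics.stdev(concentrations) / math.sqrt(n)
    upper_limit = mean + t_critical * std_err
    return upper_limit < cleanup_standard, upper_limit


# Hypothetical data: eight measurements (ppm) against a 5.0 ppm standard.
# 1.895 is the tabled 95th percentile of Student's t with 7 degrees of freedom.
samples = [3.1, 4.2, 2.8, 3.9, 4.6, 3.3, 2.9, 4.0]
attained, ucl = attains_standard(samples, cleanup_standard=5.0, t_critical=1.895)
print(attained, round(ucl, 2))
```

          With few or highly variable measurements the upper limit widens, so the sketch
falls back to "does not attain," which mirrors the choice of null hypothesis described above.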

          Procedures used to combine data from separate wells or contaminants to
determine whether the site as a whole attains all relevant cleanup standards are discussed.
How the data from separate wells are combined affects the interpretation of the results and
the probability of concluding that the overall site attains the cleanup standard. Testing the
samples from individual wells or groups of wells is also discussed.

          Chapter 3 considers the steps involved in specifying the attainment objectives.
Attainment objectives  must be specified before the evaluation of whether a site has attained
the cleanup standard can be made. Attainment objectives are not specified by statisticians
but rather must be provided by a combination of risk assessors,  engineers, project
managers, and hydrogeologists. Specifying  attainment objectives  includes specifying the
chemicals of concern, the cleanup standards, the wells to be sampled, the statistical criteria
for defining attainment, the parameters to be tested, and the precision and confidence level
desired.

          Chapter 4 discusses the specification of the sampling and analysis  plans. The
sampling and analysis plans are prerequisites  for the statistical methods presented  in the
following chapters. A discussion of common sampling plan designs and approaches to
analysis is presented. The sample designs discussed include simple random sampling,
systematic sampling, and sequential sampling.  The analysis plan is developed in
conjunction  with the  sample  design.

          Chapter 5 provides methods which are appropriate for describing ground water
conditions during a specified period of time. These methods are useful for making a quick
evaluation of the ground water conditions, such as during remediation. Because the  short-
term confidence intervals reflect only variation within the sampling period and not long-
term trends or shifts between periods, these methods are not appropriate for assessing
attainment of the cleanup standards after the planned remediation has been completed.
However, these descriptive procedures can be used to estimate means,  percentiles,
confidence intervals, tolerance intervals and variability. Equations are also provided to
determine the sample size required for each statistical test and to adjust for seasonal
variation  and serial  correlation.

           Chapter 6 addresses statistical procedures which are useful during remediation,
particularly in deciding when to terminate treatment. Due to the complex dynamics of the
ground water flow in response to pumping, other remediation activity, and natural forces,
the decision to  terminate treatment  cannot easily  be based on statistical procedures.
Deciding when  to terminate treatment should be based on a combination of statistical
results,  expert knowledge, and policy  decisions.  This chapter  provides some  basic
statistical procedures which can be used to help guide the termination decision, including
the use of regression methods for helping to decide when to stop treatment. In particular,
procedures are  given  for estimating the trend in contamination levels and  predicting
contamination levels at future points in time. General methods for fitting simple linear
models and assessing the adequacy of the model are also discussed.
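
           The idea of estimating a trend and predicting a future contamination level can
be sketched as follows. This is an illustrative ordinary least-squares fit, not the Chapter 6
procedure itself; the function names fit_line and predict and the quarterly data are
hypothetical, and a straight line is assumed to describe the data over the period shown.

```python
def fit_line(times, concentrations):
    """Ordinary least-squares fit of concentration = intercept + slope * time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_c = sum(concentrations) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, concentrations))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_t
    return slope, intercept


def predict(slope, intercept, time):
    """Predicted concentration at the given time under the fitted line."""
    return intercept + slope * time


# Hypothetical quarterly measurements (time in quarters, concentration in ppm).
quarters = [0, 1, 2, 3, 4, 5, 6, 7]
ppm = [12.0, 11.1, 10.3, 9.2, 8.4, 7.1, 6.3, 5.2]

slope, intercept = fit_line(quarters, ppm)
forecast = predict(slope, intercept, 10)  # extrapolated level at quarter 10

# Residuals give a first look at model adequacy: a pattern in them
# (for example, curvature) suggests the straight line is a poor description.
residuals = [c - predict(slope, intercept, t) for t, c in zip(quarters, ppm)]
print(round(slope, 3), round(forecast, 2))
```

           A negative slope indicates declining concentrations. Predictions beyond the
sampled period depend heavily on the linear model continuing to hold, which is why the
adequacy of the model should be checked before the forecast is used.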

           Chapter  7  discusses general statistical methods  for evaluating whether the
ground water system has reached  steady state and therefore  whether sampling to assess
attainment can begin.  As a result of the treatment  used  at the  site, the ground water system
will be disturbed from its natural level of steady state. To reliably evaluate whether the
ground water can be expected to attain the cleanup standard after remediation, samples  must
be collected under conditions similar to those which  will exist in the future. Thus, the
sampling for assessing attainment can only occur when the residual effects of treatment on
the  ground water are small compared to those of natural forces.

           Finding that the ground water has returned to a steady state after terminating
remediation efforts is an essential step in establishing a meaningful test of whether or not
the cleanup standards have been attained. There are uncertainties in the process, and to
some extent it is judgmental.  However, if an adequate amount of data is carefully  gathered
prior to beginning remediation and after ceasing remediation, reasonable decisions can be
made as  to whether  or not the ground water can be considered to have reached a state of
stability. The decision on whether the ground water has reached steady state will be based
on a combination of statistical calculations, plots of data, ground water modeling using
predictive models, and  expert  advice from  hydrogeologists familiar with the site.

          Chapters 8 and 9 present the statistical procedures which can be used to evaluate
whether the contaminant concentrations in the sampling wells attain the cleanup standards
after the ground water has reached steady state. The suggested methods use either a fixed
sample size test (Chapter 8) or a sequential statistical test (Chapter 9). The testing
procedures can be applied to either samples from individual wells or wells tested as a
group. Chapter 8 presents fixed sample size tests for assessing attainment of the mean:
using yearly averages or after adjusting for seasonal variation; using a nonparametric test
for proportions; and using a nonparametric confidence interval about the median.  Chapter
9 discusses sequential statistical tests for assessing attainment of the mean using yearly
averages,  assessing attainment of the mean after adjusting  for seasonal variation,  and
assessing attainment using a nonparametric test for proportions. In both fixed sample  size
tests and sequential tests, the ground water at the site is judged to attain the cleanup
standards if the contaminant levels are below the standard and are not increasing over time.
If the ground water at the site attains the cleanup standards, follow-up monitoring is
recommended  to ensure that the steady state  assumption holds.
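
          To illustrate the kind of reasoning behind a nonparametric test for proportions,
the sketch below applies a one-sided exact binomial test to the number of samples that
exceed the cleanup standard. It is an assumption-laden illustration rather than the test
specified in Chapters 8 and 9: the function names, the 10 percent allowable proportion, and
the 0.05 significance level are hypothetical, the samples are assumed to be independent, and
the separate check that concentrations are not increasing over time is not included.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def proportion_attained(concentrations, cleanup_standard, max_proportion, alpha):
    """One-sided exact binomial test on the proportion of exceedances.

    Null hypothesis: the true proportion of samples above the standard is
    at least max_proportion (the well does not attain the standard).
    Returns (attained, p_value).
    """
    n = len(concentrations)
    exceedances = sum(1 for c in concentrations if c > cleanup_standard)
    p_value = binomial_cdf(exceedances, n, max_proportion)
    return p_value <= alpha, p_value


# Hypothetical cases: no exceedances of a 5.0 ppm standard in 20 and in 30 samples.
few = proportion_attained([1.0] * 20, 5.0, max_proportion=0.10, alpha=0.05)
many = proportion_attained([1.0] * 30, 5.0, max_proportion=0.10, alpha=0.05)
print(few[0], many[0])
```

          In this hypothetical case, 20 clean samples are not enough to reject the null
hypothesis while 30 are, which shows how the amount of sampling enters the stringency of
the standard.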

          Although the  primary focus of the document  is the procedures presented in
Chapters 8 and 9 for evaluating attainment, careful consideration of when to terminate
treatment and how long to wait for steady state are important in the overall planning. If the
treatment is terminated prematurely, excessive time may be spent in evaluating attainment
only to have to restart treatment to complete the remediation, followed by a second period
of attainment sampling and  decision.  If the ground water is not at steady state, the
possibility  of incorrectly determining the attainment  status of the  site increases.

          As an aid to the reader, a glossary of commonly-used terms is provided in
Appendix G; calculations and examples are presented in  boxes within the text;  and
worksheets with examples  are  provided in Appendix B.

-------
                           1.   INTRODUCTION
              Congress revised the Superfund legislation in the Superfund Amendments
and Reauthorization Act of 1986 (SARA). Among other provisions of SARA, section 121
on  Cleanup Standards discusses  criteria for selecting applicable or relevant and  appropriate
requirements (ARARs) for cleanup and includes specific language that requires
EPA-mandated remedial action to attain the ARARs.

              Neither SARA nor EPA regulations or guidances specify how to determine
whether the cleanup  standards  have been attained. This document offers procedures that
can be used to determine whether a site has attained the  appropriate cleanup standard after a
remedial  action.
 1.1           General Scope and Features of the Guidance Document

 1.1.1        Purpose

              This document provides a foundation for decision-making regarding site
cleanup by providing methods that statistically compare risk standards with field data in a
scientifically defensible manner that allows for uncertainty. Statistical procedures can be
used for many different purposes in the process of a Superfund site cleanup.  The purpose
of this document is to provide statistical procedures which can be used to  determine if
contaminant concentrations measured in selected ground-water wells  attain (i.e., are less
than)  the cleanup standard. This evaluation requires specification of  sampling protocols
and statistical analysis methods. Figure 1.1 shows the steps involved in the evaluation
process to determine whether the cleanup standard has been attained in a selected ground
water  well.

Figure 1.1    Steps in Evaluating Whether a Ground Water Well Has Attained the
              Cleanup Standard

[Flowchart: Start; Define Attainment Objectives (Chapter 3); Specify Sampling and
Analysis Plan (Chapters 4 and 5); Decide to Terminate Treatment (Chapter 6);
Determine Steady State (Chapter 7); Assess the Attainment of the Cleanup Standard
(Chapters 8 and 9); decision boxes "Is the Cleanup Standard Attained?" and "Do
Concentrations Increase over Time?"; Declare that the Well Attains the Cleanup
Standard and Continue to Monitor as Necessary.]

                 Consider the situation where several samples were taken and the results
   indicated that one or two of the samples exceed the cleanup standard. How should this
   information be used to decide whether the standard has been attained? The mean of the
   samples might be compared with the standard. The magnitude of the measurements that are
   larger than the standard might be taken into consideration in making a decision. The
   location where large measurements occur might provide some insight.

                When specifying how attainment is to be defined and deciding how statistical
   procedures can be used, the following factors are all important:

                       The location of the sampling wells and the associated relationship
                       between concentrations in  neighboring  wells;
                       The number  of samples to be taken;
                       The sampling procedures  for  selecting and  obtaining water samples;

                       The data analysis procedures used to test  for attainment.
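
                The alternative summaries mentioned above (the mean compared with the
   standard, the number of exceedances, and the magnitude of the exceedances) can be
   computed side by side, as in the following sketch. The function name
   summarize_against_standard, the measurements, and the 5.0 ppm standard are hypothetical;
   the sketch only describes the data and does not decide attainment.

```python
import statistics


def summarize_against_standard(concentrations, cleanup_standard):
    """Descriptive comparisons of a set of measurements with a cleanup standard."""
    mean = statistics.mean(concentrations)
    exceedances = [c for c in concentrations if c > cleanup_standard]
    return {
        "mean": mean,
        "mean_below_standard": mean < cleanup_standard,
        "n_exceedances": len(exceedances),
        "proportion_exceeding": len(exceedances) / len(concentrations),
        # Largest amount by which any single measurement exceeds the standard.
        "largest_exceedance": max(
            (c - cleanup_standard for c in exceedances), default=0.0
        ),
    }


# Hypothetical data: ten measurements (ppm), two of which exceed 5.0 ppm.
samples = [2.1, 3.4, 6.2, 1.8, 2.9, 5.5, 2.2, 3.0, 2.6, 2.3]
summary = summarize_against_standard(samples, cleanup_standard=5.0)
print(summary)
```

                Here the mean lies below the standard even though two measurements exceed
   it, which is exactly the situation in which the definition of attainment must be
   specified in advance.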

                Appendix D lists relevant EPA guidance documents on sampling and
   evaluating ground water. These documents address both the statistical and technical
   components of a sampling and analysis program. This document is intended to extend the
   methodologies they  provide  by addressing statistical issues in the evaluation of the
   remediation process. This document does not attempt to suggest which standards apply or
   when they apply (i.e., the "How clean  is  clean?" issue).  Other Superfund guidance
   documents  perform that function.


1.1.2        Intended Audience and Use

                This document is  intended  primarily for Agency  personnel (primarily  on-site
   coordinators and regional project managers),  responsible parties, and their contractors who
   are involved with monitoring the progress of ground-water remediation at Superfund sites.
   Although selected introductory statistical concepts are reviewed, this document is directed
   toward readers who have had some prior training or experience applying quantitative
   methods.

              It must be emphasized that this document is intended to provide general
direction and assistance to individuals involved in the evaluation of the attainment of
cleanup standards. It is not a regulation nor is it formal guidance from the Superfund
Office. This manual should not be viewed as a "cookbook" or a replacement for good
engineering or statistical judgment.
1.1.3        Bibliography,  Glossary,  Boxes, Worksheets, Examples,  and
              References to "Consult a Statistician"
              This document includes a bibliography which provides a point of departure
for the more sophisticated or interested user. There are references to primary textbooks,
pertinent journal  articles, and related guidances.

              The glossary (Appendix F)  is included to provide short, practical definitions
of terminology used in this guidance.  Words and phrases appearing  in bold within the text
are listed in the glossary. The glossary does not use theoretical explanations or formulas
and,  therefore,  may not be as precise as the text or alternative sources of information.

              Boxes  are used throughout  the document to separate and highlight equations
and example applications of the methods presented.  For a quick reference, a listing of all
boxes and their page numbers is provided in the index.

              A series of worksheets is included (Appendices B and C) to help order and
structure the calculations. References to the pertinent sections of the document are located
at the top of each worksheet. Example data and calculations are presented in the boxes and
the worksheets in Appendix B. The data and sites are hypothetical, but elements of the
examples correspond closely to several existing sites.

              Finally, the document often directs the reader to "consult a statistician"
when more difficult and complicated situations are encountered. A directory of Agency
statisticians is available from the Environmental Statistics and Information Division (PM-
222) at EPA Headquarters (FTS 260-2680, 202-260-2680).

 1.2          Use of this Guidance in Ground-Water Remediation Activities

              Standards that apply to Superfund activities normally fall into the category
 of risk-based standards which are developed using risk assessment methodologies.
 Chemical-specific ARARs adopted from other programs often include at least a generalized
 component of risk.  However, risk standards may be specific to a site, developed using a
 local  endangerment  evaluation.

              Risk-based standards are expressed as a concentration value and, as applied
 in the Superfund program, are not associated with a standard method of interpretation.
 Although statistical methods are used to develop elements of risk-based standards, the
 estimated uncertainties are not carried through the analysis or used to qualify the standards
 for use in a field sampling program. Even though risk standards are not accompanied by
 measures of uncertainty, decisions based on field data collected for the purpose of
 representing the entire site and validating cleanup will be subject to uncertainty. This document
 allows decision-making regarding site cleanup  by providing methods that statistically
 compare risk standards with field data in a scientifically defensible manner that allows for
 uncertainty.

               Superfund activities where risk-based standards  might apply are highly
varied. The following discussion provides suggestions for the use of procedures described
 in this document when implementing or evaluating Superfund activities.
1.2.1         Pump-and-Treat Technology

              Ground water is  often  treated by pumping contaminated ground water out of
the ground, treating the water, and  discharging the water into local surface waters or
municipal treatment plants. The contaminated ground water is gradually replaced by
uncontaminated water from the surrounding aquifer or from surface recharge. Pump and
treat systems may use a few or many wells. The progress of the remediation depends on
where the wells are placed and the schedule for pumping. Pumping is often planned to
extend over many years.

              Statistical methods presented in this manual can be used for monitoring the
contaminants in both the effluent from the treatment system and the ground water in order
to monitor the progress of the remediation.

              Project managers must decide when to terminate treatment based on avail-
able data, advice from hydrogeologists, and the results of ground-water monitoring and
modeling. This manual provides guidance on statistical procedures to help decide when to
terminate  treatment.

              The remediation may temporarily alter ground water levels and flows,
which in turn will affect the contaminant concentration levels.  After termination of treat-
ment and after the transient effects of the remediation have dissipated, the statistical proce-
dures presented in this manual can be used to assess if the ground-water contaminant
concentrations remain  at levels  which will  attain and continue to  attain the cleanup standard.


1.2.2         Barrier Methods to Protect Ground Water

              If the contamination is relatively immobile and cannot effectively be
removed from the ground water using extraction, it is sometimes handled by containment.
In such cases, establishing barriers at the surface or around the contamination source may
reduce contaminant input to the aquifer, resulting in the reduction of ground-water concen-
trations to a level which attains the cleanup standard. The barriers include soil caps to
prevent surface infiltration, and slurry walls and other structures to force ground water to
flow away from contamination sources.

              The procedures in this manual can be used to establish whether the contam-
ination levels attain the relevant standards after the ground water has established its new
levels as a result of changes in ground-water flows.


1.2.3         Biological Treatment

              In many situations natural bacteria will  adapt to the contamination in the soil
and ground water and consume the contaminants, releasing metabolic products. These
bacteria will be most effective in consuming the contaminant if the underground
environment can be controlled, including controlling the dissolved oxygen and nutrient levels.
Biological treatment of ground water usually involves pumping ground water from
downgradient locations and injecting enriched ground water at upgradient locations. The
changes in the water table levels produce an underground flow carrying the nutrients to and
throughout the contaminated soil and aquifer. Progress of the treatment can be monitored
by sampling the water being pumped from the ground and measuring contaminant and
nutrient concentrations. Biological treatment can also be accomplished above ground using
a bioreactor as a component of a pump-and-treat system.


              Monitoring wells are placed in various patterns throughout, and possibly
beyond, the area of contamination. These wells can be used to sample ground water both
during treatment to monitor progress and after treatment to assess remediation success
using the statistical methods discussed in this document.

1.3           Organization of  this Document


              The topics covered in each chapter of this document are outlined below.

              Chapter 2. Introduction to Statistical Concepts and Decisions:  introduces
                     terminology and concepts useful for understanding statistical tests
                     presented  in later chapters.

              Chapter 3. Specification of Attainment Objectives: discusses specification
                     of the attainment objectives in a way which allows selection of the
                     statistical procedures to be used.

              Chapter 4. Design of the  Sampling and Analysis Plan: discusses common
                     sampling plan designs and approaches  to the  analysis.

              Chapter 5. Descriptive Statistics: provides basic  statistical procedures
                     which are useful in all stages of the  remedial  effort.  The procedures
                     form a basis for the statistical procedures  used  for assessing
                     attainment.

              Chapter 6. Deciding to Terminate Treatment Using Regression Analysis:
                     discusses  statistical procedures which can aid  the decision-makers
                     who must decide when to terminate  treatment.

              Chapter 7. Approaching a Steady State After Terminating Remediation:
                     discusses statistical and nonstatistical  criteria for determining
                     whether the ground water system is at steady state and/or if
                     additional  remediation  might be required.

              Chapter 8. Assessing Attainment Using Fixed Sample Size Tests:
                     discusses statistical procedures based on fixed sample sizes for
                     deciding whether the concentrations in the ground water attain the
                     relevant cleanup standards.

              Chapter 9. Assessing Attainment Using Sequential Tests: discusses
                     sequential statistical procedures for deciding whether the
                     concentrations in ground water attain the relevant cleanup standards.

              Worksheets: Provided for both practical use at Superfund sites and as
                     examples of the procedures which are being recommended.
1.4           Summary

              This document provides a foundation for decision-making regarding site
cleanup by providing methods that statistically compare risk standards with field data in a
scientifically defensible manner that allows for uncertainty. In particular, the document
provides statistical procedures for assessing whether the Superfund Cleanup Standards for
ground water have been attained. The document is written primarily for agency personnel,
responsible parties and contractors.  Many areas of expertise must be involved in any
remedial action process. This  document attempts to address only the statistical input
required for the  attainment decision.

              The statistical procedures presented in this document provide methods for
comparing risk based standards with field data in a manner that allows for assessing uncer-
tainty. The procedures allow  flexibility to  accommodate site-specific  environmental
factors.

              To aid the reader, statistical calculations and examples are provided in boxes
separated from the text, and appendices contain a glossary of commonly-used terms;
statistical tables and detailed statistical information; and worksheets for implementing
procedures and calculations explained in the text.

-------
  2. INTRODUCTION  TO STATISTICAL  CONCEPTS  AND
                                DECISIONS
              This document provides statistical procedures to help answer an important
question that will arise at Superfund sites  undergoing ground water remediation:

           "Do the contaminants  in the ground water in designated
                wells at the site attain the cleanup standards?"

The cleanup standard is attained if, as a result of the remedial effort, the previously unac-
ceptably high contaminant  concentrations are reduced to a level which is acceptable and can
 be expected to remain acceptable when judged relative to the cleanup standard.

              In order to answer the question above, the following more specific ques-
tions must be answered:

                    What contaminant(s) must attain the designated cleanup standards?
                    How is attainment of the cleanup standards to be defined?
                    What is the  designated cleanup standard for the contaminant(s) being
                    assessed? and
                    Where and when should samples of the ground water be  collected?

              This chapter discusses  each of these topics  briefly, followed by an intro-
duction to statistical procedures for  assessing the  attainment of cleanup standards in ground
water at Superfund sites. Also discussed are terminology and statistical concepts which are
useful for understanding the statistical tests presented in later chapters. Basic statistical
principles and topics which have  particular applicability to ground water  at Superfund sites
are also considered.

             Later chapters discuss in detail the specification of attainment objectives and
the implementation of statistical procedures required to determine if those objectives have
been met at the Superfund site.

2.1          A Note  on Terminology

              This guidance document assumes that the reader is familiar with statistical
procedures and terminology, particularly the concepts of random sampling and hypothesis
testing, and the calculation of descriptive statistics such as means, standard deviations, and
proportions. An introduction to these statistical procedures can be found in statistical
textbooks such as Sokal and Rohlf (1981), and Neter, Wasserman, and Whitmore (1982).
The glossary provides a description of the terms and procedures used  in this document.

              In this document we will use the word clean as a shorthand for "attains the
cleanup standard" and contaminated for "does not attain the cleanup standard."

              The term sample can be used in two different ways.  One refers to  a
physical water  sample collected  for laboratory analysis while the other refers to a collection
of data called a statistical sample. To avoid confusion, the physical water sample will be
called a physical sample or water sample.  Otherwise, the word  sample will refer to
a statistical sample, i.e., a collection of randomly selected physical samples obtained for
assessing  attainment of the cleanup  standard.


2.2          Background for the Attainment  Decision

              In general, over time, a Superfund site will go through the following
phases:

                    Contamination;
                    Realization that a  problem exists;
                    Investigation to determine the extent of the problem;
                    Selection of a  remediation plan to alleviate the problem;
                    Cleanup (which may occur in  several steps);
                    Termination  of cleanup;
                    Final  determination that the cleanup has achieved the  required  goals;
                    and
                    Termination  of the remediation effort.


              This document focuses on the post-cleanup phase and particularly on the
sampling and statistical procedures for determining if the site has attained the required
cleanup standards.

2.2.1        A Generic Model of Ground-Water Cleanup Progress

             During the planning and execution of remedial  action and the sampling and
analysis for assessing attainment, numerous activities must take place as indicated in the
following scenario and illustrated in Figure 2.1. This figure will be used throughout the
document to indicate to the reader at which step in the remedial process the procedures
being discussed in a chapter are applicable. A discussion of each step follows Figure 2.1.
Figure 2.1    Example scenario for contaminant measurements in one well during
             successful remediation action
             [Plot of measured ground-water concentration (vertical axis) against date
             (horizontal axis), with these events marked in order: Start Treatment,
             End Treatment, Start Sampling, End Sampling, Declare Clean or Contaminated.]

(1)    Evaluate the site; determine the remedial action to be used
       Although evaluation of the site and selection of the cleanup technology may require
       the use of several statistical procedures, this document does not address this aspect
       of the remedial effort.

(2)    Perform remedial cleanup
       During a successful remedial cleanup, the concentrations of contaminants can be
       expected to have a decreasing trend. Due to seasonal changes, natural fluctuations,
       changes in pumping schedules, lab measurement error, etc., the measured concentrations
       will fluctuate around the trend. Some statistical procedures that could be used to
       analyze data during treatment are discussed in Chapter 5.

(3)    Decide when to terminate remedial treatment
       Based on both expert knowledge of the ground-water system and data collected during
       treatment, it must be decided when to terminate treatment and prepare for the sampling
       and analysis for assessing attainment. Statistical procedures relevant to the
       termination decision are discussed in Chapter 6. Analysis of data collected during
       treatment may indicate that the cleanup standards will not be achieved by the chosen
       cleanup methods, in which case the cleanup technology and goals must be reassessed.

(4)    Assess when the ground water concentrations reach steady state
       The ground-water system will be disturbed from its natural level and flow by the
       treatment process, including perhaps pumping or reinjection of ground water. After
       treatment is terminated, the transient effects will dissipate and the ground-water
       levels and flows will gradually reach their natural levels. In this process, the
       contaminant concentrations may change in unpredictable ways. Before the assessment is
       initiated, the ground water must be able to return to its natural level and flow
       pattern, called steady state, so that the data collected are relevant to assess
       conditions in the future. Sampling and analysis during the return to natural
       conditions are discussed in Chapter 7. The ground water at a particular site will be
       considered to have achieved steady state if the assumption of steady state is
       consistent with both statistical tests and the advice of a hydrogeologist familiar
       with the site. The attainment sampling can begin once it is determined that the site
       is at steady state.

(5)    Sample to assess attainment
       After the water levels and flows have reached steady state, sampling to assess
       attainment of the cleanup standards can begin. Statistical procedures for assessing
       attainment are presented in Chapters 8 and 9. The statistical tests used may be either
       fixed sample size tests or sequential tests. At many sites sequential tests will
       probably be preferred. During the assessment phase, measured concentrations are
       expected to either fluctuate around a constant or gradually decreasing concentration.
       If the measurements consistently increase, then either the ground-water system is not
       at steady state or there is reason to believe that the sources of contamination have
       not been adequately cleaned up. In this situation, a reassessment of the data is
       required to determine if more time must pass until the site is at steady state or if
       additional remedial activity is required.

(6)    Based on statistical tests, determine if the cleanup standard has been attained or not
       If the cleanup standard has been attained, implementation of periodic sampling to
       monitor for unanticipated problems is recommended. The attainment decision is based on
       several assumptions. From a statistical perspective, the purpose of periodic
       monitoring after attainment is to check the validity of the assumptions. If the
       attainment objectives have not been met, the cleanup technology and goals must be
       reassessed.

              Different statistical procedures are needed at different steps in this process.
The statistical procedures which are helpful in determining whether to terminate treatment
are different from those used in the attainment decision. In all aspects of the site investiga-
tion and remediation, statistical procedures may be required that are not addressed in this
document. In this case, consultation with a statistician familiar with ground-water data is
recommended.


              This  document takes the  approach that:


                     A decision that the ground water in the wells attains the cleanup
                     standard requires the assumption that the ground water can be
                     expected to continue to attain the cleanup standards beyond the
                     termination of sampling,  and

                     Data collected while the ground-water system is disturbed by treat-
                     ment cannot reliably predict  concentrations after steady state has
                     been achieved. Therefore, it is recommended that the ground-water
                     system return to steady state before the sampling for assessing
                     attainment commences.  The data gathered prior to reaching steady
                     state  can be used for guidance  in selecting the statistical procedure  to
                     employ for assessing attainment.

2.2.2        The Contaminants to be Tested


              In general, multiple contaminants will be identified at the site prior to reme-
dial action. The mixture of contaminants which are present at any one time or place will
depend on many factors.

              The discussion in this document assumes that relevant regulatory agencies
have specified the contaminants which are to be used to assess attainment. Conclusions
based on the statistical procedures introduced in this document apply only to the com-
pounds actually sampled and the corresponding data analyzed in the statistical tests.

2.2.3        The Ground-Water System to be Tested

              Contamination in ground water is measured from water samples collected
 from wells at specified locations and times. The location of the wells, the times and
 frequency  of the  sampling,   and the assumptions behind the analyses will affect the interpre-
 tation of the statistical results.

               This document assumes that the attainment decision will be based on
 samples from established wells. This document does not make recommendations on where
 to locate wells for sampling. However, decisions must be made on which wells are to be
 used for assessing attainment. Because wells are not randomly located throughout an
 aquifer, the statistical  conclusions  strictly apply only to the water obtained from the selected
 wells and  not to the aquifer in general.  Conclusions about the aquifer must be based on a
 combination of statistical results for the sampled wells and expert knowledge or beliefs
 about the  ground-water  system and not on statistical inference.

               Because  of the high cost of installing  a new well and the possibility of using
 information from previous investigation stages, this  document assumes that the location of
 wells has been specified by experts in ground-water  hydrology and approved by regulatory
 agencies who are familiar with the contamination data at the site.

               Interpretation of the results of the  statistical analysis will depend on a
 judgment  as to whether the wells are in the correct place. If it is necessary to test the
 assumptions used to select wells,  additional wells will have to be established and sampled.
 In this case, consultation with a statistician is  recommended.
 2.2.4         The Cleanup  Standard

               The cleanup standard is the criterion set by EPA against which the measured
 concentrations are compared to determine if the ground water at the  Superfund site is
 acceptable or not. If the ground water meets the cleanup standard, then the remediation
 efforts are judged to be complete. The specification of the cleanup standard by EPA or
 another regulatory agency may be different for  different sites and for different chemicals or
 mixtures of chemicals.  With  a  mixture of contaminants, the cleanup standard may  apply to
an aggregate measure, or, in complex mixtures, the ground water may be required to meet
the cleanup standard for every contaminant present. For more information, see Guidance
on Remedial Actions for  Contaminated Ground  Water at Superfund Sites (EPA,  1988).


2.2.5        The Definition  of Attainment

              In order to determine if the contaminant concentrations at the site attain the
cleanup standard one must carefully define what concentration is to be compared to the
cleanup standard and what criteria are to be used to make the comparison for assessing
attainment. This document assumes that either the average concentration or a selected
percentile of the concentrations is to be compared to the cleanup standard.  The examples in
the text usually use the  average concentration. The ground water in a well attains the
cleanup standard if,  based on statistical tests, it is unlikely that the  average concentration (or
the percentile) is greater than the cleanup standard.

              The statistical procedures for assessing the attainment of the cleanup stan-
dard use a basic statistical technique called hypothesis testing. To show that the ground
water in the selected wells is actually below the cleanup standard (i.e., attains the cleanup
standard), we assume that the water in the wells does not attain the cleanup standard. This
assumption is called the null hypothesis. Then data are collected. If the data are suffi-
ciently inconsistent with the null hypothesis, the null hypothesis is rejected and we con-
clude that the water in the well attains the cleanup standard.

              The  steps  involved in  hypothesis testing are:

               (1)    Establish the null hypothesis, "The contaminant concentrations in
                      the selected wells do not attain the applicable cleanup standard";
               (2)    Collect data; and
               (3)    Based on the data, decide if the ground water attains the cleanup
                      standard:
                      (a)    If the data are inconsistent with the null hypothesis, conclude
                             that there is sufficient evidence to reject the null hypothesis.
                             Accept the alternate hypothesis that the contaminant concen-
                             trations attain the applicable cleanup standard, i.e., conclude
                             that the ground water is clean.
                      (b)    Otherwise, conclude that there is insufficient evidence to
                             reject the null hypothesis and that the contaminant concentra-
                             tions do not attain the cleanup standards, i.e., conclude that
                             the ground water is contaminated.
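
              The hypotheses in step (1) can be written compactly in symbols. The sketch
below assumes, for illustration only, that the parameter being tested is the population
mean (written here as mu) and uses Cs as a label for the cleanup standard; the label Cs is
introduced here for convenience and is not defined in this section.

```latex
\begin{aligned}
H_0 &: \mu \ge Cs && \text{(the ground water does not attain the cleanup standard)} \\
H_1 &: \mu < Cs   && \text{(the ground water attains the cleanup standard)}
\end{aligned}
```

Under this formulation the burden of proof rests on showing that the water is clean: the
null hypothesis of contamination is retained unless the data provide sufficient evidence
against it.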

               To be technically correct, the results of the hypothesis test indicate whether
the null hypothesis can be rejected with a specified level of confidence. In practice, we
would conclude that the concentrations do or do not attain the cleanup standards and act as
if that conclusion were known as fact rather than subject to error. Therefore, to avoid the
verbose but technically correct wording above, the results of the hypothesis tests will be
worded as concluding that the concentrations either attain or do not attain the cleanup
standard.

              When specifying simplified Superfund site cleanup objectives in consent
decrees, records of decision, or work  plans, it is extremely important to say that the site
shall be cleaned up until the sampling program indicates with reasonable  confidence that the
concentrations of the contaminants at the entire site are less than the cleanup standard.
However, attainment is often wrongly described by saying that concentrations at the site
shall not exceed the cleanup standard.
2.3           Introduction  to  Statistical Issues For Assessing Attainment

              This section provides a discussion of some basic statistical issues with an
emphasis on those with specific application to assessing attainment in ground water. This
discussion provides a general background for the specification of attainment objectives in
Chapter 3  and the  statistical procedures presented in Chapters 4 through 9.
2.3.1        Specification of the Parameter to be  Compared to the Cleanup
              Standard
              In order to define a statistical test to determine whether the ground water
attains the cleanup standard, the characteristics of the chemical concentrations to be com-
pared to the cleanup standard must be specified.  Such  characteristics  are called parameters.
The choice of the parameter to use when assessing attainment at  Superfund sites may
depend on site specific characteristics and decisions and has not, in general, been specified
by EPA.

              The parameters discussed in this document are the mean  or average concen-
tration and a selected percentile  of the concentrations.  For example, the  rule for deciding if
the ground water attains the cleanup standard might be: the ground water is considered
clean (or remediated) if the mean concentration is below the cleanup standard based on a
statistical test. The following sections define parameters for distributions of data and the
statistical properties of these  parameters. An understanding of these properties is necessary
for determining the appropriate parameter to test.


              The Distribution of Data Values

              This section discusses the characteristics of concentration distributions
which might be expected at Superfund sites and how the distribution of concentrations in
the ground water can be described using parameters.  These topics are  discussed in more
detail in Volume I (Sections 2.8 and 3.5).

              Consider  the set  of concentration measurements  which  would be obtained if
all possible ground-water  samples from a particular monitoring well over a  specified period
of time could be collected and analyzed. This set of measurements is called the popula-
tion of ground-water sample measurements. The set of ground-water samples comprising
the population may cover a fixed period of time, such as one year, or an unlimited time,
such as all future measurements. The set of ground-water measurements can be described
mathematically and graphically  by the "population distribution function" referred to as the
"distribution of the data". Figure 2.2 shows a plot of the population distribution for data
from three hypothetical distributions. The vertical axis shows the relative proportion of the
population measurements at each concentration value on the horizontal axis. In the plots,
the area under the curve between any two points on the concentration axis represents the
percentage of the ground-water measurements that have concentration values within the
specified range.
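
              As a hedged illustration of reading a proportion from a population distribu-
tion, the sketch below assumes a normal population with a hypothetical mean of 4 ppm and
standard deviation of 1 ppm (values chosen for illustration, not taken from Figure 2.2). It
computes the area under the curve between two concentrations using the Python standard
library. The function name proportion_between is an illustrative choice.

```python
from statistics import NormalDist


def proportion_between(dist, low, high):
    """Area under the distribution curve between two concentrations."""
    return dist.cdf(high) - dist.cdf(low)


# Hypothetical population: mean 4 ppm, standard deviation 1 ppm.
population = NormalDist(mu=4.0, sigma=1.0)

# Proportion of measurements expected between 3 and 5 ppm.
share = proportion_between(population, 3.0, 5.0)
print(f"Proportion between 3 and 5 ppm: {share:.3f}")
```

The same function applies to any range on the concentration axis; only the assumed
distribution would change for a skewed population.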

              Two distributions, the normal and lognormal distributions, will be used as
examples in the  following discussion. Both the normal and lognormal distributions are
useful in statistical work and can be used to approximate the concentration distributions
from wells at Superfund sites. Figure 2.2 shows an  example of a normal and a lognormal
distribution.

Figure 2.2     Measures of location: Mean, median, 25th percentile, 75th percentile, and
              95th percentile for three hypothetical distributions
              [Distribution plots with concentration (ppm) on the horizontal axis; panel
              titles recovered from the source include "Hypothetical Distribution" and
              "Lognormal Distribution". Legend, measures of location: 25th percentile,
              median (50th percentile), mean, 75th percentile, 95th percentile. Legend,
              measure of spread: ± 1 standard deviation around the mean.]

              Summary measures describing characteristics of the population distribution
are referred to as parameters or population parameters. Three important characteris-
tics of the data described by these parameters are:

              •      The location of the data;
              •      The spread (or dispersion) of the data; and
              •      The general shape or "skewness" of the data distribution.


              Measures of Location

              Measures of location (or central tendency) are often used to describe where
most of the data lie along the concentration axis of the distribution plot. Examples  of such
measures  of location are:

                    "The mean (or average) concentration  of all ground-water samples is
                     17.2 ppm" (i.e., 17.2 is the mean concentration);
                    "Half the ground-water samples have  concentrations  greater  than 13
                    ppm and half less than 13 ppm" (13  is the median  concentration);
                    or
                    "Concentrations of 5 ppm (rounded to the nearest unit) occur more
                    often than any other concentration value" (the mode is 5 ppm).

              Another measure of location is the percentile. The Qth percentile is the
concentration which separates the lower Q percent of the ground-water   measurements from
the upper 100-Q percent of the ground-water measurements. The median is a  special
percentile, the 50th percentile. The 25th percentile is the concentration which is greater
than the lowest 25 percent of the ground-water measurements and less than the remaining
75 percent of the ground-water measurements. Figure 2.2 shows the mean, median, 25th
percentile, 75th percentile, and 95th percentile for three  distributions introduced previously.

              Throughout this document, the Greek letter μ (spelled "mu" and pro-
nounced "mew") will be used to denote the population mean. The median will be denoted
by [symbol not legible in source] and the Qth percentile will be denoted by X_Q.
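
              The measures of location above describe the population. In practice they
are estimated from sample data. The following minimal sketch, using hypothetical concen-
trations that are not taken from any site, shows how sample estimates of the mean, median,
and quartiles might be computed with the Python standard library.

```python
from statistics import mean, median, quantiles

# Hypothetical concentrations (ppm) from one well; illustrative values only.
concentrations = [2.1, 3.4, 3.9, 4.4, 5.0, 5.2, 6.3, 7.8, 9.5, 14.4]

sample_mean = mean(concentrations)
sample_median = median(concentrations)

# quantiles(n=4) returns the three quartile cut points:
# the 25th, 50th, and 75th percentiles.
q25, q50, q75 = quantiles(concentrations, n=4, method="inclusive")

print(f"mean = {sample_mean:.2f} ppm, median = {sample_median:.2f} ppm")
print(f"25th percentile = {q25:.3f} ppm, 75th percentile = {q75:.3f} ppm")
```

In this illustrative data set the mean exceeds the median, which is the pattern expected
for data skewed to the right.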

               Measures of Spread

               Measures of spread provide information about the variability or dispersion
 of a set of measurements.  Examples of different measures  of spread are:

                      The standard deviation or the variance (the square of the
                      standard deviation). The population standard deviation is denoted
                      by the Greek letter σ (pronounced "sigma") throughout this docu-
                      ment. If data are normally distributed, about two-thirds (68 percent)
                      of the data are within one standard deviation of the mean;
                      The coefficient of variation is the ratio of the standard deviation
                      to the mean, σ/μ; and
                      The interquartile range is the difference between the 75th and
                      25th percentiles of the distribution.

              For each distribution in Figure 2.2, the mean and the range of plus and
 minus one standard deviation around the mean are  shown on the  plots.
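
              The measures of spread listed above can be estimated from sample data in the
same way as the measures of location. The sketch below uses hypothetical concentrations
(illustrative values only) and the sample standard deviation with the usual n - 1 divisor,
which is an assumption about how the estimate would be formed.

```python
from statistics import mean, stdev, quantiles

# Hypothetical concentrations (ppm) from one well; illustrative values only.
concentrations = [2.1, 3.4, 3.9, 4.4, 5.0, 5.2, 6.3, 7.8, 9.5, 14.4]

s = stdev(concentrations)          # sample standard deviation (n - 1 divisor)
variance = s ** 2                  # variance is the square of the standard deviation
cv = s / mean(concentrations)      # coefficient of variation: standard deviation / mean

# Interquartile range: 75th percentile minus 25th percentile.
q25, _, q75 = quantiles(concentrations, n=4, method="inclusive")
iqr = q75 - q25

print(f"standard deviation = {s:.2f} ppm, variance = {variance:.2f}")
print(f"coefficient of variation = {cv:.2f}, interquartile range = {iqr:.2f} ppm")
```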

              Measure of Skewness

              Skewness is a measure of the extent to which a distribution is  symmetric or
 asymmetric. A distribution is symmetric if its two halves are mirror images of
 each other about a center line. One common symmetric distribution is the normal distribu-
 tion, which is often described as having a "bell-shape." Many statistical tests assume that
the sample measurements are normally distributed (i.e., have a normal distribution).

              The distribution of concentrations is not likely to be symmetric. It may be
 skewed to the right. That is, the highest measurements (those to the right on the plot of the
 distribution function) are farther from the mean concentration than are the lowest concen-
 trations. Ground-water measurements often have a skewed distribution which can be
 approximated by a lognormal distribution (see Gilbert 1987,  for additional  discussion of
 the normal and lognormal distributions). Note that for right skewed distributions (e.g., the
 lognormal distribution in Figure 2.2) the mean is greater than the  median.
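
              The relationship between the mean and the median of a lognormal distribution
can be checked directly. If the natural logarithm of the concentration is normally dis-
tributed with mean m and standard deviation s, then the median concentration is exp(m) and
the mean concentration is exp(m + s^2/2). The sketch below evaluates both for hypothetical
parameter values; the function names and the values of m and s are illustrative assump-
tions.

```python
import math


def lognormal_median(log_mean):
    """Median of a lognormal distribution with the given mean of log values."""
    return math.exp(log_mean)


def lognormal_mean(log_mean, log_sd):
    """Mean of a lognormal distribution with the given log-scale parameters."""
    return math.exp(log_mean + log_sd ** 2 / 2)


# Hypothetical log-scale parameters, chosen only for illustration.
m, s = 1.0, 0.8

print(f"median = {lognormal_median(m):.2f} ppm")
print(f"mean   = {lognormal_mean(m, s):.2f} ppm")
```

Because exp(s^2/2) is greater than 1 whenever s is greater than zero, the mean always
exceeds the median, and the gap widens as the log-scale spread increases.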

              The three distributions shown in Figure 2.2 have the same mean and stan-
 dard deviation. Note, however, that the occurrence of particularly high or low concentra-
 tions differs for the three distributions. In general, the more skewed the distribution, the
 more likely are these extreme observations.


              Selecting the Parameter to Compare to the Cleanup Standard

              In order to determine if the contaminant concentrations attain the cleanup
 standard the measure of location which is to be compared to the cleanup standard must be
 specified. Even though the true distribution is unknown, the specified measure of location,
 or parameter of interest, can be  selected based  on:

                    Information  about the distribution from preliminary data;
                    Information about the behavior of each parameter  for different
                    distributions;
                    The effects  of various concentrations of the contaminant on human
                    health and the environment; and
                     Relevant criteria for protecting human health and the environment.

              Chapter 3 discusses in more detail the selection of the mean or a percentile
 to be compared to the cleanup standard.
2.3.2        Short-term Versus Long-term Tests

              Due to fluctuating concentrations over time, the average contaminant
concentration over a short period of time may be very different from the average over a
long period of time. Figure 2.3 shows a hypothetical series of weekly ground-water
concentration measurements collected over a period of 70 weeks (about 16 months). The
figure shows the weekly concentration measurements,  the average concentration for weeks
21 through 46  (6 months),  and  the long-term average  concentration which  is obtained from
data collected over 50 years (only a portion of which is shown here). From the figure, it
can be seen that the short-term average concentration can be very different from the long-
term average.

 Figure 2.3    Illustration of the difference between a short- and long-term mean
               concentration
               [Plot of weekly concentration measurements against time in weeks (horizontal
               axis, 0 to 70), with the long-term mean concentration marked.]

              The  short-term average  is estimated using data collected during the period of
 interest, in this example during weeks 21 through 46.  Similarly the longer term average
 can be estimated based on data collected over the longer period of interest, perhaps 50
 years. Fortunately,  by using information on the correlation of the measurements across
 time, it is usually possible to  estimate the long-term average concentration from data
collected over a limited period of time. In  order to estimate  the average  concentration for a
 period which is longer than the data collection period, assumptions must be made which
 relate the unmeasured future concentrations to the concentrations which are actually
 measured.  These  assumptions are stated in  terms of a model for the data.
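
              The difference between a short-term and a long-term average can be seen
with a very simple model. The sketch below assumes, purely for illustration, that weekly
concentrations follow a constant long-term mean plus a yearly seasonal cycle; real ground-
water data would also include random fluctuation and correlation over time. The function
names, the 10 ppm mean, and the 3 ppm amplitude are hypothetical.

```python
import math


def weekly_concentration(week, long_term_mean=10.0, amplitude=3.0, period=52):
    """Hypothetical model: a constant mean plus a yearly seasonal cycle (ppm)."""
    return long_term_mean + amplitude * math.sin(2 * math.pi * week / period)


def average(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Short-term average over weeks 21 through 46 (about 6 months).
short_term = average(weekly_concentration(w) for w in range(21, 47))

# Long-term average over 50 full years of weekly values.
long_term = average(weekly_concentration(w) for w in range(1, 50 * 52 + 1))

print(f"short-term average (weeks 21-46): {short_term:.2f} ppm")
print(f"long-term average (50 years):     {long_term:.2f} ppm")
```

Under this assumed model the six-month window falls mostly in the low part of the seasonal
cycle, so its average lies well below the long-term mean. A model of the data is what
allows the long-term average to be inferred from a limited sampling period.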

              Statistical decisions and estimates that only apply to the sampling period are
 referred to here as "short-term" estimates and are presented in Chapter 4. Decisions and
 estimates that apply to the foreseeable future are called "long-term" estimates. The long-
 term  estimates are made based on the assumption that  the ground-water  concentrations  will
 behave in a predictable manner. The  assumptions take into account the expected natural
 fluctuations in ground-water  flows  and contaminant concentrations.

              In this document the ground water is said to attain the cleanup standard only
 if the concentrations attain the cleanup standard for the  foreseeable (or at least predictable)
 future.  Thus, long-term estimates and procedures are used to assess attainment. Short-
 term estimates can be  used to make interim management decisions.

2.3.3        The Role of  Statistical Sampling and  Inference  in Assessing
              Attainment
              When assessing attainment, it is desirable to compare the population mean
(or population  percentile or  other parameter)  of the concentrations to the cleanup standard.
However, the data for assessing attainment are derived from a sample, a small proportion
of the population. Statistical inference is used to make conclusions about the population
parameter from the sample measurements. For illustration, the following discussion
assumes that the population mean must be less than the cleanup standard if we are to
conclude that the water in the well attains the cleanup standard.

              The mean concentration calculated from the sample  data provides an esti-
mate of the population mean. Estimates of concentration levels computed from a statistical
sample are subject to "error" in part because they are based on only a small subset of the
population. The use of the term "error" in this context in no way implies that there are
mistakes in the data. Rather, "error" is a shorthand way of saying that there is variability
in the sample estimates from different samples. There are two components to this error:
sampling error and lab, or measurement, error.

                     Different samples will yield different estimates of the parameter of
                     interest due to  sampling error.
                     Unknown factors in the handling and lab analysis procedures result
                     in errors or variation in the lab measurements, i.e., two lab analyses
                     of the same ground-water sample will usually give slightly different
                     concentration values.  This  difference is attributed to lab error or
                     measurement error.

              Because the sample mean is subject to error, it cannot be directly compared
to the cleanup standard to decide if the population mean is less than the cleanup standard.
For example, just because the mean for a particular sample happens to be below the cleanup
standard does not mean that the standard has been attained. To make meaningful infer-
ences, it is necessary to obtain a measure of the error (or expressed another way, the preci-
sion) associated with the sample mean. An estimate of the error in the sample mean can be
calculated from the sample and is referred to as the standard error of the mean. It is a
basic measure of the absolute variability of the calculated sample mean from one sample to
another.

[1] The possible bias in the measurements is assumed to be zero. The quality assurance plan
    should address the problems of possible bias.

               The standard error of the mean can be used to construct confidence
 intervals around a sample mean using equation (2.1) in Box 2.1. Under general condi-
 tions, the interval constructed using equation (2.1) will include the population mean in
 approximately 95 percent of all samples collected and is called a "95 percent two-sided
 confidence interval."  This useful fact follows from the Central Limit Theorem which
 states that, under fairly general conditions, the distribution of the sample mean is "close" to
 a normal distribution even though we may not know the distribution of the original  data.
 Note also that the validity of the confidence interval given in Box 2.1 depends on the data
 being independent in a statistical sense.  Independent ground water measurements are
 obtained when the sample collection times are randomly selected within the sampling
period.

              When assessing attainment, a two-sided test would be used for pH because
 both high and low values represent pollution. For most other pollutants, use one-sided
 confidence intervals because only high values  indicate pollution. A 95 percent one-sided
 confidence interval can be obtained from equation (2.2) in Box 2.1. The interval from zero
 (the lowest possible measurement) to this upper endpoint will also include the  population
 mean in approximately  95 percent  of all samples collected.
                                      Box 2.1
          Construction of Confidence Intervals Under  Assumptions  of Normality
        To construct a 95 percent two-sided confidence interval around a sample
        mean:
              lower endpoint = sample mean - 1.96 * standard error and
              upper endpoint = sample mean + 1.96 * standard error.      (2.1)

        To construct a 95 percent one-sided confidence interval:
              upper endpoint = sample mean + 1.65 * standard error.      (2.2)
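
The calculations in Box 2.1 can be sketched in a few lines of Python. This is only an illustration: the function names and the example concentrations are hypothetical, and the standard error is assumed here to be the sample standard deviation divided by the square root of the number of independent measurements.

```python
import math
import statistics


def standard_error(measurements):
    """Standard error of the mean, assumed here to be s / sqrt(n)."""
    return statistics.stdev(measurements) / math.sqrt(len(measurements))


def two_sided_95(measurements):
    """Equation (2.1): sample mean -/+ 1.96 * standard error."""
    mean = statistics.fmean(measurements)
    se = standard_error(measurements)
    return mean - 1.96 * se, mean + 1.96 * se


def one_sided_95_upper(measurements):
    """Equation (2.2): sample mean + 1.65 * standard error."""
    return statistics.fmean(measurements) + 1.65 * standard_error(measurements)


# Hypothetical concentrations (ppm) from independent sample times.
concentrations_ppm = [0.62, 0.71, 0.55, 0.80, 0.67, 0.59, 0.74, 0.66]
lower, upper = two_sided_95(concentrations_ppm)
print(f"two-sided 95% interval: {lower:.3f} to {upper:.3f} ppm")
print(f"one-sided 95% upper endpoint: {one_sided_95_upper(concentrations_ppm):.3f} ppm")
```

The multipliers 1.96 and 1.65 are those given in Box 2.1 and rest on the normal approximation described in the text.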
              Using confidence intervals, the following procedure can be used to make
 conclusions about the population mean based on a sample of data:


              (1)    Calculate the  sample mean;
              (2)    Calculate the standard error of the sample mean;
              (3)    Calculate the upper endpoint of the one-sided confidence interval;

              (4)    If the upper endpoint of the confidence interval is below the cleanup
                    standard, then conclude that the ground water attains the cleanup
                    standard, otherwise conclude that the ground water does not attain
                    the cleanup  standard.
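
The four steps above can be sketched as a single function. This is a minimal illustration, not a prescribed implementation: the function name, the example data, and the cleanup standard of 1.0 ppm are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics


def attains_cleanup_standard(measurements, cleanup_standard, multiplier=1.65):
    """Apply steps (1)-(4): compare the one-sided upper endpoint to the standard.

    Returns (decision, upper_endpoint). The default multiplier of 1.65 is the
    95 percent one-sided value from equation (2.2).
    """
    mean = statistics.fmean(measurements)                      # step (1)
    se = statistics.stdev(measurements) / math.sqrt(len(measurements))  # step (2)
    upper = mean + multiplier * se                             # step (3)
    return upper < cleanup_standard, upper                     # step (4)


# Hypothetical concentrations (ppm) and a hypothetical cleanup standard.
decision, upper = attains_cleanup_standard([0.62, 0.71, 0.55, 0.80, 0.67], 1.0)
print("attains" if decision else "does not attain", f"(upper endpoint {upper:.3f} ppm)")
```

Note that the decision is "attains" only when the upper endpoint is strictly below the standard; a sample mean below the standard is not sufficient by itself.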

A 95 percent confidence interval will not  cover the population parameter in 5 percent of the
samples. When using the confidence interval to assess attainment, one will incorrectly
conclude that the ground water attains the cleanup standard in up to 5 percent of all
samples. Thus, this procedure is said to have a false positive rate of 5 percent. This false
positive  rate is  discussed in detail in the next section.
2.3.4        Specification   of Precision  and  Confidence  Levels  for
              Protection Against Adverse Health and Environmental  Risks

              The validity of the decision that a site meets the cleanup standard depends
on how well the samples represent the ground water during the period of sampling, how
accurately the samples are  analyzed, and the criteria used to define attainment.  The true but
unknown condition is that the ground water is either clean or contaminated. Similarly,  the
decisions made using the statistical procedures will result in an attainment  or non-attainment
decision. The relationship between these two conditions is shown in Table 2.1.

Table 2.1      False positive and negative decisions

                                      True condition in the well:
  Decision based on a       Clean (Attains the        Contaminated (Does not
  statistical sample        cleanup standard)         attain the cleanup standard)
  ---------------------------------------------------------------------------------
  Clean                     Correct decision          False positive decision
  Contaminated              False negative decision   Correct decision

              As a result of the sampling and measurement uncertainty, one may decide
that the site is clean when it is not. In the context of this document, this mistaken conclu-
sion is referred to as a false positive finding (statisticians refer to a false positive as a
"Type I error"). There are several points to make regarding false positives:

                     Reducing the chance of a false positive decision helps to protect
                     human health and the environment;

                     A low false positive rate does not come without cost. The additional
                     cost of lowering false positive rates comes from taking additional
                     samples and using more precise analysis methods;

                     The definition of a false positive in this document is exactly the
                     opposite of the more familiar definition of a false positive under
                     RCRA detection and compliance monitoring.

              In order to design a statistical test for assessing attainment, those specifying
the sampling and analysis objectives must  select the maximum acceptable false  positive rate
(the maximum probability of a false positive decision is denoted by the Greek letter alpha,
α). It is usually set at levels such as 0.10, 0.05, or 0.01 (that is, 10%, 5%, or 1%),
depending on the potential consequences of declaring that the ground water is clean when
in fact it is not. While different false positive rates can be used for each chemical, it is
recommended that the same rate be used for all chemicals being investigated. For a further
discussion of false positive rates, see Sokal and Rohlf (1981).

              The converse of a false positive decision is a false negative decision (or
Type II error), the mistake of concluding the ground water requires additional treatment
when, in fact, it attains the cleanup standard. This error results in the waste of resources in
unnecessary treatment. It would be desirable to minimize the probability of false negative
decisions as well as false positive decisions. The Greek letter beta (β) is used to represent
the probability  of a false negative  decision.

              If both α and β can be reduced, the percentage of time that the correct deci-
sion will be made will be increased. Unfortunately, simultaneous reduction usually can
only be achieved by increasing  sample size (the number  of samples collected and analyzed),
which may be expensive.

              The probability of declaring the  ground water to be clean will depend on the
true mean concentration of the ground water.  If the population mean is above the cleanup
standard, the ground water will  rarely be declared clean (this will only  happen if the  partic-
ular sample chosen has a large associated sampling and/or measurement error).  If the
population mean is much smaller than the cleanup  standard, the ground water will almost
always be judged to be clean.  This relationship can be plotted for various values  of the
population mean as in Figure 2.4. The plot shows the probability of declaring the ground
water to be clean as a function of a hypothetical population mean, and is referred to  as a
power  curve.  For practical purposes, in this volume the probability  of declaring the site
clean is the "power of the test." The following assumptions were made when plotting the
example power curve in Figure 2.4: the false positive rate is 5%, the false negative rate
when the true mean, μ1, is 0.6 is 20%, and the cleanup standard is 1.0.
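
A power curve with these properties can be sketched under a simple normal approximation. The sketch below assumes, purely for illustration, that the ground water is declared clean when the sample mean plus z(1-α) standard errors falls below the cleanup standard and that the standard error is known; the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def design_standard_error(cleanup_standard, mu1, alpha, beta):
    """Standard error giving false negative rate beta at true mean mu1."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    return (cleanup_standard - mu1) / (z_alpha + z_beta)


def power(true_mean, cleanup_standard, se, alpha):
    """Probability of declaring the ground water clean at a given true mean."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf((cleanup_standard - true_mean) / se - z_alpha)


# Assumptions stated for Figure 2.4: alpha = 5%, beta = 20% at mu1 = 0.6,
# cleanup standard = 1.0.
se = design_standard_error(1.0, 0.6, alpha=0.05, beta=0.20)
for mu in (0.2, 0.4, 0.6, 0.8, 1.0):
    print(f"population mean {mu:.1f} ppm: power {power(mu, 1.0, se, 0.05):.3f}")
```

By construction, the computed power is 80 percent at a mean of 0.6 and 5 percent at the cleanup standard, which matches the assumptions stated for the figure.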

              If the population mean concentration is equal to or just above the cleanup
standard  (i.e., does not attain the cleanup  standard),  the  probability of  declaring the ground
water to be clean is α; this is the maximum false positive rate.

              For the specification of the attainment objectives (discussed in Chapter 3),
the acceptable probabilities of a false positive and false negative decision must be specified.
Based on these values and the selected statistical procedures, the required  sample size can
be  calculated.
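
As a rough illustration of how these quantities determine a sample size, the sketch below uses the standard normal-theory approximation for independent measurements, n = σ² [(z(1-α) + z(1-β)) / (Cs - μ1)]², rounded up. The standard deviation σ, the function name, and the example values are assumptions made for illustration; the procedures actually recommended depend on the statistical method selected and are presented in later chapters.

```python
import math
from statistics import NormalDist


def approximate_sample_size(sigma, cleanup_standard, mu1, alpha, beta):
    """Normal-theory sample size for a one-sided test of the mean.

    sigma: assumed standard deviation of independent measurements
    mu1:   mean at which the false negative rate beta is specified
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha) + z.inv_cdf(1 - beta)
    return math.ceil((sigma * z_sum / (cleanup_standard - mu1)) ** 2)


# Hypothetical inputs: sigma = 0.5 ppm, cleanup standard 1.0 ppm,
# alpha = 5%, beta = 20% at a mean of 0.6 ppm.
print(approximate_sample_size(0.5, 1.0, 0.6, alpha=0.05, beta=0.20))
```

Lowering either α or β, or moving μ1 closer to the cleanup standard, increases the required number of samples.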

Figure 2.4    Hypothetical power curve

[Figure: probability of deciding the site attains the cleanup standard (vertical axis, 0 to 1)
plotted against the population mean concentration in ppm (horizontal axis, 0 to 1). The curve
is annotated with the power at μ1 (80%, that is, a false negative rate of 20% at a mean of
0.6 ppm) and with the cleanup standard at 1 ppm.]

2.3.5        Attainment Decisions Based on Multiple Wells

              The ground water will be judged to attain the cleanup standard if the con-
taminant concentrations in the selected wells are sufficiently low compared to the cleanup
standard. Below are two possible ways in which the attainment decision can be based on
water samples from multiple wells:


                     Assess each well individually: make a separate attainment decision
                     for each well; conclude that the ground water at the site attains the
                     cleanup standard if the ground water in each tested well attains the
                     cleanup standard.

                     Associate selected wells into groups: collect samples in all wells in
                     a group at the same time, combine the results from all wells in the
                     same group into one summary statistic for that time period; conclude
                     that the ground water represented by each group attains the cleanup
                     standard if the summary statistic attains the cleanup standard.
                     Conclude that the ground water at the site attains the cleanup stan-
                     dard if the summary statistics from all groups attain the standard.


              The choice of assessing wells individually or as a group has implications for
the interpretation of the statistical results and the false positive and false negative probabili-
ties for deciding that the site, as opposed to the well, attains the cleanup standard. These
issues are discussed in more detail in the following three sections.

              Assessing Multiple Wells Individually


              When assessing each  well individually, slightly  different  criteria can be used
for each attainment decision. For example, different sample  collection schedules can be
used for each well. Assessing each well individually may  require substantially fewer
samples than assessing the wells as a group, depending on the concentrations in the wells.

              The attainment decisions for each individual well must be combined to make
an attainment decision for the entire site. The only procedure discussed in this document
for combining the results from assessments on individual wells is to conclude that the
ground water at the site attains the cleanup standard only if the ground water in each well
attains the cleanup standard.

              If many wells are tested, the site will not attain the cleanup standard if any
one of the wells does not attain the standard. Even if all wells actually attain the cleanup
standard, the more wells used to assess attainment, the greater the likelihood of a false
negative decision in one well, resulting in an overall non-attainment decision. On the other
hand, assessing all wells individually can result in significant protection for human health
and the environment because all concentrations must attain the cleanup standard in spite of
false negative decisions. Implicit in the above discussion is the conflict between protecting
the public health and the cost of possible overcleaning or overattainment.


              Testing Multiple Wells  as  a Group

              When multiple wells are tested as a group, samples must be collected in
each well at the same time and thus the same number of samples will be collected in all
wells within a group. At each sample time, the measurements from each well are combined
into a summary statistic. The ground water  in the group of wells would be declared to
attain the cleanup standard if the summary statistic was significantly less than the cleanup
standard. Several methods can be used to  combine the measurements from all tested wells
at each sample time into one summary statistic. Two methods are:

                     Average the measurements from all wells within a group; and
                     Take the maximum concentration across all wells within a group.

              If the average across all wells must be less than the cleanup standard, then
the site may be declared clean if the concentrations in some wells are substantially greater
than the cleanup standard as long as concentrations in other wells are much less than the
cleanup standard. These differences among wells in a group can sometimes be minimized
by grouping wells with similar concentration levels. On the other hand, requiring that the
maximum concentration across all wells attain the cleanup standard assures that each well
individually  will  attain  the standard.
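
The difference between the two summary statistics can be illustrated with a small sketch. The concentrations, the three-well group, and the function name below are hypothetical and serve only to show how one well above the standard can be masked by an average but not by a maximum.

```python
import statistics

# Hypothetical concentrations (ppm): one row per sample time, one column per well.
samples_by_time = [
    [0.40, 0.55, 1.30],
    [0.35, 0.60, 1.20],
    [0.45, 0.50, 1.25],
]


def summarize(samples, method):
    """Combine the wells in a group into one summary statistic per sample time."""
    if method == "average":
        return [statistics.fmean(row) for row in samples]
    if method == "maximum":
        return [max(row) for row in samples]
    raise ValueError(f"unknown method: {method}")


averages = summarize(samples_by_time, "average")
maxima = summarize(samples_by_time, "maximum")
print("averages:", [round(a, 3) for a in averages])
print("maxima:  ", maxima)
```

With a hypothetical cleanup standard of 1.0 ppm, every average is below the standard even though the third well is consistently above it, whereas every maximum exceeds the standard.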

              If the average concentration across all wells is to be compared to the cleanup
standard, a decrease in lab costs may be achieved by compositing the water samples across
wells (and possibly across time)  and analyzing the contaminant  concentrations in the
composite samples.  Since the recommended number of samples to be composited and the
length of the sample period will depend on the  serial correlation of the data and several cost
and variance estimates, consultation with a statistician is recommended if compositing is
considered.
              Multiple Statistical Tests

              When assessing attainment in multiple wells (or groups of wells) and when
assessing attainment for multiple chemicals, two probabilities are of interest: the probability
of deciding that one compound in one well (or group of wells) is clean and the probability
of deciding that all compounds in all wells (or groups of wells) are  clean. The following
discussion will be phrased in terms of testing individual wells. However, it also applies to
testing groups of  wells.

              For an individual statistical decision on one compound or well, the maxi-
mum probability of a false positive decision is denoted by the Greek letter alpha, α. This
may also be called the comparison-wise alpha. When multiple chemicals or wells are
being assessed, the overall alpha or experiment-wise alpha is the maximum probability
of incorrectly declaring that all compounds in all ground water wells at the site attain the
cleanup standard.* In this document it is assumed that the site will be declared to have
attained the cleanup standard only if all contaminants tested attain their specified cleanup
standard.

*Note that the procedures discussed here for assessing the attainment of the site from the results of multiple
  statistical tests are different from the typical presentations on "multiple comparison tests" or "experiment-
  wise versus comparison-wise tests" presented in many introductory statistics textbooks which use a
  different null hypothesis. Here all tests, rather than any single test, must have a significant result.



              The probability of deciding that all compounds in all wells attain the cleanup
standard, i.e., the overall α, depends on the number of statistical tests performed. If wells
are assessed individually, more statistical tests will be performed than when assessing
wells as a group. Thus, the decision on whether to group wells is related to the selection of
the probabilities of a false positive or false negative decision.


              The overall probability of declaring that a site has attained the cleanup
standard depends on the:

                     Number of contaminants and wells being assessed;

                     Concentrations of the contaminants being assessed;

                     Statistical tests being used for the individual contaminants;

                     Correlation between the concentration measurements of different
                     contaminants in the same wells and contaminants in different wells;
                     and

                     Decision rules for combining the statistical results from each
                     contaminant and well to decide if the overall site attains the cleanup
                     standard.


Although the calculation of the overall probability of declaring the site to attain the cleanup
standard can be difficult, the following general conclusions can be stated when using the
rule that all contaminants (or wells) must attain the cleanup standard:

                     The probability of incorrectly deciding that the site attains the
                     cleanup standard, the  overall  alpha, is always less than or equal to
                     the maximum probability of mistakenly deciding that any one
                     contaminant (or well)  attains its cleanup standard (comparison-wise
                     alpha).

                     As the number of contaminants being assessed increases, the
                     probability of deciding that the site is clean decreases, regardless of
                     the true status of the site.
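
Both conclusions can be illustrated numerically in the simplest case. The sketch below assumes, for illustration only, that the individual tests are statistically independent, so that the probability that every test declares attainment is the product of the individual probabilities; the function name and the probabilities used are hypothetical. With correlated measurements the product no longer applies, although the first conclusion still holds.

```python
import math


def probability_site_declared_clean(per_test_probabilities):
    """Probability that all tests declare attainment, assuming independence."""
    return math.prod(per_test_probabilities)


alpha = 0.05         # comparison-wise false positive rate
power_clean = 0.90   # hypothetical chance that a truly clean well passes its test

# One contaminated well (passes with probability alpha) plus clean wells:
# the overall false positive probability never exceeds alpha.
for n_clean in (0, 2, 5):
    overall = probability_site_declared_clean([alpha] + [power_clean] * n_clean)
    print(f"1 contaminated + {n_clean} clean wells: {overall:.4f}")

# All wells clean: the chance of declaring the site clean falls as tests are added.
for n_tests in (1, 5, 10):
    overall = probability_site_declared_clean([power_clean] * n_tests)
    print(f"{n_tests} clean wells: {overall:.4f}")
```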


              Choice of a strategy for combining the results from many statistical tests
involves both policy and statistical questions. As a result, no general recommendations can
be made in this document. When many contaminants or wells are being assessed, consul-
tation with a statistician is recommended.


 2.3.6       Statistical Versus Predictive Modeling

              A model is a mathematical description of the process or phenomenon from
 which the data are collected. A model provides a framework for extrapolating from the
 measurements obtained during the data  collection period to other periods of time and for
 describing the important characteristics of the data. Perhaps most importantly, a model
 serves as a formal description of the assumptions which are being made about the data.
 The choice of statistical method used to analyze the data depends on the nature of these
 assumptions. (See Appendix D for a discussion on modeling the data.)

              Mathematical (deterministic) models can be used to predict or simulate the
 contaminant concentrations, the effect of treatment on the contaminants, the time required
 for remediation, and the remaining concentrations after remedial action. These models are
 referred to here as predictive models.  To predict future concentrations these models typi-
 cally use (1) mathematical formulae describing the flow of ground water and contaminants
 through porous or fractured media, (2) boundary conditions  to specify the conditions at the
 start of the simulation (often based on assumptions), and (3) assumptions about the aquifer
 conditions. Predictive models are powerful tools, providing predictions in a relatively
 short time with minimal cost compared to the corresponding field sampling. They  allow
 comparison of the expected results of different treatment alternatives. However, it is
 difficult  to determine the probability of correctly or incorrectly  deciding if the ground water
 attains the cleanup standard using predictive models, in part, due to the many assumptions
 on which the models are based.

              On the other hand, the statistical models and procedures discussed in this
document are based on very few assumptions and can be used whether or not predictive
 models have been applied at the site. The statistical procedures can also be used as a check
 on the predictive models.  Unlike the predictive models, the  statistical models presented in
this document for assessing attainment only use measurements from the period after
remedial action has been terminated.

              While this document makes the assumption that the attainment  decision will
be based on statistical models and procedures,  predictive models  and  data collected prior to
the sampling for the attainment decision provide a guide as to which wells are to be used
for assessing attainment, when to initiate an evaluation, and what criteria are to be used to
define attainment of the cleanup standard. If predictive models are used in other ways for
the attainment decision, consultation with a statistician is recommended. Due to the
complexity of both site conditions and predictive modeling, other procedures which might
be used to combine the results of predictive and statistical models are beyond the scope of
this document.

2.3.7        Practical  Problems  with the  Data Collection  and Their
              Resolution
              With any collection of data there are possible problems which must be
addressed by the statistical procedures. The problems discussed below are: measurements
below the detection limit, missing data, and very unusual observations, often called
"outliers."
              Measurements  Below the Detection Limit

              The detection limit for a laboratory measurement procedure is the lowest
concentration level which can be determined to be different from a blank. Measurements
which are below the detection limit may be reported in one of several different ways
(Gilbert 1987). For example:

                    A concentration value, with the notation that the reported concentra-
                    tion is below the detection limit;
                    Less than a specified detection limit;  or
                    Coded as  "below  the detection limit" with no  concentration or  detec-
                    tion  limit specified.

              Special procedures are required to use the below-detection-limit measure-
ments in a statistical analysis. If, due to poor selection of the laboratory analysis method or
unanticipated problems with the analysis, the cleanup standard is below the detection limit,
the possible statistical procedures which might be used to compare the concentrations to the
cleanup standard are very limited and require many assumptions which are difficult to
justify. As a result, this document only addresses the situation where the cleanup standard
is greater than the detection limit.

               For all of the procedures described in this  manual, the following procedures
  for handling below-detection-limit measurements are recommended:

               Whenever the measured concentration for a given water sample is reported
               by the laboratory, use this concentration in the analysis even though it is
               below  the detection limit;
               When the concentration is reported as less than a specified detection limit,
               use the value at the detection limit as the measured concentration in the
               analysis; and
               When  the laboratory reports that the chemical concentration is "below the
               detection limit" with no specified detection limit,  contact the analytical
               laboratory to determine the minimum  detectable value, and use this value in
               the analysis. Do not  treat below-detection-level  measurements   as missing.
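
The three recommendations above can be sketched as a small data-cleaning step. The record layout (`LabResult`), the function name, and the `lab_minimum_detectable` argument are hypothetical and are used only to illustrate the order in which the rules apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LabResult:
    reported_value: Optional[float] = None   # concentration reported by the lab, if any
    detection_limit: Optional[float] = None  # detection limit stated by the lab, if any
    below_detection: bool = False            # lab flagged the result as below detection


def value_for_analysis(result, lab_minimum_detectable=None):
    """Return the concentration to use in the statistical analysis."""
    # Rule 1: use any reported concentration, even if below the detection limit.
    if result.reported_value is not None:
        return result.reported_value
    if result.below_detection:
        # Rule 2: reported as "less than" a stated detection limit.
        if result.detection_limit is not None:
            return result.detection_limit
        # Rule 3: no limit stated; use the value obtained from the laboratory.
        if lab_minimum_detectable is not None:
            return lab_minimum_detectable
        raise ValueError("contact the laboratory for the minimum detectable value")
    # A result with no information at all is a missing value, not a non-detect.
    raise ValueError("missing value: no concentration and no detection-limit flag")


print(value_for_analysis(LabResult(reported_value=0.003, below_detection=True)))
print(value_for_analysis(LabResult(detection_limit=0.01, below_detection=True)))
print(value_for_analysis(LabResult(below_detection=True), lab_minimum_detectable=0.02))
```

Below-detection results are never dropped; only records with no information at all are treated as missing.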

               Using the detection limit for values below the detection limit is conservative;
  i.e., errs in favor of minimizing health and environmental risks. Other methods of
handling below-detection-limit problems can  be used, but are more  difficult to implement
and have the potential of erring in the opposite direction.  Selection of a method can be
  dependent upon the proportion of non-detects. Alternative procedures  should be investi-
 gated and assessed as to how data are affected. Some of these alternative procedures are
  discussed in the  following  references on  detection limit problems: Bishop,  1985; Clayton et
  al., 1986; Gilbert, 1981; Gilliom  and Helsel, 1986;  Helsel and Gilliom, 1986; and Gleit,
  1985.
               Missing Values

               Missing  concentration values are different from below-detection measure-
 ments in that no information about the missing concentration (either above or below the
 detection level) is known.  Missing values may be due to many factors, including either (1)
 non-collection of the scheduled sample, (2) loss of the sample before it is analyzed due to
 shipping or lab problems, or (3) loss of the lab results due to improper recording of results
 or loss of the data records.

               In general, this problem can be minimized with appropriate planning and
 backup procedures and by using proper chain-of-custody procedures, careful
 packaging and handling, clear labeling, and keeping copies of important records.

               If the sample is lost shortly after collection, it is recommended that another
 sample be collected immediately to replace the lost sample as long as the time between the
 lost and replacement sample is less than half the time between successive samples specified
 in the sample design. Any deviations from the sampling design, including lost and replace-
 ment samples, should be reported with the data and analysis. The replacement or substitu-
 tion of missing data by numerical values is never recommended.
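
The timing rule for replacement samples can be expressed as a simple check. The function name and the dates below are hypothetical; the rule itself is the one stated above, that the replacement must follow the lost sample by less than half the scheduled interval between successive samples.

```python
from datetime import datetime, timedelta


def replacement_sample_allowed(lost_sample_time, replacement_time, sampling_interval):
    """True if the replacement falls within half the scheduled sampling interval."""
    elapsed = replacement_time - lost_sample_time
    return timedelta(0) <= elapsed < sampling_interval / 2


# Hypothetical schedule: samples every 30 days, one sample lost on July 1.
lost = datetime(1992, 7, 1)
interval = timedelta(days=30)
print(replacement_sample_allowed(lost, lost + timedelta(days=10), interval))
print(replacement_sample_allowed(lost, lost + timedelta(days=20), interval))
```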


               Outliers

               In many statistical texts, measurements that are (1) very large or small
 relative to the rest of the  data,  or (2) suspected of being unrepresentative  of the true concen-
 tration at the sample location are often called "outliers." Observations which appear to be
unusual may correctly represent unusual concentrations in the field,  or may result from
 unrecognized handling problems, such as contamination, lab measurement, or data
 recording errors.  If a particular observation is suspected to be in error,  the error should be
 identified and corrected, and the corrected value used in the analysis.  If no such verifica-
 tion is possible, a  statistician  should be consulted  to provide modifications to  the statistical
 analysis that  account for the suspected "outlier." For more background on statistical
 methods to handle outliers, see Barnett and Lewis (1984).

               The handling of outliers is a controversial topic. In this document, all data
 not known to be in error are considered to be valid because:

                     The expected distribution  of concentration values may be skewed
                     (i.e., non-symmetric) so that large concentrations  which look  like
                     "outliers" to some analysts  may be legitimate;
                     The procedures recommended in this document are less sensitive to
                     extremely low  concentrations than to extremely  high  concentrations;
                     and
                     High concentrations are of particular concern for their potential
                     health  and  environmental impact.

 2.4          Limitations and Assumptions  of the Procedures Addressed in
              this Document
              Because a single document cannot adequately address the wide variety of
 situations found at all Superfund sites, this document will only discuss those statistical
 procedures that are applicable to most sites and  can be implemented without a detailed
 knowledge of statistical methods. Although the  procedures recommended here will be
 generally applicable, specific objectives or situations at some sites may require the use of
 other statistical procedures. Where possible problems are anticipated, the text will recom-
 mend consultation with a statistician.

              Due to the complex nature of conditions at Superfund sites, this document
 cannot address all statistical issues applicable either to Superfund sites or to assessing the
 attainment of cleanup standards. The discussion in this document is based on certain
 assumptions about what statistical tests will be required and what the situations at the site
 will be. For completeness, the major assumptions  are reviewed below.

                     The contaminants are known;
                     The ground water does not attain the cleanup standard until this
                     assumption (that is, the null hypothesis) is rejected using a statistical
                     test;
                     At the time of sampling for assessing attainment, there are no
                    reasons to believe the ground-water concentrations might increase
                     over time;
                     Locations of the monitoring and pumping (or treatment) wells are
                     fixed and are not to be specified as part of the statistical methods.
                     As a result, the attainment decision strictly applies only to the water
                     in the wells, not to the ground water in general. To draw general
                     conclusions about the ground  water, additional assumptions must be
                     made or  additional wells must be established; and
                     The  cleanup standard is  greater than the detection limit for all chemi-
                     cals to be tested.
2.5           Summary

              This guidance considers the variety and complexity of ground water condi-
tions at Superfund sites and provides procedures which can be used at most sites and under
most conditions. This chapter outlines some of the conditions found at Superfund sites and
some of the assumptions which have been made as a guide to the selection of statistical
procedures presented in later chapters.


              Errors are possible in evaluating whether a site attains the cleanup stan-
dards, resulting in false positive and false negative decisions. Statistical methods provide
approaches for balancing these two decision errors and allow extrapolation in a scientifi-
cally valid fashion.


              This chapter reviews briefly the statistical concepts that form a basis for the
procedures described in this guidance. These include:

                     false positive decision - a site is thought to be clean when it is not;

                     false negative decision - a site is thought to be contaminated when it
                     is not;

                     mean - the value that corresponds to the "center" of the concentra-
                     tion distribution;

                     Qth proportion or percentile - a value that separates the lower Q
                     percent of the measurements from the upper 100-Q percent of the
                     measurements;

                     confidence intervals - a sample-based estimate of a mean or
                     percentile which is expressed as a range or interval of values which
                     will include the true parameter value with a known probability or
                     confidence;

                     null hypothesis - the prior assumption that the contaminant concen-
                     trations in the ground water at the site do not attain the cleanup
                     standard;

                     hypothesis tests - a statistical procedure for assessing attainment of
                     the ground water by accepting or rejecting the null hypothesis on the
                     basis of data; and

                     power curve - for a specified statistical test and sample size, the
                     probability of concluding that the ground water attains the cleanup
                     standard versus true concentration.


              Unlike statistical tests in other circumstances, assessment of ground water
requires consideration of the correlation between measurements across time and space. As
a result of correlation across time, estimating the short-term and long-term concentrations
requires different procedures. The ground water is defined as attaining the cleanup stan-
dard if the statistical test indicates the long-term mean concentration or concentration
percentile at the site attains the cleanup standard.

              When many wells or contaminants  are assessed,   careful  consideration must
be given to the decision procedures which are used to combine data from separate wells or
contaminants in order to determine if the site as a whole attains all relevant cleanup stan-
dards.  How the data from separate wells are combined affects the interpretation of the
results  and the probability  of concluding that the  overall site attains the cleanup standard.  A
complete discussion of how to assess attainment using multiple wells is beyond the scope
of this  volume.

-------
   3.  SPECIFICATION  OF  ATTAINMENT OBJECTIVES
             This chapter discusses the  specification of the attainment objectives,  includ-
ing the specific procedures to be used to assess attainment. The sampling and analysis
plans, discussed in the next chapter, outline procedures to be used to assess attainment
consistent with the attainment objectives. The specification of objectives must be com-
pleted by personnel familiar with the following:

                    The characteristics of the ground water and contamination present at
                    the waste site;
                    The health and environmental risks  of the  chemicals involved; and
                    The costs of sampling, analysis and remediation.

             The flow  chart in Figure 3.1 summarizes the steps required to specify the
sampling and analysis objectives and shows where each step is discussed. In general,
specification of the attainment objectives for the site under investigation involves specifying
the  following items:

                    The wells to be sampled;
                    The sample collection and handling procedures;
                    The chemicals to be tested and the laboratory test methods to be
                    used;
                    The relevant cleanup standard for the chemicals under  investigation;
                    The parameter (e.g., the mean or a percentile) of the chemical
                    concentration distribution which is to be compared to the cleanup
                    standard;
                    The "false positive rate" for the statistical test (the confidence level
                    for protection against  adverse health and  environmental  risk);
                    The precision to be achieved; and
                    Any other secondary objectives for which the data are to be used
                    which may affect the choice of statistical procedure.

Figure 3.1     Steps in defining the attainment objectives

               Start
                 |
               Specify sample wells (Section 3.2)
                 |
               Specify the sample collection and handling procedures (Section 3.3)
                 |
               Specify the chemical to be tested (Section 3.4)
                 |
               Specify the parameter to compare to the cleanup standard (Section 3.5)
                 |
               Specify the probability of mistakenly declaring the sample area clean
               (Section 3.6)
                 |
               Specify the precision to be achieved (Section 3.7)
                 |
               Review all elements of the attainment objectives
                 |
               Are any changes in the attainment objectives required?

              The items which make up the attainment objectives are discussed in detail in
the following sections.


3.1          Data Quality Objectives

              The Quality Assurance Management staff within EPA has developed
requirements and procedures for the development  of Data Quality  Objectives (DQOs) when
environmental data are collected to support regulatory and programmatic decisions.
Although the DQOs are an important part of the attainment  objectives, they are discussed in
detail elsewhere and will not be addressed here. For more information, readers should
refer to U.S. EPA (1987a) and U.S. EPA (1987b).


3.2          Specification of the Wells to be Sampled

              Wells within the site will be monitored and evaluated with respect to the
applicable cleanup standards. Inferences extended from the sampled wells to the ground
water in general must be based on both available data and expert knowledge
about the ground-water system and not on statistical sampling theory. Careful
selection of the ground-water wells to be used for assessment is required to ensure that
attainment  of the cleanup standard in the sampled wells implies  to  all parties concerned that
the ground-water quality has been adequately  protected.

              Sections 2.2.3 and 2.3.5 provide more discussion on the implications of the
decision on which wells must attain the cleanup  standard.


3.3           Specification  of Sample  Collection and Handling  Procedures

              The results of any statistical analysis are only as good as the data on which
they are based. Therefore, an important objective for the sampling and analysis plan is to
carefully define all aspects of data collection and measurement procedures, including:

                    How  the ground-water sample is to be collected;
                    What equipment and procedures are to be used;
                    How the sample is to be handled between collection and
                    measurement;
                    How the laboratory measurements are to be made; and
                    What precision is to be achieved.

              One reference for guidance on these topics is The Handbook for Sampling
and Sample Preservation of Water and Wastewater (U.S. EPA,  1982).
3.4           Specification of the Chemicals to be Tested and Applicable
              Cleanup Standards

              The chemicals to be tested should be listed.  When multiple chemicals are
tested, this document assumes that all chemicals must attain the relevant  cleanup standard  in
order for the ground water from the well(s) to be declared clean.

              The term "cleanup standard" is a generic term for the value to which the
sample measurements must be compared. Throughout this document, the cleanup standard
will be denoted by Cs.  The cleanup standard for each chemical of concern must be stated
at the outset of the study. Cleanup standards are determined by EPA in the process  of
evaluating site-specific cleanup alternatives.  Final selection  of the cleanup standard
depends on many factors. These factors are discussed in Guidance on Remedial Actions
for Contaminated Ground Water at Superfund Sites [Interim Final] (U.S. EPA, 1988).
3.5           Specification of the Parameters to Test

              In order to define a statistical test to determine if the contaminant concentra-
tions in ground water well(s) attain the cleanup standard, the characteristic of the concen-
trations which is to be compared to the cleanup standard must be specified. Such character-
istics are called parameters.  The two parameters discussed in this document for testing
individual wells are the mean concentration and a  specified percentile of the concentrations
such as the median or the 90th percentile of the ground-water concentrations. The follow-
ing sections discuss the criteria for selecting the parameters to test. These parameters have
been defined previously in Section 2.3.1.

3.5.1        Selecting the Parameters to Investigate


              Criteria for selecting the parameter to use in the statistical attainment
decision are:

                     The criteria used to develop the risk-based  standards, if known;

                     Whether the effects of the contaminant being measured are acute or
                     chronic;

                     The relative sample sizes required;

                     The likelihood of finding concentration measurements below the
                     cleanup standard; and

                     The relative spread of the data.

              For example, if the cleanup standard is a risk-based standard developed for
the mean concentration over a specified period of time, it is logical that the cleanup standard
be compared to the mean concentration. Alternatively, if the cleanup standard is a risk-based
standard developed for extreme concentrations which should rarely be exceeded, it is
logical to test an upper percentile of the concentration distribution.

              Many considerations may go into the selection of the parameter to test.
Table 3.1 presents criteria and conditions that support or contradict the use of each
parameter.


              Some general rules for selecting the parameter to test are:


              (1)    If the chemical contaminant of concern has short-term or acute
                     effects on human health or the environment, testing of upper
                     percentiles is recommended, with higher percentiles being chosen
                     for testing when the distribution of contamination has a higher
                     coefficient of variation.

              (2)    If the chemical contaminant of concern has long-term or chronic
                     effects on human health or the environment, Table 3.2 shows the
                     recommended parameter based on the coefficient of variation of the
                     data and the likelihood of measurements below the detection level.

Table 3.1      Points to consider when trying to choose among the mean, upper
               proportion/percentile, or median

Parameter      Points to consider

Mean           1) Easy to calculate and to estimate a confidence interval.

               2) Useful when the cleanup standard has been based on consideration
                  of carcinogenic or chronic health effects or long-term average
                  exposure.

               3) Useful when the data have little variation from sample to sample or
                  season to season.

               4) If the data have a large coefficient of variation (greater than about
                  1.5), testing the mean can require more samples than testing an
                  upper percentile in order to provide the same protection to human
                  health and the environment.

               5) Can have high false positive rates with small sample sizes and
                  highly skewed data, i.e., when the contamination levels are generally
                  low with only occasional short periods of high contamination.

               6) Not as powerful for testing attainment when there is a large
                  proportion of less-than-detection-limit values.

               7) Is adversely affected by outliers or errors in a few data values.

Upper          1) Requiring that an upper percentile be less than the cleanup standard
Proportion/       can limit the occurrence of samples with high concentrations,
Percentile        depending on the selected percentile.

               2) Unaffected by less-than-detection-limit values, as long as the
                  detection limit is less than the cleanup standard.

               3) If the health effects of the contaminant are acute, extreme
                  concentrations are of concern and are best tested by ensuring that a
                  large proportion of the measurements are below a cleanup standard.

               4) The proportion of the samples that must be below the cleanup
                  standard must be chosen.

               5) For highly variable or skewed data, can provide similar protection of
                  human health and the environment with a smaller sample size than
                  when testing the mean.

               6) Is relatively unaffected by a small number of outliers.

Median         1) Has benefits over the mean because it is not as heavily influenced by
                  outliers and highly variable data, and can be used with a large
                  number of less-than-detection-limit values.

               2) Has many of the positive features of the mean, in particular its
                  usefulness for evaluating cleanup standards based on carcinogenic
                  or chronic health effects and long-term average exposure.

               3) For positively skewed data, the median is lower than the mean and
                  therefore testing the median provides less protection for human
                  health and the environment than testing the mean.

               4) Retains some negative features of the mean in that testing the median
                  will not limit the occurrence of extreme values.
Table 3.2      Recommended parameters to test when comparing the cleanup standard to
               the concentration of a chemical with chronic effects¹

                                 Proportion of the data with concentrations
                                         below the detection limit:
                                 Low                          High
                                 (Perhaps < 30%)              (Perhaps > 30%)

Large Coefficient                Mean or Upper Percentile     Upper Percentile
of Variation                     (Upper percentile requires
(Perhaps cv > 1.5)               fewer samples)

Intermediate Coefficient         Mean or Upper Percentile     Upper Percentile
of Variation
(Perhaps 1.5 > cv > .5)

Small Coefficient                Mean or Median               Median
of Variation
(Perhaps cv < .5)

¹ Based on Westat simulations and analysis summarized in an internal Westat memo.
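
The recommendations in Table 3.2 can be expressed as a small lookup, sketched below in Python. The function name and the returned strings are hypothetical. The cut-offs of 0.5 and 1.5 for the coefficient of variation and 30% for the proportion of data below the detection limit are the approximate ("perhaps") values from the table; reading the first column as "below about 30%" and the handling of values exactly on a boundary are assumptions of this sketch.

```python
def recommend_parameter(cv, fraction_below_detection):
    """Suggest a parameter to test for a chemical with chronic effects,
    following the approximate thresholds of Table 3.2.

    cv                        coefficient of variation of the data
    fraction_below_detection  proportion of data below the detection limit (0 to 1)
    """
    if cv < 0 or not 0 <= fraction_below_detection <= 1:
        raise ValueError("cv must be >= 0 and the fraction must be in [0, 1]")

    # "High" proportion of nondetects: assumed to mean more than about 30%
    many_nondetects = fraction_below_detection > 0.30

    if cv > 1.5:  # large coefficient of variation
        if many_nondetects:
            return "upper percentile"
        return "mean or upper percentile (upper percentile requires fewer samples)"
    if cv > 0.5:  # intermediate coefficient of variation
        return "upper percentile" if many_nondetects else "mean or upper percentile"
    # small coefficient of variation
    return "median" if many_nondetects else "mean or median"


print(recommend_parameter(cv=0.3, fraction_below_detection=0.10))
print(recommend_parameter(cv=2.0, fraction_below_detection=0.45))
```

Because the table itself labels every threshold with "perhaps", the output is a starting point for discussion, not a rule; the other criteria in Section 3.5.1 still apply.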
3.5.2        Multiple  Attainment Criteria


              In some situations two or more parameters might be chosen. For example,
both the mean and an upper percentile can be tested using the rule that the ground water
attains the cleanup standard if both parameters are below the cleanup standard.

              Other more complicated criteria may be used to assess attainment of the
cleanup criteria. Examples of multiple criteria are:


                     It is desirable that most of the ground-water samples have concen-
                     trations below the cleanup standard and that the concentrations
                     which are above the cleanup standard are not too large. This may be
                     accomplished by testing if the 75th percentile is below the cleanup
                     standard and the mean of those concentrations which are above the
                     cleanup standard is less than twice the cleanup standard. This com-
                     bination of tests can be performed with modifications of the methods
                     presented in this document.

                     It is desirable that the mean concentration be less than the cleanup
                     standard and that the standard deviation of the data be small. This
                     may be accomplished by testing if the mean is below the cleanup
                     standard and the standard deviation is below a specified value. This
                     document does not address testing the standard deviation, variance,
                     or coefficient of variation against a standard.


For testing of multiple criteria not discussed in the guidance document, consultation with a
statistician is recommended.
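
The first combined criterion above (75th percentile below the cleanup standard, and the mean of the exceeding concentrations below twice the cleanup standard) can be sketched in Python as follows. This sketch only compares sample point estimates with the standard; it is not a statistical test and does not control the false positive rate. The function names, the nearest-rank percentile convention, and the data values are assumptions made for illustration.

```python
import math
from statistics import mean


def nearest_rank_percentile(values, q):
    """Nearest-rank Qth percentile of the sample (assumed convention)."""
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_combined_criterion(concentrations, cleanup_standard,
                             q=75, exceedance_factor=2.0):
    """Check both parts of the combined criterion on sample estimates only."""
    percentile_ok = nearest_rank_percentile(concentrations, q) < cleanup_standard
    exceedances = [c for c in concentrations if c > cleanup_standard]
    # With no exceedances, the second condition is treated as satisfied
    exceedance_ok = (not exceedances or
                     mean(exceedances) < exceedance_factor * cleanup_standard)
    return percentile_ok and exceedance_ok


# Hypothetical measurements and cleanup standard (same arbitrary units)
data = [1, 2, 3, 4, 5, 6, 7, 12]
print(meets_combined_criterion(data, cleanup_standard=8))
```

A formal attainment decision for such a criterion would need to account for sampling error in both estimates, which is why consultation with a statistician is recommended.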
3.6           Specification of Confidence  Levels  for  Protection  Against
              Adverse Health and Environmental Risks
              In order to design a statistical test for deciding if the ground water attains the
cleanup standard, those specifying the sampling and analysis objectives must select the
false positive rate. This rate is the maximum probability that the test results will show the
ground water to be clean when it is actually contaminated. It is usually set at levels such as
0.10, 0.05, or 0.01 (that is, 10%, 5%, or 1%), depending on the potential consequences of
deciding that the ground water is clean when, in fact, it is not clean. While different false
positive rates can be used for each chemical, it is recommended that the same rate be used
for all chemicals being investigated.¹ For a further discussion of false positive rates see
Section 2.3.4 or Sokal and Rohlf (1981).

¹ When testing multiple chemicals from the same ground water samples, the overall false positive rate will
  be approximately the same as that for individual chemical tests if the concentrations of different chemicals
  are highly correlated. In situations when the concentrations are not highly correlated, the overall false
  positive rate for the entire site will be smaller than that specified for the individual chemicals.


3.7           Specification of the Precision to be Achieved

               Precision generally refers to the degree to which repeated measurements are
similar to one another. In this context it refers to the degree to which estimates from
different samples are similar to one another. Decisions based on precise estimates will usually be
the same from sample to sample. The desired precision of the statistical test is specified by
the desired confidence in the statistical decisions resulting from the statistical test.

               Specification of the precision to be achieved is required to completely define
the statistical test to use. The precision which is to be achieved can be defined by
specifying the parameter value for which the probability of a false negative decision is to be
controlled. For a definition of "false negative" see Section 2.3.4.

               To completely define the precision when testing the mean, the following
items must be specified:

                      α, the false positive rate;
                      Cs, the cleanup standard;
                      μ1, the mean concentration at which the false negative rate is to be
                      specified; and
                      β, the false negative rate at μ1.

               To completely define the precision when testing percentiles, the following
items must be specified:

                      α, the false positive rate;
                      Cs, the cleanup standard;
                      P0, the largest acceptable proportion of ground-water samples with
                      concentrations above the cleanup standard;
                      P1, the value of the proportion for which the false negative rate is to
                      be specified (comparable to μ1 when testing means); and
                      β, the false negative rate at P1.

               The specification of these items is discussed in detail in Chapter 2 of this
document and in Chapters 6 and 7 of Volume 1. The reader should refer to Volume 1 for
detailed instructions on how these items are to be specified.
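
To show how the four items for the mean interact, the Python sketch below uses a textbook normal-approximation calculation for a one-sided test of the mean. It is an illustration under strong assumptions: independent, approximately normal measurements and a known standard deviation (called sigma here, a symbol not defined in this section). Ground-water measurements are typically correlated across time, so this sketch is not a substitute for the sample size procedures given elsewhere in this guidance. The function names and numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def sample_size_for_mean(cs, mu1, sigma, alpha, beta):
    """Smallest n with false positive rate alpha at mean = cs and false
    negative rate at most beta at mean = mu1 (normal approximation)."""
    if not mu1 < cs:
        raise ValueError("mu1 must be below the cleanup standard cs")
    z_alpha = _Z.inv_cdf(1 - alpha)
    z_beta = _Z.inv_cdf(1 - beta)
    return math.ceil(sigma ** 2 * ((z_alpha + z_beta) / (cs - mu1)) ** 2)


def false_negative_rate(cs, mu1, sigma, alpha, n):
    """Probability of concluding 'contaminated' when the true mean is mu1."""
    z_alpha = _Z.inv_cdf(1 - alpha)
    return _Z.cdf(z_alpha - (cs - mu1) * math.sqrt(n) / sigma)


# Hypothetical inputs: Cs = 10, mu1 = 8, sigma = 4, alpha = 0.05, beta = 0.20
n = sample_size_for_mean(cs=10, mu1=8, sigma=4, alpha=0.05, beta=0.20)
print(n, round(false_negative_rate(10, 8, 4, 0.05, n), 3))
```

The sketch makes the trade-off visible: moving μ1 closer to Cs, or lowering α or β, raises the required number of samples.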
3.8          Secondary Objectives

              The sampling and analysis data may be used for purposes other than assess-
ing the attainment of the cleanup standards. For example, they may be used to determine
the relationship between concentrations of different  contaminants, to determine the  seasonal
patterns in the measurements, or to get measurements on a contaminant not being assessed.
These secondary  objectives may determine what procedure is used to  collect the samples or
how often the samples are collected.


3.9          Summary

              This chapter discussed the specification of the various items which make up
the attainment objectives. The objectives will be specified by EPA, regulatory agencies,
and others familiar with the site, the environmental and health risks, and the sampling and
remediation costs. As part of the objectives, careful consideration must be given to
defining the wells  to be tested, the ground-water sampling and analysis procedures, the
statistical parameter to be compared to the cleanup standard, and the precision and confi-
dence level desired. The attainment objectives provide the background for developing the
sampling and analysis  plans discussed  in Chapter 4.

-------
  4.  DESIGN OF THE SAMPLING AND ANALYSIS PLAN
             Once the attainment objectives are specified by program and subject matter
personnel, statisticians and hydrogeologists can be useful in designing important compo-
nents of sampling and analysis plans. The sampling plan specifies how the water samples
are to be collected, stored, and analyzed, and how many samples to collect. The analysis
plan specifies which of the statistical procedures presented in the following chapters are to
be used. The sampling and analysis plans are interrelated and must be prepared together.
The decision regarding attainment of the cleanup standard can be made only if the field and
laboratory procedures (in the sampling plan) provide data  that are representative of the
ground water and can provide the  parameter estimates (from the analysis  plan)  specified in
the attainment objectives.

             The specification of the sampling and analysis plans will depend on the
characteristics of the waste site and the evidence needed to evaluate attainment. The statisti-
cal methods must be consistent with the sample design and attainment objectives. If there
appears to be any  reason to use  different sample designs or analysis plans than those
discussed in this guidance, or if there is any reason to change either the sample design or
the analysis plan after field data collection has started, it is  recommended that a  statistician
be  consulted.
4.1          The Sample  Design

             The sample design, or sampling plan, outlines the procedure for
collecting the data, including the timing, location, and field procedures for obtaining each
physical water sample. The discussion here focuses on the timing of the sample collection
activities. Common types of sample design are random sampling and systematic sampling.
Either of these sample collection procedures can require a fixed number of samples or use
sequential sampling in which  the number of samples to be collected is not specified before
the sampling  period.

4.1.1        Random Sampling

              In a random sample design, samples are collected at random times throughout
the sampling period. For example, using simple random sampling, 48 sample collection
times might be randomly selected within a four-year sampling period. Using simple
random sampling, some years may have more samples than other years. One alternative to
simple random sampling is stratified random sampling, in which 12 samples are collected in
each of four years, with the sample times within each year being randomly selected. In
either case, the time interval between the collection of the
water samples will vary. Some samples may be collected within days of each other while
at other times there may be many months between samples.
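
The two designs in this example can be sketched in Python as follows. Sample times are expressed in years from the start of the sampling period; the function names, the fixed random seed, and the use of a uniform distribution for "random times" are assumptions of this sketch.

```python
import random


def simple_random_times(n_samples, n_years, rng):
    """Draw all sample times uniformly over the whole sampling period."""
    return sorted(rng.uniform(0, n_years) for _ in range(n_samples))


def stratified_random_times(per_year, n_years, rng):
    """Draw a fixed number of sample times at random within each year."""
    times = []
    for year in range(n_years):
        times.extend(year + rng.random() for _ in range(per_year))
    return sorted(times)


def counts_per_year(times, n_years):
    """Count how many sample times fall in each year of the period."""
    counts = [0] * n_years
    for t in times:
        counts[min(int(t), n_years - 1)] += 1
    return counts


rng = random.Random(1992)  # fixed seed only so the sketch is reproducible
simple = simple_random_times(48, 4, rng)
stratified = stratified_random_times(12, 4, rng)

print("simple random, samples per year:    ", counts_per_year(simple, 4))
print("stratified random, samples per year:", counts_per_year(stratified, 4))
```

The per-year counts for the simple random design generally differ from year to year, while the stratified design yields 12 samples in every year by construction. In both designs the gaps between consecutive samples are irregular.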

              Although random sampling has some advantages when calculating the
statistical results for short-term tests (Chapter 5), systematic sampling is generally
recommended for assessing attainment.


4.1.2        Systematic  Sampling

              Using a systematic sample with a random start, ground water samples are
collected at regular time intervals (such as every week, month, three months, or year)
starting from the first sample collection time, which is randomly determined. In this
document, the systematic sample with a random start will be referred to simply as a
systematic sample.

              When sampling ground water, a systematic sample is usually preferred over
a simple random  sample because:
                    Extrapolating from the sample period to future periods is easier with
                    a systematic sample than a simple random sample;
                    Seasonal cycles can be easily identified and accounted for in the data
                    analysis;
                    A systematic sample will be easier to administer because of the fixed
                    schedule for  sampling  times;  and
                    Most ground water samples  have been traditionally collected using a
                    systematic  sample.

              The procedures described in the following chapters assume that either a
systematic or random sample is used when collecting data for a short term test and that a
systematic sample is collected when assessing attainment. If other sample designs are
considered, consultation with a statistician is recommended. It should be noted that when
implementing  a systematic sample,  care must be taken to  capture any periodic seasonal
variations in the data.  The seasonal patterns in the data will repeat themselves (after adjust-
ing for measurement  errors) following a regular pattern.   For example, if ground water
measurements at a site exhibit seasonal fluctuations, following the  four seasons of the year,
collecting data  every six months may miss some important aspects  of the data, such as high
or low measurements, and could present a misleading picture of the status of the site.
Because many seasonal patterns will have a yearly cycle (due to yearly patterns in surface
water recharge) the text will often refer to the number of samples per year instead of the
number of samples per seasonal  cycle.

              One variation of the standard systematic sample uses a different random
start for each year's data. For example, if one water sample is collected each month, in the
first year samples might be collected on the 17th of each month and in the second year on
the 25th of each month, etc. This variation is preferred when there are large seasonal
fluctuations in the data.

              Follow  the steps below to specify the systematic sample design:

              (1)    Determine the period of any seasonal fluctuation (i.e., time period
                     between repeating patterns in the data). This period will usually be a
                     year. If no period is discernible from the data, the use of a one-year
                     period is recommended.
              (2)    Determine the number of ground water samples, n, to collect in each
                     year (seasonal cycle) and the corresponding sampling period
                     between samples. A minimum of four sample collections per year is
                     recommended.
              (3)    Specify the beginning of the attainment sampling period.
              (4)    Randomly select a sampling time during the first sampling period.
              (5)    Subsequent sampling should be at equal intervals of the sampling
                     period after the first sample is collected.
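
The steps above can be sketched in Python as a schedule of sampling times, measured in days from the beginning of the attainment sampling period. The function name, its parameters, the 365-day cycle length, and the uniform random start are assumptions of this sketch. The optional flag illustrates the variation in which a new random start is drawn for each year's data.

```python
import random


def systematic_schedule(samples_per_cycle, n_cycles, rng,
                        cycle_days=365.0, new_start_each_cycle=False):
    """Return sampling times in days from the start of the attainment period."""
    if samples_per_cycle < 1 or n_cycles < 1:
        raise ValueError("need at least one sample per cycle and one cycle")
    interval = cycle_days / samples_per_cycle   # step (2): time between samples
    start = rng.uniform(0, interval)            # step (4): random start
    times = []
    for cycle in range(n_cycles):
        if new_start_each_cycle and cycle > 0:
            start = rng.uniform(0, interval)    # variation: new start each cycle
        for k in range(samples_per_cycle):      # step (5): equal intervals
            times.append(cycle * cycle_days + start + k * interval)
    return times


rng = random.Random(4)  # fixed seed only so the sketch is reproducible
schedule = systematic_schedule(samples_per_cycle=4, n_cycles=3, rng=rng)
print([round(t, 1) for t in schedule])
```

With four samples per one-year cycle, every consecutive pair of sampling times is 91.25 days apart, and only the first time within the first interval is random.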

              In practice, the samples need not be collected precisely at the time called for
by the sampling interval. However, the difference between the scheduled sampling time
and the actual time of sampling should be small compared to the time between successive
samples. The schedule for subsequent samples should not be changed if one
sample is collected earlier or later than scheduled. An example of the procedure is presented
in Box 4.1.
                                     Box 4.1
            Example of Procedure for Specifying  a Systematic  Sample Design

       (1)    The seasonal cycle in the measurements is assumed to have a period
              of one year.

       (2)    Based on the methods in Chapter 8, it is decided to collect 6
              samples per year, one every two months.

       (3)    The attainment sampling period is to start on April 1, 1992.

       (4)    The first sampling time during the first two-month sampling period
              is randomly selected using successive flips of a coin. Each flip
              divides the portion of the sampling period being considered into
              two. Heads  chooses the earlier half, tails the later half. After 5
              flips, the chosen day for the first sample is April 15.

       (5)    Samples are scheduled to be collected the  15th of every other month.
              If one sample is collected on the 20th of a month, the subsequent
              sample  should still be targeted  for  the 15th of the appropriate month.
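
The coin-flip selection in step (4) of Box 4.1 amounts to repeatedly halving the first sampling period. The Python sketch below shows one way this could be carried out, followed by the every-other-month schedule of step (5). The function names are hypothetical, and the rule used to turn the final sub-interval into a calendar day (its midpoint, rounded down to whole days) is an assumption, since the box does not state one. The flip sequence is illustrative and is not meant to reproduce the April 15 result in the box.

```python
import calendar
from datetime import date, timedelta


def coin_flip_start(period_start, period_days, flips):
    """Halve the period once per flip: 'H' keeps the earlier half,
    'T' the later half. Returns a day inside the final sub-interval."""
    lo, hi = 0.0, float(period_days)
    for flip in flips:
        mid = (lo + hi) / 2
        if flip == "H":
            hi = mid
        elif flip == "T":
            lo = mid
        else:
            raise ValueError("flips must contain only 'H' or 'T'")
    # Assumed rule: take the midpoint of the final sub-interval, in whole days
    return period_start + timedelta(days=int((lo + hi) / 2))


def every_other_month(first, n_samples):
    """Target dates on the same day of the month, two months apart."""
    dates, year, month = [], first.year, first.month
    for _ in range(n_samples):
        last_day = calendar.monthrange(year, month)[1]
        dates.append(date(year, month, min(first.day, last_day)))
        month += 2
        if month > 12:
            month -= 12
            year += 1
    return dates


# First two-month sampling period: April 1 to May 31, 1992 (61 days)
first_sample = coin_flip_start(date(1992, 4, 1), 61, "HHTTH")
targets = every_other_month(first_sample, 6)
print(first_sample, [d.isoformat() for d in targets])
```

The target dates stay fixed even if an individual sample is collected a few days early or late, in line with step (5) of the box.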
4.1.3        Fixed  versus Sequential Sampling


              For most statistical tests or procedures, the statistical analysis is performed
after the entire set of water samples has been collected and the laboratory results are
complete. This procedure uses a fixed sample size test because the number of samples
to be collected is established and fixed before the sample collection begins. In sequential
testing, the water samples are analyzed in the lab and the statistical analysis is performed
as the sample collection proceeds. A statistical analysis of the data collected at any point in
time is used to determine whether another sample is to be collected or if the sampling
terminates. Sequential statistical tests for data collected using sequential sampling of ground
water are discussed in detail in Chapter 9.

4.2          The Analysis Plan

              As with the sampling plan, planning an approach to analysis begins before the
first physical sample is collected. The first step is to define the attainment objectives,
discussed in Chapter 3. If the mean is to be compared to cleanup standards, the statisti-
cal methods will be different than if a specified proportion of the samples must have
concentrations  below the cleanup  standard.  Second, the analysis plan must be developed in
conjunction with  the sampling plan discussed earlier in this chapter.

              Third, determine the appropriate sample size (i.e., the number of physical
samples to be collected) for the selected sampling and analysis plan. Whether using a fixed
sample size or sequential design, calculate the sample size for the fixed sample size test.
Use this sample size for comparing alternate plans. In some cases, the number of samples
is determined by economics and budget rather than an evaluation of the required accuracy.
Nevertheless, it is important to evaluate the accuracy associated with a prespecified number
of samples.

              Fourth, the  analysis plan will describe the  statistical evaluation of the  data.

              In many  cases, specification of the sampling and analysis plan will involve
consideration of several alternatives.  It may also be an iterative process as the plans are
refined. In cases where  the  costs of meeting the attainment objectives are not acceptable, it
may be necessary to reconsider those objectives.  When trying to balance cost and preci-
sion, decreasing  the precision can decrease the sampling and lab costs while increasing the
costs of additional remediation due to incorrectly concluding that  the ground water does  not
attain the cleanup standard. In this situation, consultation with a statistician, and possibly
an economist, is  recommended.

              Chapters 8  and 9 offer various statistical methods, depending on attainment
objectives and the sampling plan. Table 4.1 presents the locations in this document where
various combinations of analysis and sampling plans are discussed.

Table 4.1     Locations in this document of discussions of sample designs and analysis
             for ground water sampling

   Type of evaluation:        Continuous Data          Discrete Data
   Analysis method:           Test of the Mean         Test of Proportions

   Sample design
     Fixed Sample Size        Sections 8.3 and 8.4     Section 8.5
     Sequential               Sections 9.3 and 9.4     Section 9.5

4.3          Other Considerations for Ground Water Sampling and Analysis Plans
             At a minimum, all ground water sampling and analysis plans should
specify:
                    Sampling objective;
                    Sampling preliminaries;
                    Sample collection;
                    In-situ field analysis;
                    Sample preservation and analysis;
                    Chain of custody control;
                    Analytical procedures and quantitation limits;
                    Field and laboratory QA/QC plans;
                    Analysis procedures for any QC data;
                    Statistical analysis procedures; and
                    Interim and final statistics to be provided to project personnel.

             For more information on other considerations  in ground water sampling and
analysis, see RCRA  Ground Water Monitoring Technical Enforcement  Guidance Document
(EPA,  1986b).

4.4          Summary

              Design of the sampling and analysis plan requires specification of attainment
objectives by program and subject matter personnel. The sampling and analysis objectives
can be refined with the assistance of statistical expertise. The sample design and analysis
plans go together; therefore, the methods of analysis must be consistent with the sample
design, and both must be consistent with the characteristics of the data and the attainment
objectives.

              Types  of sample design include simple random sampling or systematic
sampling, and fixed sample size or sequential sampling. This guidance assumes the data
will be collected  using a systematic sample  when assessing attainment.

              Steps required to plan an approach to analysis are:
                     Specify the attainment objectives;
                     Develop the analysis plan in conjunction with the sampling plan;
                     Determine the appropriate sample size; and
                     Describe how the resulting data will be evaluated.

     5. DESCRIPTIVE STATISTICS AND HYPOTHESIS TESTING
              This chapter introduces the reader to some basic statistical procedures that
can be used both to describe (or characterize) a set of data and to test hypotheses and make
inferences from the data. The procedures use the mean or a selected percentile from a
sample of ground water measurements along with its associated confidence interval. The
confidence interval indicates how well the population (or actual) mean or percentile can be
estimated from the sample mean or percentile. These parameter estimates and their
confidence intervals can be useful in communicating the current status of a cleanup effort.
Methods of assessing whether the concentrations meet target levels are useful for evaluating
progress of the remediation. The statistical procedures given in this chapter are called
"parametric" procedures. These methods usually assume that the underlying distribution of
the data is known. Fortunately, the procedures perform well even when these assumptions
are not strictly true; thus they are applicable in many different field conditions (see
Conover, 1980). The text notes situations in which the statistical procedures are sensitive
to violations of these assumptions. In these cases, consultation with a statistician is
recommended.

              Calculations of means, proportions, percentiles, and their corresponding
standard errors and associated confidence intervals (measures of how precise these
estimated means, proportions, or percentiles are) will be described. The statistics and
inferential procedures presented in this chapter are appropriate only for estimating short-
term characteristics of contaminant levels. By "short-term characteristics" we mean
characteristics such as the mean or percentile of contaminant concentrations during the fixed
period of time during which sampling occurs. For example, data collected over a one year
period can be used to characterize the mean contaminant concentrations during the year.
Procedures for estimating the long-term mean and for assessing attainment are discussed in
Chapters 8 and 9. The distinction between the methods of this chapter and those given in
Chapters 8 and 9 is that inferences based on short-term methods apply only to the specified
period of sampling and not to future points of time. The procedures discussed in this
chapter can be used in any phase of the remedial effort; however, they will be most useful
during treatment, as indicated in Figure 5.1. For a further discussion of short- versus
long-term tests, see Section 2.3.2.

 Figure 5.1    Example scenario for contaminant measurements during successful remedial
              action
              Much of the material on means, percentiles, standard errors and confidence
 intervals has been previously presented in Volume I of this series of guidance documents.
 To avoid duplication, the discussion of these topics in this chapter is limited to the main
 points. The reader should refer to Volume I (Sections 6.3 and 7.3) for additional details.
              Some Notations and Definitions

              Unless stated otherwise, the symbols $x_1, x_2, \ldots, x_i, \ldots, x_N$ will be used in
 this manual to denote the contaminant concentration measurements for N ground-water
 samples taken at regular intervals during a specified period of time. The subscript on the
 x's indicates the time order in which the sample was drawn; e.g., $x_1$ is the first (or oldest)
 measurement while $x_N$ is the Nth (or latest) measurement. Collectively, the set of x's is
 referred to as a data set, and, in general, $x_i$ will be used to denote the ith measurement in the
 data set.

              The data set has properties which can be summarized by individual
 numerical quantities  such as the sample mean,  standard deviation or percentile
 (including the median).  In general, these numerical quantities are called sample
 statistics. The sample mean or median provides a measure of the central tendency of the
 data or the concentration around which the measurements cluster. The sample standard
 deviation provides a measure of the spread or dispersion of the data, indicating whether the
sample data are relatively close in value or somewhat spread out about the mean. The
sample variance is the square of the standard deviation. The computational formulas for
these quantities are given in subsequent sections.

              The observed sample of measurements, $x_1, x_2, \ldots, x_N$, is only one of the many
possible sets of samples that could have been obtained from a ground water well, so its
mean, standard deviation, or median represents just one of the many possible values that
could have been obtained. Different samples will lead to different values of the sample
mean, standard deviation, or median. This sample-to-sample variability is referred to as
sampling error or sampling variability and is used to characterize the precision of
sample-based estimates.

              The precision of a sample-based estimate is measured by a quantity known
as the standard error. For example, an estimate of the standard error of the mean will
provide information on the extent to which the sample mean  can  be expected to vary among
different sets of samples, each set collected during the same sample collection period. The
standard error can be used to construct confidence intervals. A confidence interval
provides a range of values within which we would expect the true parameter value to lie
with a specified level of confidence. Statistical applications requiring the use of standard
errors and confidence intervals are described in detail in the sections which  follow. The
standard error differs from the standard deviation in that the standard deviation measures
the variability of the individual observations about their mean while the standard error
measures the variability of the  sample mean among  independent samples.

              Throughout the remainder of this document, certain mathematical symbols
will be used. For reference,  some of the frequently-used symbols are summarized in
Table 5.1.

              Finally, note that the equations that follow assume that there are no missing
observations. If there are relatively few missing observations (i.e., five percent or less of
the data set have missing data for the chemical measurement under consideration), the
ground-water samples with missing data should be deleted from the data set. In this case,
all statistics should be calculated with the available data, where the "sample size" now
corresponds to the number of samples which have non-missing concentration values.
However, if more than five percent of the data are missing, a statistician should be
consulted. Additional comments regarding the treatment of the missing values will be
given in the sections where specific statistical procedures are being discussed.



Table 5.1    Summary of notation used in Chapters 5 through 9

   Symbol            Definition

   $x_i$             Contaminant measurement for the ith ground water sample. For
                     measurements reported as below detection, $x_i$ = the detection limit.

                     In the discussion of regression, the dependent variable, often the
                     sample collection time, sometimes the sample collection time after a
                     transformation.

   $m$               The number of years for which data were collected (usually the
                     analysis will be performed with data obtained over full year periods).

   $n$               The number of sample measurements per year (for monthly data, n =
                     12; for quarterly data, n = 4). This is also referred to as the number of
                     "seasons" per year.

   $N$               The total number of sample measurements (for data obtained over full
                     year periods with no missing values, N = nm).

   $x_{jk}$          An alternative way of denoting a contaminant measurement, where k =
                     1, 2, ..., m denotes the year; and j = 1, 2, ..., n denotes the sampling
                     period (season) within the year. If there are no missing values, the
                     subscript for $x_{jk}$ is related to the subscript for $x_i$ in the following
                     manner: i = (k-1)n + j.

   $\bar{x}$         The mean (or average) of the N ground water measurements.

   $s^2$             The variance of the N ground water measurements.

   $s$               The standard deviation of the N ground water measurements.

   $s_{\bar{x}}$     The standard error of the mean (this is calculated differently for long
                     and short term tests).

   Df                The degrees of freedom associated with the standard error of an
                     estimate.

   Cs                The cleanup standard relevant to the ground water and the contaminant
                     being tested.

   $P$               The "true" but unknown proportion of the ground water with
                     contaminant concentrations greater than the cleanup standard.

   $P_0$             The criterion for defining whether the sample area is clean or
                     contaminated using proportions. According to the attainment
                     objectives, the ground water attains the cleanup standard if the
                     proportion of the ground water samples with contaminant
                     concentrations greater than the cleanup standard is less than $P_0$, i.e.,
                     the ground water is clean if $P < P_0$.



5.1          Calculating the Mean, Variance, and Standard Deviation of the Data



            The basic equations presented in Box 5.1 for calculating the mean and
variance (or standard deviation) for a sample of data can be found in any introductory
statistics text (e.g., Sokal and Rohlf, 1981 or Neter, Wasserman, and Whitmore, 1982).
                                   Box 5.1
             Calculating Sample Mean, Variance and Standard Deviation

       Designate the individual data values from a sample of N observations as
       $x_1, x_2, \ldots, x_N$. The sample mean (or arithmetic average) of these
       observations, indicated by $\bar{x}$, is given by

           $$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{5.1}$$

       The equation for the sample variance, $s^2$, is

           $$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1}
                 = \frac{\sum_{i=1}^{N} x_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} x_i\right)^2}{N-1} \tag{5.2}$$

       The corresponding equation for the standard deviation of the data is

           $$s = \sqrt{s^2} \tag{5.3}$$

       Both the variance and standard deviation have N-1 degrees of freedom.
              The mean and standard deviation are descriptive statistics that provide
information about certain properties of the data set. The mean is a measure of the
concentration around which the individual measurements cluster (the location or central
tendency). The standard deviation (or equivalently, the variance) provides a measure of the
extent to which sample data vary about their mean.

              Note that samples with missing data should be excluded from these
 calculations,  in which  case  N equals  the number of samples  with non-missing
observations.  If more than five percent of the data have missing values, consult a
 statistician.

              The term "degrees of freedom," denoted by Df, can be thought of as a
 measure of the amount of information used to estimate the variance (or standard deviation)
 and thus reflects the precision of the estimate. For example, the variance and standard
 deviation calculated from formulas (5.2) and (5.3), respectively, are based on "N-1 degrees
 of freedom." For other estimates of variance (e.g., see Section 5.2.2 or 5.2.4), the
 associated degrees of freedom may be different. The degrees of freedom is used in
 calculating confidence intervals and performing hypothesis tests.
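
 As an illustration of the calculations described in this section, the following minimal
 Python sketch computes the sample mean, variance, standard deviation, and degrees of
 freedom after dropping missing values. The function name, the use of None to mark a
 missing measurement, and the data values are assumptions made for this example only.

```python
import math
import warnings


def describe(values):
    """Return (N, mean, variance, standard deviation, Df) for the non-missing values.

    Missing measurements are assumed to be recorded as None (an illustrative choice).
    """
    data = [v for v in values if v is not None]  # exclude samples with missing data
    n_obs = len(data)
    if n_obs < 2:
        raise ValueError("need at least two non-missing observations")
    if (len(values) - n_obs) / len(values) > 0.05:
        # The guidance recommends consulting a statistician in this situation.
        warnings.warn("more than five percent of the data are missing")
    mean = sum(data) / n_obs                                      # sample mean
    variance = sum((x - mean) ** 2 for x in data) / (n_obs - 1)   # sample variance
    std_dev = math.sqrt(variance)                                 # standard deviation
    return n_obs, mean, variance, std_dev, n_obs - 1


# Hypothetical concentrations, for illustration only.
concentrations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n_obs, mean, variance, std_dev, df = describe(concentrations)
print(n_obs, mean, round(variance, 4), round(std_dev, 4), df)
```

 The degrees of freedom returned here (N-1) apply only to the variance estimate of this
 section; other variance estimates in this chapter carry different degrees of freedom.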


 5.2          Calculating the Standard Error of the Mean

              The standard error of the mean (denoted by $s_{\bar{x}}$) provides a measure of the
 precision of the mean concentration obtained from ground-water samples that have been
 collected over a period of time. The standard error of a statistic (e.g., a mean) reflects the
 degree to which that statistic will vary from one randomly selected set of samples to another
 (each of the same size). Small values of $s_{\bar{x}}$ indicate that the mean is relatively precise,
 whereas large values indicate that the mean is relatively imprecise.

              A number of different formulas are available for calculating the standard
 error of the mean. The appropriate formula to use depends on the behavior of contaminant
 measurements over time and the sampling design used for sample collection. Four
 methods of calculating the standard error and the conditions under which they are
 applicable are discussed below. Care should be taken in each case to ensure that an
 appropriate estimation formula for the standard error is chosen. Appropriate formulas
 should be decided on a site-by-site basis.


              General rules for selecting the formula for calculating the standard
error of the mean include:

                     If the ground water samples are collected using a random sample,
                     use the formulas in Section 5.2.1 and Box 5.2.

                     If the ground water samples are collected using a systematic sample:

                            Use the formulas in Section 5.2.4 and Box 5.6 unless there
                            are no obvious seasonal patterns or the serial correlations in
                            the data are not significant.

                            Use the formulas in Section 5.2.2 and Box 5.3 if there are
                            obviously no seasonal patterns in the data but the data
                            might be correlated.

                            Use the formulas in Section 5.2.3 and Box 5.4 if there are
                            seasonal patterns in the data and serial correlations in the
                            residuals are not significant.

                            Use the formulas in Section 5.2.1 and Box 5.2 if there are
                            obviously no seasonal patterns in the data and serial
                            correlations in the data are not significant.

                            If there are trends in the data, consider using regression
                            methods (Chapter 6). If regression methods are not used
                            and the trends are small relative to the variation of the data,
                            the methods using differences (Sections 5.2.2 and 5.2.4) are
                            preferred over the other methods.

              Sections 5.3 and 5.6 discuss procedures for estimating the serial
correlation and statistical tests for determining whether it is significant.
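
              The selection rules above can be summarized as a small decision function. The
following Python sketch is one possible reading of those rules, not a substitute for
site-by-site judgment; the function name and its three yes/no arguments are assumptions
introduced for this example, and trends in the data are not handled.

```python
def standard_error_method(random_sample, seasonal, serially_correlated):
    """Return the section and box suggested by the general rules.

    random_sample:        True if the samples were collected as a random sample.
    seasonal:             True if seasonal patterns are present or expected.
    serially_correlated:  True if serial correlation is (or might be) significant.
    """
    if random_sample:
        return "Section 5.2.1, Box 5.2"
    # Systematic sample from here on.
    if seasonal and serially_correlated:
        return "Section 5.2.4, Box 5.6"
    if seasonal:
        return "Section 5.2.3, Box 5.4"
    if serially_correlated:
        return "Section 5.2.2, Box 5.3"
    return "Section 5.2.1, Box 5.2"


# Example: quarterly systematic sampling with seasonal patterns and correlated residuals.
choice = standard_error_method(random_sample=False, seasonal=True, serially_correlated=True)
print(choice)
```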
5.2.1        Treating the Systematic Observations as a Random Sample


              The simplest method of estimating the standard error is to treat the
systematic sample as a simple random sample (see Section 4.1). In this case, the standard
error of the mean (denoted by $s_{\bar{x}}$) is given by the equations in Box 5.2. Formula (5.4) will
provide a reasonably good estimate of the standard error if the contamination is distributed
randomly with respect to time. The formula may overstate the standard error if there are
trends in contamination over time, seasonal patterns, or if the data are serially correlated.

                                     Box 5.2
                   Calculating the Standard Error Treating the Sample
                            as a Simple Random Sample

           $$s_{\bar{x}} = \frac{s}{\sqrt{N}} \tag{5.4}$$

       where s is the standard deviation of the data as computed from equation
       (5.3) and N is the number of non-missing observations. Equation (5.4) is
       equivalent to

           $$s_{\bar{x}} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N(N-1)}} \tag{5.5}$$

       The degrees of freedom for this estimate of the standard error is N-1.
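
       The two forms in Box 5.2 (the standard deviation divided by the square root of N,
       and the sum of squared deviations divided by N(N-1) under a square root) can be
       checked numerically. The sketch below is illustrative only; the function name and
       the data are assumptions, not part of this guidance.

```python
import math


def standard_error_srs(data):
    """Standard error of the mean treating the data as a simple random sample.

    Returns both algebraically equivalent forms and the degrees of freedom (N-1).
    """
    n_obs = len(data)
    mean = sum(data) / n_obs
    sum_sq_dev = sum((x - mean) ** 2 for x in data)
    s = math.sqrt(sum_sq_dev / (n_obs - 1))                 # standard deviation
    se_from_s = s / math.sqrt(n_obs)                        # s divided by sqrt(N)
    se_direct = math.sqrt(sum_sq_dev / (n_obs * (n_obs - 1)))
    return se_from_s, se_direct, n_obs - 1


# Hypothetical concentrations, for illustration only.
se_from_s, se_direct, df = standard_error_srs([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(round(se_from_s, 4), round(se_direct, 4), df)
```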
5.2.2        Estimates  From  Differences Between Adjacent Observations

              Another method in common use is based on overlapping pairs of
consecutive observations. That is, observation 1 is paired with observation 2, 2 with 3, 3
with 4, and so on. This method often gives a more accurate estimate of the standard error
if the serial correlation between successive observations is high. The computational
formula for this estimate of the standard error is given in Box 5.3 (e.g., see Kish, 1965,
page 119 or Wolter, 1985, page 251).

              If the data are independent, that is, if the samples are collected using a
random sample or if the data have no seasonal patterns or serial correlations, the standard
error calculated using equation (5.6) will be less precise than that using equation (5.4).
Since most statistics textbooks assume that the data are independent, they
present only equation (5.4) for estimating the standard error of the mean. However, when
using a systematic sample, the data are rarely independent. When the data are not
independent, equation (5.4) may overestimate the standard error of the short-term mean.
In that case, equation (5.6) is preferred because it provides a less biased estimate of
the standard error of the short-term mean. Calculation of the standard error using the
differences between adjacent observations, equation (5.6), is not appropriate for estimating
the standard error of a long-term mean. Because systematic samples and short-term means
(i.e., the mean of the limited population being sampled) are often of interest in survey
sampling, equation (5.6) is more commonly used in the analysis of sample surveys.
                                      Box 5.3
            Calculating the Standard Error Using Estimates Between Adjacent
                                    Observations

           $$s_{\bar{x}} = \sqrt{\frac{\sum_{i=2}^{N} (x_i - x_{i-1})^2}{2N(N-1)}} \tag{5.6}$$

       The number of degrees of freedom for the standard error given by (5.6) is
       approximately $\frac{2N}{3}$, as suggested by DuMouchel, Govindarajulu and
       Rothman (1973). When using this formula, round the approximate degrees
       of freedom down to the next smallest integer.
              We suggest that this method of successive differences using overlapping
pairs be used to estimate the standard error of the mean unless there are obvious seasonal
patterns in the data, or seasonal patterns are expected. If there are seasonal patterns or
trends in the data,  equation (5.6)  will tend to overestimate  the standard error. If the sample
data  reflect  seasonal variation, the method for computing the standard error discussed in the
next  section should be employed.
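
              The successive-difference estimate can be sketched in a few lines of Python.
This is a minimal illustration under stated assumptions: the data are in time order with no
missing values, the function name and data are hypothetical, and the degrees of freedom
are taken as 2N/3 rounded down, which is the approximation assumed here for the method
of overlapping pairs.

```python
import math


def standard_error_successive_differences(data):
    """Standard error of the mean from differences between adjacent observations.

    data must be in the time order in which the samples were collected.
    """
    n_obs = len(data)
    if n_obs < 2:
        raise ValueError("need at least two observations")
    # Squared differences for the overlapping pairs (1,2), (2,3), ..., (N-1,N).
    sum_sq_diff = sum((data[i] - data[i - 1]) ** 2 for i in range(1, n_obs))
    se = math.sqrt(sum_sq_diff / (2 * n_obs * (n_obs - 1)))
    df = (2 * n_obs) // 3  # assumed approximation 2N/3, rounded down
    return se, df


# Hypothetical time-ordered concentrations, for illustration only.
se, df = standard_error_successive_differences([5.0, 6.0, 4.0, 5.0, 7.0, 6.0])
print(round(se, 4), df)
```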


5.2.3        Calculating the Standard Error After Correcting for  Seasonal


              The formulas given in the preceding sections for calculating the standard
error are not appropriate for data exhibiting seasonal variability. Seasonal variability is
generally indicated by a regular pattern that is repeated every year. For example,
Figure 5.2 shows 16 chemical observations taken at quarterly intervals. Notice that
beginning with the first observation, there is a fairly obvious seasonal pattern in the data.
That is, within each year, the first quarter observation tends to have the largest value, while
the third quarter observation tends to have the smallest value. Over the year, the general
pattern is for the concentration to start at a high value, decrease in the second quarter,
decrease again in the third quarter, and then increase in the fourth quarter.


Figure 5.2    Example of data from a monitoring well exhibiting a seasonal pattern

              [Plot not reproduced. Horizontal axis: Time (Quarter), 0 to 18. Vertical axis
              tick marks run from 2 to 8.]

              When the data exhibit regular seasonal patterns, the seasonal means should
be calculated separately and then used to "adjust" the sample data. Specifically, let $x_{jk}$
denote the observed concentration for the ground water sample taken from the jth time point
in year k. Let n be the number of "seasons" in a seasonal cycle. Note that if data are
collected every month, then we have n = 12 and j = 1, 2, ..., 12. However, if data are
collected quarterly, then we have n = 4 and j = 1, 2, 3, 4. In general, let j = 1, 2, ..., n;
and k = 1, 2, ..., $m_j$, where $m_j$ is the number of non-missing observations that are
available for season j. Note that $m_j$ will equal m (the number of years) for all j (i.e., for all
seasons) unless some data are missing. Even if the seasonal effects are relatively small, it
is recommended that the seasonal means be subtracted from the sample data. The presence
of "significant" seasonal patterns can be formally tested by means of analysis of variance
(ANOVA) techniques. A statistician should be consulted for more information about these
tests.

              The equations for the jth seasonal average (the average of the $m_j$ non-
missing sample observations for season j) and for the sample residual after correcting for the
seasonal means are given in Box 5.4. Additional discussion of methods for adjusting for
seasonality can be found in Statistical Analysis of Ground-Water Monitoring Data at RCRA
Facilities (EPA, 1989b).
                                      Box 5.4
                  Calculating Seasonal Averages and Sample Residuals

       The jth seasonal average is:

           $$\bar{x}_j = \frac{1}{m_j} \sum_{k=1}^{m_j} x_{jk} \tag{5.7}$$

       where $m_j$ is the number of non-missing observations available for season j.
       The sample residual after correcting for the seasonal means is defined by

           $$e_{jk} = x_{jk} - \bar{x}_j \tag{5.8}$$
              By subtracting the estimated seasonal means from the measurements, the
resulting values, $e_{jk}$ (or residuals), will all have an expected mean of zero, and the variation
of the $e_{jk}$ about the value zero reflects the general variation of the observations. Using the
residuals calculated from formula (5.8), the standard error of the mean can be calculated
from the equations in Box 5.5 (e.g., see Neter, Wasserman, and Kutner, 1985, pages 573
and 539). The term $s_e^2$ is referred to as the mean square error and is standard output in
many statistical computer packages (e.g., see Appendix E for details on using SAS to
calculate the relevant statistics).

                                    Box 5.5
           Calculating the Standard Error After Removing Seasonal Averages

       The standard error based on the residuals resulting from removing the
       seasonal averages is:

           $$s_{\bar{x}} = \sqrt{\frac{\sum_{j=1}^{n} \sum_{k=1}^{m_j} e_{jk}^2}{N(N-n)}} \tag{5.9}$$

       where

           $$N = \sum_{j=1}^{n} m_j \tag{5.10}$$

       The degrees of freedom associated with the standard error is Df = N-n.

       Note that equation (5.9) can also be written as:

           $$s_{\bar{x}} = \sqrt{\frac{s_e^2}{N}} \tag{5.11}$$

       where

           $$s_e^2 = \frac{\sum_{j=1}^{n} \sum_{k=1}^{m_j} e_{jk}^2}{N-n} \tag{5.12}$$

       The estimate of $s_e^2$ above is the same as the mean square error when using
       one-way analysis of variance.
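
       The seasonal adjustment and the resulting standard error can be sketched as
       follows: compute each seasonal average, subtract it to obtain residuals, divide the
       sum of squared residuals by N-n to obtain the mean square error, and take the
       square root of the mean square error divided by N. The function name, the
       list-of-lists input layout, and the data are assumptions made for this illustration.

```python
import math


def seasonal_standard_error(by_season):
    """Standard error of the mean after removing seasonal averages.

    by_season[j] holds the non-missing observations for season j, so the
    seasons may have different numbers of observations.
    """
    n_seasons = len(by_season)
    n_total = sum(len(obs) for obs in by_season)           # N = sum of the m_j
    seasonal_means = [sum(obs) / len(obs) for obs in by_season]
    residuals = [[x - avg for x in obs]                    # e_jk = x_jk - seasonal mean
                 for obs, avg in zip(by_season, seasonal_means)]
    sum_sq_resid = sum(e * e for season in residuals for e in season)
    mse = sum_sq_resid / (n_total - n_seasons)             # mean square error
    se = math.sqrt(mse / n_total)                          # standard error of the mean
    return seasonal_means, residuals, se, n_total - n_seasons


# Hypothetical quarterly data for two years, grouped by quarter.
quarters = [[8.0, 6.0], [5.0, 5.0], [2.0, 4.0], [6.0, 4.0]]
seasonal_means, residuals, se, df = seasonal_standard_error(quarters)
print(seasonal_means, round(se, 4), df)
```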
5.2.4       Calculating the Standard Error After Correcting for Serial Correlation

              If the serial correlation of the seasonally adjusted residuals is significant (see
Section 5.6), the formula in Box 5.6 should be used to compute the standard
error of the mean, $s_{\bar{x}}$.

                                     Box 5.6
           Calculating the Standard Error After Removing Seasonal Averages

       The standard error based on the residuals resulting from removing the
       seasonal averages is:

           $$s_{\bar{x}} = \sqrt{\frac{\sum_{i=2}^{N} (e_i - e_{i-1})^2}{2N(N-1)}} \tag{5.13}$$

       The degrees of freedom associated with the standard error given by formula
       (5.13) is approximately Df = $\frac{2(N-n)}{3}$. When using this formula, round the
       approximate degrees of freedom down to the next smallest integer. This
       equation results from applying equation (5.6) to the residuals from equation
       (5.8).
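
       As a rough illustration, the sketch below removes seasonal averages from a
       time-ordered series and then applies the successive-difference formula to the
       residuals. It assumes complete data (every season observed in every year), takes
       the degrees of freedom as 2(N-n)/3 rounded down, and uses a hypothetical function
       name and data; none of these choices is prescribed by this guidance.

```python
import math


def standard_error_seasonal_serial(data, n_seasons):
    """Successive-difference standard error applied to seasonally adjusted residuals.

    data:       time-ordered observations with no missing values.
    n_seasons:  number of seasons per year (4 for quarterly, 12 for monthly).
    """
    n_obs = len(data)
    # Seasonal average for season j uses every n_seasons-th observation starting at j.
    seasonal_means = []
    for j in range(n_seasons):
        season = data[j::n_seasons]
        seasonal_means.append(sum(season) / len(season))
    # Residuals stay in time order because they are computed in the original order.
    residuals = [x - seasonal_means[i % n_seasons] for i, x in enumerate(data)]
    sum_sq_diff = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n_obs))
    se = math.sqrt(sum_sq_diff / (2 * n_obs * (n_obs - 1)))
    df = (2 * (n_obs - n_seasons)) // 3  # assumed approximation, rounded down
    return residuals, se, df


# Hypothetical quarterly data for two years, in time order.
residuals, se, df = standard_error_seasonal_serial(
    [8.0, 5.0, 2.0, 6.0, 6.0, 5.0, 4.0, 4.0], n_seasons=4)
print(residuals, round(se, 4), df)
```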
5.3           Calculating Lag 1 Serial Correlation

              The serial correlation (or autocorrelation) measures the correlation of
observations separated in time. Consider the situation where the ground water concentrations are
distributed around an average concentration, with no long-term trend or seasonal patterns.
The ground water measurements will fluctuate around the mean due to historic fluctuations
in the contamination events and the ground water flows and levels. Even though the
measurements fluctuate around the mean in what may appear to be a random pattern, the
measurements in ground water samples taken close in time (such as on successive days)
will typically be more similar than measurements taken far apart in time (such as a year
apart). Therefore, measurements taken close together in time are more highly correlated
than measurements taken far apart in time. The extent to which successive measurements
are correlated is measured by the serial correlation. The presence of significant serial
correlation affects the standard error of the mean.

              If serial correlation is present in the data, statistical methods must be
selected which will provide correct results when applied to correlated data. Some of the
statistical procedures described in Chapters 5 through 9 require the calculation of the serial
correlation. In general, serial correlations need not be based on observations which
immediately follow one another in time sequence ("lag 1" serial correlations). Serial
correlations may be defined that are 2 time periods, 3 time periods, etc., apart. These are
referred to as "lag 2", "lag 3", or in general, "lag k" serial correlations. Serial correlations
are discussed more fully in Gilbert (1987), page 38 or Box and Jenkins (1976), page 26.
Only "lag 1" serial correlations will be considered in this document.

             To calculate the serial correlation, first compute the seasonally adjusted
residuals, $e_{jk}$, using the procedure described in Section 5.2.3. Order the $e_{jk}$'s
chronologically and denote the ith time-ordered residual by $e_i$. The serial correlation
between the residuals can then be computed as shown in Box 5.7 (see Neter, Wasserman,
and Kutner, 1985, page 456).
                                    Box 5.7
          Calculating the Correlation from the Residuals After Removing
                                Seasonal Averages
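       The sample estimate of the serial correlation of the residuals is:

       $$\hat{\phi}_{obs} = \frac{\sum_{i=2}^{N} e_i e_{i-1}}{\sum_{i=1}^{N} e_i^2} \qquad (5.14)$$

       where $e_i$, $i = 1, 2, \ldots, N$, are the residuals after removing seasonal averages,
       in the time order in which the samples were collected.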
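
The calculation in Box 5.7 can be sketched in Python. This is a minimal illustration, assuming the usual lag 1 estimator (the sum of products of successive residuals divided by the sum of squared residuals); the function name and the residual values are hypothetical and are not taken from this document.

```python
def lag1_serial_correlation(residuals):
    """Estimate the lag 1 serial correlation of time-ordered residuals.

    Assumes the residuals already have seasonal averages removed and are
    listed in the order in which the samples were collected.
    """
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    # Sum of products of each residual with the one immediately before it.
    numerator = sum(residuals[i] * residuals[i - 1]
                    for i in range(1, len(residuals)))
    # Sum of squared residuals over the whole series.
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Hypothetical residuals, in sampling order (illustration only).
example_residuals = [0.5, 0.3, -0.1, -0.4, -0.2, 0.1, 0.2, -0.4]
phi_obs = lag1_serial_correlation(example_residuals)
print(round(phi_obs, 3))
```

The value returned applies to the sampling interval of the data, so quarterly residuals give a quarterly correlation.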
              The serial correlation between successive observations, computed from
formula (5.14), depends on the time interval between collection of ground-water samples.
For example, for quarterly data, $\hat{\phi}_{obs}$ represents the correlation between measurements that
are taken three months apart, while, for monthly data, $\hat{\phi}_{obs}$ represents the correlation
between measurements that are taken one month apart. Correlations between observations
taken at different intervals will generally be different. For estimating sample sizes (Section
5.10) it will be convenient to work with the monthly serial correlation, i.e., the correlation
between observations that are one month apart. If the data are not collected at monthly
intervals, the formula in Box 5.8 can be used to convert $\hat{\phi}_{obs}$ to a monthly serial correlation
$\hat{\phi}$ (see Box and Jenkins, 1970, for more details). Equation (5.15) estimates the monthly
correlation from a correlation based on observations separated by t months. For example,
for a sample correlation calculated from quarterly data, t = 3. Equation (5.15) is based on
 assumptions about the factors which affect the correlations in the  measurements. These
 assumptions become more important as the frequency at which the observations are
 collected differs from monthly (see  Box and Jenkins, 1970, page 57  and  Appendix  D).
                                      Box 5.8
             Estimating the Serial Correlation Between Monthly Observations
        The estimated serial correlation between monthly observations based on a
        sample estimate of the serial correlation between observations separated by t
        months is:

        $$\hat{\phi} = \left(\hat{\phi}_{obs}\right)^{1/t} \qquad (5.15)$$
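
The conversion in Box 5.8 can be sketched as follows, assuming that equation (5.15) takes the t-th root of the observed correlation. The function name is hypothetical, and refusing negative correlations is a choice made for this sketch (a fractional power of a negative number is not a real number), not a rule stated in this document.

```python
def monthly_serial_correlation(phi_obs, months_between_samples):
    """Convert a serial correlation observed at a t-month spacing to a
    monthly serial correlation by taking the t-th root."""
    if months_between_samples <= 0:
        raise ValueError("the sampling interval must be positive")
    if phi_obs < 0:
        # Assumption of this sketch: negative estimates are not converted.
        raise ValueError("conversion is not defined for negative correlations")
    return phi_obs ** (1.0 / months_between_samples)


# Hypothetical example: a correlation of 0.2 estimated from quarterly data.
phi_monthly = monthly_serial_correlation(0.2, 3)
print(round(phi_monthly, 3))
```

Raising the monthly value back to the third power recovers the quarterly correlation, which is a quick check on the conversion.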
              With data from multiple wells, the estimates of serial correlations can be
 combined across wells to provide a better estimate when the following conditions are met:
              •      The contaminant concentration levels in the wells are similar;
              •      The wells are sampled at the same frequency;
              •      The wells are sampled for roughly the same period of time; and
              •      The wells are geographically close.
 Under these conditions, the combined estimate of serial  correlation is calculated by
averaging the estimates calculated for each well.
 5.4          Statistical  Inferences:   What can be Concluded from Sample
              Data
              The first two sections of this chapter dealt with the computation of several
 types of measures that can be used to characterize the sample data: means, standard errors,
 and serial correlation coefficients.  In addition to characterizing or describing one's data
 with summary statistics, it is often desirable to draw conclusions from the data, such as an
 answer to the question:  Is the mean concentration less than  the cleanup standard?

              A general approach to drawing conclusions from the data, also referred to as
making inferences from the data, uses a standard structure and process for making such
decisions, referred to as "hypothesis testing" in statistical literature. It can be outlined as
follows:

       1.     Make an assumption about the concentrations which you would like to
              disprove (e.g., the average population measure of a contaminant is greater
              than the cleanup standard of 2.0 ppm). This assumption represents
              your initial or null hypothesis about the current situation.
       2.     Collect a set of data, representing a random sample from the population of
              interest.
       3.     Construct a statistic from  the sample data. Assuming that the null
              hypothesis is true, calculate the expected distribution of the statistic.
       4.     If the value of the statistic is consistent with the null hypothesis, conclude
              that the null hypothesis provides an  acceptable description of the present
              situation.
       5.     If the value of the statistic  is highly unlikely given the assumed null
              hypothesis, conclude that  the null hypothesis  is incorrect.

              Of course, sample data may occasionally provide an estimate that is
somewhat different from the true value of the population parameters being estimated. For
example, the average value of the sample data could be, by chance, much higher than that
of the full population.   If the sample you happened to collect was substantially different
from the population, you  might draw the  wrong conclusion. Specifically, you  might
conclude that the value assumed in the null hypothesis had changed when it really had not.
This false conclusion  would have been arrived at simply by chance, by the luck of
randomly selecting a particular set of observations or data values. The probability of
incorrectly rejecting the null hypothesis by chance can be controlled in the hypothesis test.

              If the chance of obtaining a value of a test statistic beyond a specified limit
is, say, 5% if the null hypothesis is true, then if the sample value is beyond this limit you
have substantial evidence that the null hypothesis is not true. Of course, 5% of the time
when the null hypothesis is true a test statistic value will be beyond that specified limit.
This probability of incorrectly rejecting the null hypothesis is generally denoted by the
symbol $\alpha$ (alpha) in statistical literature. The person(s) making the decision specify the risk
of making this type of error (often referred to as a Type I error in statistical literature) prior
to analyzing the data. If one wishes to be conservative, one might choose $\alpha$ = .01, allowing
up to a one percent chance of incorrectly rejecting the null hypothesis. With less concern
about this type of error, one might choose $\alpha$ = .1. A common choice is $\alpha$ = .05.

              Many of the test procedures presented below use confidence intervals. A
confidence interval shows the range of values for the parameter of interest for which the
test statistics discussed above  would not result  in the  rejection of the null hypothesis.
5.5          The Construction  and  Interpretation of Confidence Intervals
              about Means

              A confidence interval is a range of values which will include the population
parameter, such as the population mean, with a known probability or confidence. The
confidence interval indicates how closely the mean of a sample drawn from a population
approximates the true mean of the population. Any level of confidence can be specified for
a confidence interval. For example, a 95 percent confidence interval constructed from
sample data will cover the true mean 95 percent of the time. In general, a $100(1-\alpha)$ percent
confidence interval will cover the true mean $100(1-\alpha)$ percent of the time. As indicated
above, the value of $\alpha$, the probability of a Type I error, must be decided upon and is usually
chosen to be small; e.g., 0.10, 0.05, or 0.01. The general form of a confidence interval
for the mean is shown in Box 5.9.
                                     Box 5.9
               General  Construction  of Two-sided  Confidence Intervals
       A two-sided confidence interval for a mean is generally of the form:

                              $$\bar{x} - t\,s_{\bar{x}} \quad \text{to} \quad \bar{x} + t\,s_{\bar{x}} \qquad (5.16)$$
             In equation (5.16) the product $t\,s_{\bar{x}}$ represents the distance (in terms of
sample standard errors) on either side of the sample average that is likely to include the true
population mean. One determines t from a table of the t-distribution giving the probability
that the ratio of (a) the difference between the true mean and the sample mean to (b) the
sample standard error of the mean exceeds a certain value. To determine t, you actually
 need to determine two parameters: $\alpha$, the probability of a Type I error, and Df, the number
 of degrees of freedom associated with the standard error. Thus, t is usually expressed as
 $t_{1-\alpha,Df}$, and the appropriate value of $t_{1-\alpha,Df}$ can be found from a table of the critical values
 of the t distribution using the row and column associated with the values of $1-\alpha$ and Df (see
 Appendix A).

              Given below are the formulas for one- and  two-sided  confidence limits for a
 population mean (Boxes 5.10 and 5.11). Here, the population (or "true")  mean is the
 conceptual average contamination over all possible ground-water samples taken during the
 specified time period. The one-sided confidence interval  (establishing an acceptable limit
 on the range of possible values for the population mean on only one side of the sample
 mean) can be used to test whether the ground water in the well for the (short-term) period
 of interest is significantly less than the cleanup standard. The two-sided version of the
 confidence interval can be used to characterize the ground-water contamination levels
 during the period of sampling.
                                    Box 5.10
                General Construction of One-sided Confidence Intervals
       The upper one-sided confidence limit for the mean is given by:

                               $$\mu_{U\alpha} = \bar{x} + t_{1-\alpha,Df}\,s_{\bar{x}} \qquad (5.17)$$
                                    Box  5.11
                   Construction  of Two-sided  Confidence Intervals
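       The corresponding two-sided confidence limits are given by:

                               $$\mu_{U\alpha/2} = \bar{x} + t_{1-\alpha/2,Df}\,s_{\bar{x}} \qquad (5.18)$$
       and
                               $$\mu_{L\alpha/2} = \bar{x} - t_{1-\alpha/2,Df}\,s_{\bar{x}} \qquad (5.19)$$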
             In equations (5.17) to (5.19), $1-\alpha$ is the confidence level associated with the
interval; $\bar{x}$ is the computed mean level of contamination; $s_{\bar{x}}$ is the corresponding standard
 error computed from the appropriate formula in Section 5.2; and Df is the number of
 degrees of freedom associated with $s_{\bar{x}}$. The degrees of freedom (Df) associated with the
 standard error depend on the particular formula used. Table 5.2 summarizes the various
 standard error formulas, their corresponding degrees of freedom, and the conditions under
 which they should be used. The appropriate value of $t_{1-\alpha,Df}$ can be obtained from
 Appendix Table A.1. Note that for two-sided intervals, the t-value used is $t_{1-\alpha/2,Df}$ rather
 than $t_{1-\alpha,Df}$.
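 Table 5.2    Standard error formulas, degrees of freedom, and conditions of use

   Standard error formula    Degrees of        Conditions under which
   (see section)             freedom (Df)      the formula should be used
   ----------------------    ------------      -------------------------------------
   Section 5.2.1             N-1               Data exhibit no seasonal patterns and
                                               no serial correlation
   Section 5.2.2             2N/3              Data exhibit no seasonal patterns, but
                                               may be serially correlated
   Section 5.2.3             N-n               Data exhibit a seasonal pattern, but
                                               no serial correlation
   Section 5.2.4             2(N-n)/3          Seasonally-adjusted residuals exhibit
                                               serial correlation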
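
The degrees of freedom listed in Table 5.2 can be looked up with a small helper. This is only a sketch: the function and argument names are hypothetical, n is assumed to be the number of seasons, and rounding the fractional cases down to a whole number is an assumption chosen to match the example in Box 5.13, where 2(47)/3 is reported as 31.

```python
import math


def degrees_of_freedom(n_samples, n_seasons=None, serially_correlated=False):
    """Degrees of freedom for the standard error of the mean.

    n_seasons=None means no seasonal pattern; otherwise it is the number
    of seasons (n).  Fractional results are rounded down, which is an
    assumption of this sketch.
    """
    if n_seasons is None:
        if serially_correlated:
            return math.floor(2 * n_samples / 3)      # 2N/3
        return n_samples - 1                           # N-1
    if serially_correlated:
        return math.floor(2 * (n_samples - n_seasons) / 3)  # 2(N-n)/3
    return n_samples - n_seasons                       # N-n


# 47 monthly samples, no seasonal pattern, possibly serially correlated.
print(degrees_of_freedom(47, serially_correlated=True))
```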
             The upper one-sided confidence limit $\mu_{U\alpha}$, defined in equation (5.17), can be
used to test whether the average contaminant level for ground-water samples collected
over a specified period of time is less than the cleanup standard, Cs (see Box 5.12).
Although the rules indicated below can be used to monitor cleanup progress, they should
not be used to assess attainment of the cleanup standard. Procedures for assessing
attainment are given in Chapters 8 and 9.
                                     Box 5.12
            Comparing the Short Term Mean to the Cleanup Standard Using

       For short-term means, the decision rule to be used to decide whether or not
       the ground-water concentration is less than the cleanup standard is the following:
       If $\mu_{U\alpha} < Cs$, conclude that the short-term mean ground-water contaminant
       concentration is less than the cleanup standard (i.e., $\mu < Cs$).
       If $\mu_{U\alpha} \geq Cs$, conclude that the short-term mean ground-water contaminant
       concentration exceeds the cleanup standard (i.e., $\mu \geq Cs$).
5.6           Procedures for Testing for Significant Serial Correlation

              Different statistical methods may be required if the data have significant
serial  correlations. The serial correlation can  be estimated using the procedures in  Box 5.7.
The Durbin-Watson test and the approximate large sample test in sections 5.6.1 and 5.6.2
can be used to test if the observed serial correlation, $\hat{\phi}_{obs}$, is significantly different from
zero.


5.6.1        Durbin-Watson Test

              The discussion here on  determining the existence  of serial  correlation in the
data assumes the knowledge of confidence  intervals and hypothesis testing. Sections 5.4
and 2.3.4 provide a discussion of these concepts,  if the reader would like  to review them.

              If there is no serial correlation between observations, the expected value of
$\hat{\phi}_{obs}$ will be close to zero. However, the calculated value of $\hat{\phi}_{obs}$ is unlikely to be zero even
if the actual serial correlation is zero. The Durbin-Watson statistic can be used to test
whether the observed value of $\hat{\phi}_{obs}$ is significantly different from zero. To perform the test
(e.g., see Neter, Wasserman, and Kutner, 1985, page 450), compute the statistic D shown
in Box 5.14.
                                   Box 5.13
                    Example:  Calculation of Confidence  Intervals

      Suppose that 47 monthly ground-water samples were collected over a
      period of slightly less than 4 years. The measurements for three of the
      samples were below the detection limit and were replaced in the analysis by
      the detection limit. Based on these data, the overall mean is .33. Since the
      data did not exhibit any seasonal patterns but were thought to be serially
      correlated, equation (5.6) was used to compute the standard error of the
      mean; i.e., $s_{\bar{x}}$ = .1025. The degrees of freedom associated with the
      standard error is 2N/3 = 2(47)/3 = 31. Hence, for a two-sided 99 percent
      confidence interval, $\alpha$ = 0.01 and $t_{.995,31}$ = 2.75 from Appendix Table A.1.

      The required confidence interval for the mean goes from $\bar{x} - t_{1-\alpha/2,Df}\,s_{\bar{x}}$ to
      $\bar{x} + t_{1-\alpha/2,Df}\,s_{\bar{x}}$, i.e., from [.33 - 2.75(.1025)] to [.33 + 2.75(.1025)], or
      from .048 to .612 ppm.

      For a one-sided 99 percent confidence interval, $\alpha$ = 0.01 and $t_{.99,31}$ = 2.457
      from Appendix Table A.1. The corresponding one-sided confidence
      interval goes from zero to

                                  $\mu_{U\alpha}$ = .33 + 2.457(.1025) = .58 ppm.

      Since the cleanup standard is Cs = 0.5 ppm, it is concluded that for the
      period of observation, there is insufficient evidence to conclude with
      confidence that the true mean ground-water concentration is less than the
      cleanup standard. This is the case even though the sample mean happens to
      be less than the cleanup standard. There is enough variability in the data
      that a true mean greater than 0.5 ppm cannot be ruled out.
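
The arithmetic in Box 5.13 can be reproduced with a few lines of Python. The sketch below takes the mean, standard error, and t-values directly from the box (the t-values are typed in by hand from Appendix Table A.1 rather than computed); the function names are illustrative only.

```python
def two_sided_interval(mean, std_error, t_value):
    """Two-sided confidence limits: mean -/+ t * standard error."""
    half_width = t_value * std_error
    return mean - half_width, mean + half_width


def upper_confidence_limit(mean, std_error, t_value):
    """One-sided upper confidence limit: mean + t * standard error."""
    return mean + t_value * std_error


# Values quoted in Box 5.13.
mean, std_error, cleanup_standard = 0.33, 0.1025, 0.5

lower, upper = two_sided_interval(mean, std_error, t_value=2.75)
one_sided_upper = upper_confidence_limit(mean, std_error, t_value=2.457)

# Decision rule of Box 5.12: the mean is judged below the standard only
# if the upper confidence limit is below the standard.
mean_below_standard = one_sided_upper < cleanup_standard

print(round(lower, 3), round(upper, 3))   # two-sided 99 percent limits
print(round(one_sided_upper, 2))          # one-sided 99 percent limit
print(mean_below_standard)
```

Because the one-sided limit is above 0.5 ppm, the sketch reaches the same conclusion as the box even though the sample mean itself is below the standard.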
                                   Box 5.14
                     Calculation of the Durbin-Watson Statistic
                     $$D = \frac{\sum_{i=2}^{N} (e_i - e_{i-1})^2}{\sum_{i=1}^{N} e_i^2} \qquad (5.20)$$

              If D < $d_U$, where $d_U$ is the upper "critical" value for the test given in
Appendix Table A-6 of the book by Neter, Wasserman, and Kutner, 1985 (pages 1086-1087),
conclude that there is a significant serial correlation. If D $\geq d_U$, conclude that there
is no serial correlation.¹ The Durbin-Watson statistic D is standard output in many
regression packages.
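
For readers without a regression package at hand, the statistic and the decision rule above can be sketched in Python. This assumes the standard Durbin-Watson form (sum of squared successive differences of the residuals divided by the sum of squared residuals). The residuals and the critical value `d_upper` below are placeholders for illustration; the actual critical value must be looked up in the cited table for the sample size and model in use.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic D for time-ordered residuals."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    # Sum of squared differences between successive residuals.
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


def serial_correlation_is_significant(d_statistic, d_upper):
    """Decision rule used in this section: significant if D < d_U."""
    return d_statistic < d_upper


# Hypothetical residuals and a placeholder critical value.
example_residuals = [0.5, 0.3, -0.1, -0.4, -0.2, 0.1, 0.2, -0.4]
d_stat = durbin_watson(example_residuals)
print(round(d_stat, 3),
      serial_correlation_is_significant(d_stat, d_upper=1.36))
```

Values of D near 2 correspond to little serial correlation, and values well below 2 correspond to positive serial correlation.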


5.6.2        An Approximate Large-Sample Test

              If N > 50, the following approximate test can be used in place of the
Durbin-Watson test (e.g., see Abraham and Ledolter, 1983, page 63).
                                    Box 5.15
              Large Sample Confidence Interval for the Serial Correlation
       Compute the lower and upper limits, $\hat{\phi}_L$ and $\hat{\phi}_U$, defined by

                                                                    (5.21)
       and
                                                                    (5.22)
If the interval from $\hat{\phi}_L$ to $\hat{\phi}_U$ does not contain the value 0, conclude that the serial correlation
is significant. Otherwise, conclude that the serial correlation is not significant.
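
The decision rule of Box 5.15 can be illustrated in Python. The limits in equations (5.21) and (5.22) are not legible in this copy, so the sketch below assumes a commonly used large-sample approximation in which the limits are the observed correlation plus or minus 2 divided by the square root of N. That form, the `multiplier` argument, and the function names are assumptions of this sketch and may differ from the equations in the box.

```python
import math


def serial_correlation_interval(phi_obs, n_samples, multiplier=2.0):
    """Approximate large-sample limits: phi_obs -/+ multiplier / sqrt(N).

    The multiplier of 2 is an assumed, commonly used approximation.
    """
    if n_samples <= 0:
        raise ValueError("the number of samples must be positive")
    half_width = multiplier / math.sqrt(n_samples)
    return phi_obs - half_width, phi_obs + half_width


def interval_excludes_zero(phi_obs, n_samples):
    """True when the interval does not contain 0 (significant correlation)."""
    low, high = serial_correlation_interval(phi_obs, n_samples)
    return not (low <= 0.0 <= high)


# Hypothetical example with N = 64 samples.
print(serial_correlation_interval(0.3, 64))
print(interval_excludes_zero(0.3, 64), interval_excludes_zero(0.2, 64))
```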


5.7           Procedures for Testing the Assumption of Normality

              Many of the procedures discussed in this manual assume that the sampling
and measurement error follow a normal distribution. In particular, the assumption of
normality is critical for the method of tolerance intervals described later in Section 5.8.
1 The decision rule used here is somewhat different from the usual Durbin-Watson test described in most
 text books. For the applications given in this manual, the recommended decision rule results in deciding
 that autocorrelation exists unless there is strong evidence to the contrary.  Also, the particular value of du
 to use depends on N and "p-1", where p is the number of parameters in the fitted model. See section
      for an example

 Thus, it will be important to ascertain whether the assumption of normality holds.  Some
 methods for checking the  normality  assumption are discussed below.


 5.7.1      Formal  Tests  for Normality

              The statistical tests used for evaluating whether or not the data follow a
 specified distribution are called "goodness-of-fit tests."² The computational procedures
 necessary for performing the goodness-of-fit tests that work best with the normal
 distribution are beyond the scope of this guidance document.  Instead, the user of this
 document should use one of the statistical packages that implements a goodness-of-fit test.
 SAS (the Statistical Analysis System) is one such statistical package. A good reference for
 these tests is the book on nonparametric statistics by Conover (1980), Chapter 6. There are
 many different tests for evaluating normality (e.g., D'Agostino, 1970; Filliben, 1975;
 Mage, 1982; and Shapiro and Wilk, 1985).  If a  choice is available, the Shapiro-Wilk or
 the Kolmogorov-Smirnov test with the Lilliefors  critical values is  recommended.


 5.7.2        Normal Probability Plots

              A relatively simple way of checking the normality of the data or residuals
 (such as those obtained from Box 5.4) is to plot the data or residuals ordered by size
 against their expected values under normality. The ith expected value will be called $EV_i$.
Such a plot is referred to as a "normal probability plot."

              If there are no seasonal effects, the residual $e_i$ is simply defined to be the
 difference between the observed value and the sample mean, i.e.,

                                     $$e_i = x_i - \bar{x}. \qquad (5.23)$$

 If seasonal variability is present, the residuals should be calculated from formula (5.8). In
 either case, the ith ordered residual, $e_{(i)}$, for i = 1, 2, ..., N, is defined to be the ith smallest
 value of the $e_i$'s (that is, $e_{(1)} \leq e_{(2)} \leq \ldots \leq e_{(i)} \leq \ldots \leq e_{(N)}$), and its expected value is given
 approximately by (SAS 1985):
2 These should not be confused with tests for assessing the fit of a regression model which are discussed
  later in Chapter 5.

                                                                       (5.24)

where $s_{res}$ is the standard deviation of the residuals and z(.) is given by formula
(5.25) below. If formula (5.23) applies (i.e., no seasonal effects are in evidence) and is
used to compute the residuals, then $s_{res}$ = s, where s is given by formula (5.3). If formula
(5.8) applies (requiring an adjustment for seasonal effects) and is used to compute the
residuals, then $s_{res}$ is the standard deviation given by formula (5.12). The function z(a) is defined
to be the upper 100a percentage point of the standard normal distribution and is
approximated by (Joiner and Rosenblatt 1975):

                                                                       (5.25)

              Under normality, the plot of the ordered residuals, $e_{(i)}$, against $EV_i$ should
fall approximately along a straight line. An example of the use of normal probability plots
is given in Section 6.X. For more rigorous statistical procedures for testing normality, use
the  "goodness-of-fit" tests mentioned  in  Figure  6.17.
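
The coordinates for a normal probability plot can be computed with the Python standard library. This sketch does not reproduce formulas (5.24) and (5.25), which are not legible in this copy. Instead it assumes one common choice of plotting position, (i - 3/8)/(N + 1/4), and uses the exact inverse of the standard normal distribution from `statistics.NormalDist` in place of a closed-form approximation. The residual values are hypothetical, and the sample standard deviation with an N-1 divisor is used as the scale, which is appropriate only when no seasonal adjustment was made.

```python
from statistics import NormalDist, stdev


def normal_plot_points(residuals):
    """Return (expected value, ordered residual) pairs for plotting.

    Assumes plotting positions (i - 0.375) / (N + 0.25); the expected
    values are scaled by the sample standard deviation of the residuals.
    """
    n = len(residuals)
    if n < 2:
        raise ValueError("at least two residuals are required")
    ordered = sorted(residuals)
    scale = stdev(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    expected = [scale * std_normal.inv_cdf((i - 0.375) / (n + 0.25))
                for i in range(1, n + 1)]
    return list(zip(expected, ordered))


# Hypothetical residuals; under normality the points fall near a line.
points = normal_plot_points([0.4, -0.2, 0.1, -0.5, 0.2])
for expected_value, residual in points:
    print(round(expected_value, 3), residual)
```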
5.8           Procedures for Testing Percentiles Using Tolerance Intervals

              This section describes a statistical technique for estimating and evaluating
percentiles of a concentration distribution. The technique is based on tolerance intervals
and is not recommended if there are seasonal or other systematic patterns in the data.
Moreover, this procedure is relatively sensitive to the  assumption that the data (or
transformed data) follow a normal distribution.  If it is suspected that a normal  distribution
does not adequately approximate the distribution of the data (even after transformation),
tolerance intervals should not  be used. Instead,  the procedure described later in  Section 5.9
should be  used.
5.8.1        Calculating a Tolerance Interval

              The Qth percentile of a distribution of concentration measurements is that
concentration value, say $x_Q$, for which Q percent of the concentration measurements are
less than $x_Q$ and (100-Q) percent of the measurements are greater than $x_Q$. For example,
if the value 3.2 represents the 25th percentile for a given population of data, 25% of the data
fall below the value 3.2 and 75% are above it. Since the data represent a sample (rather
than the population) of concentration values, it is not possible to determine the exact value
of $x_Q$ from the sample data. However, with normally distributed data, a $100(1-\alpha)$ percent
confidence interval around the desired percentile can be easily computed.

             Let $x_1, x_2, \ldots, x_N$ denote N concentration measurements collected during a
specified period of time. As explained in Section 2.3.7, values that are recorded as below
the detection limit should be assigned the minimum detectable value (DL). The sample
mean, $\bar{x}$, and the sample standard deviation, s, should initially be computed using the basic
formulas given in Section 5.1.

             Given Q and $\alpha$, the upper $100(1-\alpha)$ percent one-sided confidence limit for the true
percentile, $x_Q$, is given by:

                                  $$\hat{x}_Q = \bar{x} + ks \qquad (5.26)$$

where k is a constant that depends on N, $\alpha$, and $P_0$ = (100-Q)/100. The appropriate values
of k can be obtained from Appendix Table A.3. For values not shown in the table, see
Guttman (1970).
5.8.2        Inference:   Deciding if the True Percentile is Less than the
             Cleanup  Standard

             The upper confidence limit as computed from equation (5.26) can be
used to test whether the true (unknown) Qth percentile, $x_Q$, for a specified sampling period
is less than a value, Cs. The decision rule to be used to test whether the true percentile is
below Cs is:

             If $\hat{x}_Q < Cs$, conclude that the Qth percentile of ground-water contaminant
                    concentrations is less than Cs (i.e., $x_Q < Cs$).
             If $\hat{x}_Q \geq Cs$, conclude that the Qth percentile of ground-water contaminant
                    concentrations is not less than Cs and may be much greater than Cs.
                                    Box 5.16
         Tolerance Intervals: Testing for the 95th Percentile with Lognormal  Data

       Data for 20 ground-water samples were obtained to determine if the 95th
       percentile of the contaminant concentrations observed for a two-year period
       was below the cleanup standard of 100 ppm. A false positive rate of one
       percent ($\alpha$ = 0.01) was specified for the test. The data appeared to follow a
       lognormal distribution. Therefore, the logarithms of the data (the
       transformed data) were assumed to have a normal distribution and were
       analyzed. In the following discussion, x refers to the original data and y
       refers to the transformed data. Because the log of the data was used, the
       upper confidence limit on the 95th percentile of the data was compared to
       the log of the cleanup standard [ln(100) = 4.605].

       For the transformed data, the sample mean (the average of the logarithms)
       is:

                               $$\bar{y} = \frac{72.372}{20} = 3.619$$

       The standard deviation of the transformed observations, s, as calculated
       from equation (5.3), is 0.715.

       For N = 20, $\alpha$ = .01, and $P_0$ = 5%, k = 2.808 (from Appendix Table A.3).
       Finally, $\hat{y}_{95}$ can be calculated using equation (5.26):

                         $$\hat{y}_{95} = 3.619 + 2.808(.715) = 5.627$$

       Since 5.627 is greater than 4.605 (the cleanup standard in log units), it is
       concluded that the true 95th percentile may be greater than Cs.
5.9           Procedures for Testing Proportions


              An alternative statistical procedure for testing percentiles is based on the
proportion of water samples that have contaminant levels exceeding a specified value. As
was the case for the method of Section 5.8, this method is not recommended if there are
seasonal patterns in the data. If seasonal variability is present, consult a statistician. The
equations presented in this section apply if the acceptable proportion of contaminated
samples is less than 0.5 and large sample sizes are used.

              To apply this test, each sample ground-water measurement should be coded
as either equal to or above the cleanup standard, Cs (coded as "1"), or below Cs (coded as
"0"). The statistical analysis is based on the resulting coded data set of 0's and 1's. This
test can be applied to any concentration distribution (unlike the method of tolerance
intervals which applies only to normally distributed data) and requires only that the cleanup
standard be greater than the detection limit.

              Let $x_1, x_2, \ldots, x_N$ denote N concentration measurements collected during a
specified period of time. Corresponding to each measurement $x_i$, define a coded value
$y_i$ = 1 if $x_i$ is equal to or greater than the cleanup standard and $y_i$ = 0 otherwise. The proportion of
samples, p, at or above the cleanup standard can be calculated using the following equations:

                                     $$r = \sum_{i=1}^{N} y_i \qquad (5.27)$$

                                     $$p = \frac{r}{N} \qquad (5.28)$$

              Assuming that the observations are independent, the standard error of the
proportion, $s_p$, is given by:

                                 $$s_p = \sqrt{\frac{p(1-p)}{N}} \qquad (5.29)$$

Formula (5.29) will tend to underestimate the variance if the data have a significant positive serial
correlation.  If the data have significant serial correlations, we can use formula (5.6) with
the x's replaced by the y's. Note that formulas (5.29) and (5.6) should only be used if N
is large; i.e., if $N \geq 10/p$ and $N \geq 10/(1-p)$.


5.9.1        Calculating Confidence Intervals for Proportions
              For sufficiently large sample sizes (i.e., $N \geq 10/p$ and $N \geq 10/(1-p)$, that is, at
least 10 samples with measurements above the cleanup standard and 10 with measurements
below the cleanup standard), an approximate confidence interval may be constructed using
the normal approximation. If there is concern about the sample size N being too small
relative to p,  a statistician should  be  consulted.

              For large sample sizes, the one-sided $100(1-\alpha)$ percent upper confidence
 limit is given by:

                                $$p_{U\alpha} = p + z_{1-\alpha}\,s_p \qquad (5.30)$$

where p is the proportion of ground-water samples that have concentrations exceeding Cs,
and $z_{1-\alpha}$ is the appropriate critical value obtained from the normal distribution (see
Appendix Table A.2).

              The corresponding two-sided $100(1-\alpha)$ percent confidence limits are given
by:

                              $$p_{U\alpha/2} = p + z_{1-\alpha/2}\,s_p \qquad (5.31)$$
and
                              $$p_{L\alpha/2} = p - z_{1-\alpha/2}\,s_p \qquad (5.32)$$

where $z_{1-\alpha/2}$ is the appropriate critical value obtained from the normal distribution (see
Table A.2). The range of values from $p_{L\alpha/2}$ to $p_{U\alpha/2}$ represents a $100(1-\alpha)$ percent
confidence interval for the corresponding population proportion.
5.9.2        Inference: Deciding  Whether the Observed Proportion  Meets
              the Cleanup  Standard

              The upper confidence limit as computed from equation (5.30) can be used to
test whether the true (unknown) proportion, P, is less than a specified standard, $P_0$. The
decision rule to be used to test whether the true proportion is below $P_0$ is:

              If $p_{U\alpha} < P_0$, conclude that the proportion of ground-water samples with
                    contaminant concentrations exceeding Cs is less than $P_0$.
              If $p_{U\alpha} \geq P_0$, conclude that the proportion of ground-water samples with
                    contaminant concentrations exceeding Cs may be greater than or
                    equal to $P_0$.
                                     Box 5.17
                          Calculation of Confidence Intervals
        For 184 ground-water samples collected during an 8-year period, 11
        samples had concentrations greater than or equal to the cleanup standard.
        The proportion of contaminated samples is (equation 5.27):

                    $p = 11/184 = 0.0598$

        A one-sided confidence interval has an upper limit of (from equation 5.30):

                    $P_{U\alpha} = p + z_{1-\alpha}\, s_p$

        Assuming $\alpha = 0.05$ (i.e., 95 percent confidence), $z_{1-\alpha} = 1.645$. The
        standard error of p determined from formula (5.29) is $s_p = 0.0175$.
        The confidence interval is thus .0000 to .0598 + .0288, or .0000 to .0886.
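
The arithmetic in Box 5.17 can be checked with a short Python sketch. It assumes the standard error takes the usual binomial form, sqrt(p(1 - p)/n), which reproduces the value 0.0175 quoted in the box. The function name and the standard P0 = 0.10 used to illustrate the decision rule of Section 5.9.2 are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_upper_limit(exceed, n, alpha=0.05):
    """One-sided upper confidence limit for a proportion (normal approximation)."""
    p = exceed / n
    s_p = math.sqrt(p * (1.0 - p) / n)      # assumed form of the standard error of p
    z = NormalDist().inv_cdf(1.0 - alpha)   # critical value z_{1-alpha}
    return p, s_p, p + z * s_p


# Inputs from Box 5.17: 11 exceedances among 184 samples, alpha = 0.05.
p, s_p, upper = proportion_upper_limit(exceed=11, n=184, alpha=0.05)
print(f"p = {p:.4f}, s_p = {s_p:.4f}, upper limit = {upper:.4f}")

# Decision rule of Section 5.9.2, using a hypothetical standard P0 = 0.10.
P0 = 0.10
print("proportion is below P0" if upper < P0 else "proportion may be >= P0")
```

Because the sketch carries unrounded intermediate values, its upper limit can differ from the hand calculation in the fourth decimal place.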
5.9.3        Nonparametric Confidence Intervals Around a Median

              An alternative approach to testing proportions is to test percentiles. For
example, the following two approaches are equivalent: (a) testing to see if less than 50% of
the samples have contamination greater than the cleanup standard and (b) testing to see if
the median concentration is less than the cleanup standard. The method presented in this
section for testing the median can be extended to testing other percentiles; however, the
calculations can be cumbersome. If you wish to test percentiles rather than proportions, or
to test the median using confidence intervals other than those presented here, consultation
with a statistician is recommended.

              If the data do not adequately follow the normal distribution even after
transformation,  a nonparametric confidence interval  around  the  median can  be constructed.
The median concentration equals  the mean if the  distribution is symmetric (see Section
2.5). The nonparametric confidence interval for the median is generally  wider and  requires
more data than the corresponding confidence interval for the mean based on the normal
distribution. Therefore, the  normal or log-normal distribution interval should be used
whenever it  is appropriate.
              The nonparametric confidence interval for the median requires a minimum
of seven (7) observations in order to construct a 98 percent two-sided confidence interval,
or a 99 percent one-sided confidence interval. Consequently, it is applicable only for the
pooled concentration of compliance wells at a single point in time or for sampling to
produce a minimum of seven observations at a single well during the sampling period.

              The procedures below for construction of a nonparametric confidence
interval for the median concentration follow U.S. EPA (1989b). An example is presented
in Box 5.19.


              (1)    Within each well or group of wells, order the N data from least to
                    greatest, denoting the ordered data by $x_1, x_2, \ldots, x_N$, where $x_i$ is the
                    ith value in the ordered data. Ties do not affect the procedure. If
                    there are ties, order the observations as before, including all of the
                    tied values as separate observations. That is, each of the
                    observations with a common value is included in the ordered list
                    (e.g., 1, 2, 2, 2, 3, 4, etc.). For ties, use the average of the tied
                    ranks.

              (2)    Determine the critical values of the order statistics as follows. If the
                    minimum of seven observations is used, the critical values are 1 and 7.
                    Otherwise, find the smallest integer, M, such that the cumulative
                    binomial distribution with parameters N (the sample size) and p =
                    0.5 is at least 0.99. Table 5.3 gives the values of M and N+1-M
                    together with the exact confidence coefficient for sample sizes from
                    4 to 11. For larger samples, use the equation in Box 5.18.

              (3)    Once M has been determined, find N+1-M and take as the
                    confidence limits the order statistics $x_M$ and $x_{N+1-M}$. (With the
                    minimum seven observations, use $x_1$ and $x_7$.)

              (4)    Inference:  Deciding whether the site  meets the  cleanup standards.

                    After calculating the upper one-sided nonparametric confidence limit
                    $x_M$ from (3), use the following rule to decide whether the ground
                    water attains the cleanup standard:

                    If $x_M < Cs$, conclude the median ground water concentration in the
                    wells during the sampling period is less than the cleanup standard.

                    If $x_M \ge Cs$, conclude the median ground water concentration in the
                    wells during the sampling period is not less than the cleanup
                    standard.
Table 5.3     Values of M and N+1-M and confidence coefficients for small samples

              N        M        N+1-M      Two-sided confidence
              4        4          1              87.5%
              5        5          1              93.8%
              6        6          1              96.9%
              7        7          1              98.4%
              8        8          1              99.2%
              9        9          1              99.6%
             10        9          2              97.9%
             11       10          2              98.8%

                                  Box 5.18
                               Calculation of M

                    $M = \frac{N}{2} + 1 + z_{0.99}\sqrt{\frac{N}{4}}$                    (5.33)

      where $z_{0.99}$ is the 99th percentile from the normal distribution and equals
      2.33 (from Table A.2 in Appendix A).
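
As an illustration, the Python sketch below reproduces the two-sided confidence coefficients in Table 5.3 from the binomial distribution with p = 0.5 and evaluates the large-sample approximation of Box 5.18. It takes the approximation to be N/2 + 1 + z*sqrt(N/4), as evaluated in Box 5.19, and assumes the result is rounded up to the next integer (13.7 becomes 14 in that box). Both function names are hypothetical.

```python
import math


def two_sided_confidence(n, m):
    """Confidence that the median lies between x_(n+1-m) and x_(m)."""
    # The interval misses on one side when at most n - m of the n observations
    # fall below the median; each observation does so with probability 0.5.
    tail = sum(math.comb(n, k) for k in range(n - m + 1)) / 2 ** n
    return 1.0 - 2.0 * tail


def approximate_m(n, z=2.33):
    """Large-sample M from equation (5.33), rounded up (an assumption)."""
    return math.ceil(n / 2 + 1 + z * math.sqrt(n / 4))


# (N, M) pairs taken from Table 5.3.
for n, m in [(4, 4), (7, 7), (10, 9), (11, 10)]:
    print(n, m, n + 1 - m, f"{100 * two_sided_confidence(n, m):.1f}%")

# Sample size used in Box 5.19.
print(approximate_m(16))
```

The same tail sum can be used to check the exact confidence level of any other (N, M) pair before relying on the approximation.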
Table 5.4     Example contamination data used in Box 5.19 to generate nonparametric
             confidence interval

                             Well 1                      Well 2
             Sampling     Concentration              Concentration
             Date            (ppm)        Rank          (ppm)        Rank
             Jan. 1           3.17         (2)           3.52         (6)
                              2.32         (1)          12.32        (15)
                              7.37        (11)           2.28         (4)
                              4.44         (6)           5.30         (7)
             April 1          9.50        (13)           8.12        (11)
                             21.36        (16)           3.36         (5)
                              5.15         (7)          11.02        (14)
                             15.70        (15)          35.05        (16)
             July 1           5.58         (8)           2.20         (3)
                              3.39         (3)           0.00       (1.5)
                              8.44        (12)           9.30        (12)
                             10.25        (14)          10.30        (13)
             Oct. 1           3.65         (4)           5.93         (8)
                              6.15         (9)           6.39         (9)
                              6.94        (10)           0.00       (1.5)
                              3.74         (5)           6.53        (10)

5.10        Determining Sample Size for Short-Term Analysis and Other
             Data Collection Issues
             The discussion in Chapter 4 assumes that the number of ground-water
samples to be analyzed has been previously specified. In general, determination of the
number of samples to be collected for analysis must be done before collection of the
samples. The appropriate sample size for a particular application will depend upon the
desired level of precision, as well as on assumptions about the underlying distribution of
the measurements. Given below are some guidelines for determining sample size for
estimating means, percentiles, and proportions for short-term analyses. When assessing
whether remediation has indeed been successful, use the procedures discussed in Chapters
8 and 9 to determine the required sample size. Some discussion of various data collection
issues is also offered here.
                                    Box 5.19
             Example of Constructing Nonparametric Confidence  Intervals

       Table 5.4 contains concentrations of a contaminant in parts per million from
       two hypothetical wells. The data are assumed to consist of 4 samples taken
       each quarter for a year, so that 16 observations are available from each
       well. The data are not normally distributed, either as raw data or when
       log-transformed. Thus, the nonparametric confidence interval is used. The
       Cs is 25 ppm.

       (1)    The 16 measurements are ordered from the least to greatest within
             each well separately. The numbers in parentheses beside each
             concentration in Table 5.4 are the ranks or order of the observation.
             For example, in Well 1, the smallest observation is 2.32, which has
             rank 1. The second smallest is 3.17, which has rank 2, and so
             forth, with the largest observation of 21.36 having rank 16.

       (2)    The sample size is large enough so that the approximation (equation
             5.33) is used to find M:

                    $M = \frac{16}{2} + 1 + 2.33\sqrt{\frac{16}{4}} = 13.7 \approx 14$

       (3)    The approximate 95 percent confidence limits are given by the (N + 1
             - M)th ordered observation (16 + 1 - 14 = 3rd) and the Mth ordered
             observation (14th). For Well 1, the 3rd observation is 3.39 and the 14th
             observation is 10.25. Thus the confidence limits for Well 1 are
             (3.39, 10.25). Similarly for Well 2, the 3rd observation and the
             14th observation are found to give the confidence interval (2.20,
             11.02). Note that for Well 2 there were two values below the
             detection limit. These were assigned a value equal to the detection limit
             and received the two smallest ranks. Had there been three or more
             values below the detection limit, the lower limit of the confidence interval
             would have been the limit of detection because these values would
             have been the smallest values and so would have included the third
             order statistic.

       (4)    The upper limit of neither confidence interval exceeds the
             cleanup standard of 25 ppm. Therefore, the short-term median
             ground water concentrations are concluded to be less than the cleanup standard.
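
The steps of Box 5.19 can be traced with the following Python sketch, which uses the concentrations listed in Table 5.4. It assumes the approximation M = N/2 + 1 + 2.33*sqrt(N/4), rounded up, and is meant only for sample sizes large enough that M does not exceed N. The function name is hypothetical.

```python
import math

# Concentrations (ppm) transcribed from Table 5.4.
well_1 = [3.17, 2.32, 7.37, 4.44, 9.50, 21.36, 5.15, 15.70,
          5.58, 3.39, 8.44, 10.25, 3.65, 6.15, 6.94, 3.74]
well_2 = [3.52, 12.32, 2.28, 5.30, 8.12, 3.36, 11.02, 35.05,
          2.20, 0.00, 9.30, 10.30, 5.93, 6.39, 0.00, 6.53]


def median_interval(data, z=2.33):
    """Return the (lower, upper) order-statistic limits for the median."""
    n = len(data)
    m = math.ceil(n / 2 + 1 + z * math.sqrt(n / 4))  # equation (5.33), rounded up
    ordered = sorted(data)
    # Order statistics are numbered from 1 in the text; Python lists start at 0.
    return ordered[n + 1 - m - 1], ordered[m - 1]


cs = 25.0  # cleanup standard (ppm) from Box 5.19
for name, data in [("Well 1", well_1), ("Well 2", well_2)]:
    lower, upper = median_interval(data)
    verdict = "less than" if upper < cs else "not less than"
    print(f"{name}: ({lower}, {upper}); median is {verdict} Cs")
```

Sorting the raw values handles ties automatically, because the interval depends only on the ordered values and not on the ranks assigned to tied observations.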
5.10.1       Sample Sizes for Estimating a Mean


              In order to determine the sample size for estimating a mean, some
information about the standard deviation, $\sigma$ (or equivalently, the variance $\sigma^2$), of the
measurements of each contaminant is required. This parameter represents the underlying
variability of the conceptual population of contaminant measurements. The symbol "^" is
used to denote that $\hat{\sigma}$ is an estimate of $\sigma$. In practice, $\hat{\sigma}$ is obtained either from prior data or
by conducting a small preliminary investigation. Cochran (1977), pages 78-81, discusses
various approaches to determining a preliminary value for $\hat{\sigma}$. Some procedures that are
useful in ground-water studies are outlined below.
             Use of Data from a Comparable Period

             The value $\hat{\sigma}$ may be calculated from existing data that are comparable to the
data expected from the sampling effort. Comparable data will have a similar level of
contamination and be collected under similar conditions. For calculating the sample size
required for assessing attainment, one may be able to use data on contamination levels for
the wells under investigation from ground-water samples collected during the period in
which steady state is being established. Using the comparable data, the value $\hat{\sigma}$ may be
calculated using formula (5.3).


             Use of Data Collected Prior to Remedial Action

             If data from samples collected prior to remediation are available, the
variability of these sample measurements can be used to obtain a rough estimate of $\sigma$ using
the coefficient of variation. The coefficient of variation is defined to be the standard
deviation divided by the mean. Remediation will usually result in a lowering of both the
mean and the standard deviation of contamination levels. In this case, it might be
reasonable to expect the coefficient of variation to remain approximately constant, and
estimates of the coefficient of variation from the available data can be used to obtain $\hat{\sigma}$
as follows.

             Using these data, let $\bar{x}$ and s represent the sample mean and sample standard
deviation for data collected prior to remedial action, perhaps from a previous study.
Calculate $\bar{x}$ and s using the equations in Section 5.1. An estimate $\hat{\sigma}$ of the standard
deviation when cleanup standards are attained can be computed using the cleanup standard,
Cs, where

                    $\hat{\sigma} = \frac{Cs \cdot s}{\bar{x}}$                    (5.35)



             Conducting  a Preliminary Study After Remedial  Action


             The following approach can be used if there are no existing data on
contamination levels from which to estimate $\sigma$ and if there is time to collect preliminary data
before sampling begins.

             (1)     After achieving steady state conditions (see Chapter 7), collect a
                    preliminary sample of at least $n_1 = 8$ ground-water samples over a
                    minimum period of 2 years. Determine the contamination levels for
                    these samples. The larger the sample size and the longer the period
                    of time over which the samples are collected, the more reliable the
                    estimate of $\sigma$. A minimum of four samples per year is recommended
                    so that seasonal variation will be reflected in the estimate.

             (2)     From this preliminary sample, compute the estimated standard
                    deviation, s, of the contaminant levels. Use this standard deviation
                    as an estimate of $\sigma$.
                                   Box 5.20
              Estimating $\hat{\sigma}$ from Data Collected Prior to Remedial Action

       Suppose that the number of ground-water samples to be taken from a
       monitoring well prior to remedial action was limited to 10. The
       concentrations of total PAHs from the samples are:

       0.24, 2.93, 3.09, 0.14, 0.60, 4.20, 3.81, 2.31, 1.11, and 0.07

       Using equations (5.1) and (5.3), the mean concentration is $\bar{x}$ = 1.85 ppm
       and the standard deviation of the measurements is s = 1.60 ppm.

       With a cleanup standard of .5 ppm, the value of $\hat{\sigma}$ to use for determining
       sample size can be obtained from:

                    $\hat{\sigma} = \frac{Cs \cdot s}{\bar{x}} = \frac{.5 \times 1.60}{1.85} = .43$
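
The calculation in Box 5.20 can be reproduced with Python's standard library, as in the sketch below. It assumes that formula (5.3) is the usual sample standard deviation with an n - 1 divisor, which matches the value s = 1.60 quoted in the box. The variable names are illustrative only.

```python
from statistics import mean, stdev

# Total PAH concentrations (ppm) from Box 5.20, collected before remediation.
pah = [0.24, 2.93, 3.09, 0.14, 0.60, 4.20, 3.81, 2.31, 1.11, 0.07]
cs = 0.5  # cleanup standard (ppm)

x_bar = mean(pah)
s = stdev(pah)               # sample standard deviation (n - 1 divisor, assumed)
sigma_hat = cs * s / x_bar   # scale s by Cs / mean, i.e. constant coefficient of variation

print(f"mean = {x_bar:.2f}, s = {s:.2f}, sigma_hat = {sigma_hat:.2f}")
```

The estimate rests entirely on the assumption that the coefficient of variation stays roughly constant after remediation, so it should be treated as a planning value rather than a measured quantity.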
             A Rough Approximation  of the  Standard Deviation

             If there are no existing data to estimate $\sigma$ and a preliminary study is not
feasible, a very rough approximation for $\hat{\sigma}$ can still be obtained. The approximation is
rough because it is based on speculation and judgments concerning the range within which
the ground-water measurements are likely to fall. Because the approximation is based on
very little data, it is possible that the sample sizes computed from these approximations will
be too small to achieve the specified level of precision. Consequently, this method should
only be used if no other alternative is available.

             The approximation is based on the fact that the range of possible ground-
water measurements (i.e., the largest such value minus the smallest such value) provides a
measure of the underlying variability of the data. Moreover, if the frequency distribution of
the ground-water measurements of interest is approximately bell-shaped, then virtually all
of the measurements can be expected to lie within three standard deviations of the mean, so
that the range spans about six standard deviations. In
this case, if R represents the expected range of the data, an estimate of $\sigma$ is given by

                    $\hat{\sigma} = \frac{R}{6}$                    (5.36)

If the data are not bell-shaped, the alternative (conservative) estimate $\hat{\sigma} = R/5$ should be
used.

             Formula for Determining Sample Size for Estimating a Mean

             The equations for determining sample size require the specification of the
following quantities: Cs, $\mu_1$, $\alpha$, $\beta$, and $\hat{\sigma}$. Given these quantities, the required sample size can
be computed from the following formula (e.g., see Neter, Wasserman, and Whitmore,
1982, page 264 and Appendix F):

                    $n = \frac{\hat{\sigma}^2 (z_{1-\alpha} + z_{1-\beta})^2}{(Cs - \mu_1)^2} + 2$                    (5.37)

where $z_{1-\alpha}$ and $z_{1-\beta}$ are the critical values for the normal distribution with probabilities of
$1-\alpha$ and $1-\beta$ (Table A.2) and the factor of 2 is empirically derived in Appendix F.
              Strictly speaking, formula (5.37) applies to simple random sampling.
 However, the standard error of a mean based on a systematic sample will usually be less
 than or equal to the standard error of a mean based on a simple random sample of the same
 size. Therefore, using the sample size formula given above may provide greater precision
 than is required.
                                    Box 5.21
                        Example of Sample Size Calculations
             Following the example in Box 5.20, suppose that it is desired to be
       able to detect a difference of .2 ppm from the cleanup standard of .5 ppm
       (Cs = .5, $\mu_1$ = .3) with a power of .80 (i.e., $\beta$ = .20). Also suppose that $\hat{\sigma}$
       = .43 and $\alpha$ = .01.
             From tables of the cumulative normal distribution (Appendix Table
       A.2), we find that $z_{1-\alpha} = 2.326$ and $z_{1-\beta} = 0.842$. Then using
       formula (5.37),

                    $n = \frac{(.43)^2 (2.326 + .842)^2}{(.5 - .3)^2} + 2 = 48.4$

             Rounding up, the sample size is 49.
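
A minimal Python sketch of this calculation follows. It takes formula (5.37) to be n = sigma_hat^2 (z_{1-alpha} + z_{1-beta})^2 / (Cs - mu_1)^2 + 2, as read from the worked example in Box 5.21, and obtains the critical values from the standard library instead of Table A.2. The function and argument names are hypothetical. With the normal critical value for alpha = .01 (2.326) the formula evaluates to about 48.4, which rounds up to 49; entering 2.236 instead would give about 45.8.

```python
import math
from statistics import NormalDist


def sample_size_mean(cs, mu1, sigma_hat, alpha, beta):
    """Sample size from formula (5.37), before rounding up."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # z_{1-alpha}
    z_beta = NormalDist().inv_cdf(1.0 - beta)    # z_{1-beta}
    return sigma_hat ** 2 * (z_alpha + z_beta) ** 2 / (cs - mu1) ** 2 + 2


# Inputs from Box 5.21.
n_raw = sample_size_mean(cs=0.5, mu1=0.3, sigma_hat=0.43, alpha=0.01, beta=0.20)
n = math.ceil(n_raw)
print(f"n = {n_raw:.1f}, rounded up to {n}")
```

Because n grows with the square of the sum of the critical values, small transcription errors in a z value can shift the required sample size by several samples.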
5.10.2      Sample  Sizes for Estimating a Percentile Using  Tolerance
             Intervals
             To determine the required sample size for tests based on the procedure
described in Section 5.8, the following terms need to be defined: P0, P1, $\alpha$, and $\beta$ (e.g., see
Volume 1, Section 7.6). Once these terms have been established, the following quantities
should be obtained from Appendix Table A.2:

             $z_{1-\beta}$,    the upper $\beta$-percentage point of a normal distribution;
             $z_{1-\alpha}$,   the upper $\alpha$-percentage point of a normal distribution;
             $z_{1-P_0}$,  the upper P0-percentage point of a normal distribution; and
             $z_{1-P_1}$,  the upper P1-percentage point of a normal distribution.
             The sample size necessary to meet the stated objectives is then (see
Guttman, 1970):

                    $n = \left[\frac{z_{1-\beta} + z_{1-\alpha}}{z_{1-P_0} - z_{1-P_1}}\right]^2$                    (5.38)
                                   Box 5.22
                   Calculating Sample Size for Tolerance Intervals

             PCBs have contaminated the ground water near a former industrial
       site. The site managers have decided to use the procedures of Section 5.8 to
       help decide if the treatment can be terminated. Specifically, after discussion
       with ground-water experts, they decide to conclude that the treatment can be
       terminated if the 99th percentile of the PCB concentrations is less than Cs.
       That is, in the notation of Section 5.8, P0 = 1 - .99 = .01. They have also
       decided to set the false positive rate of the test to $\alpha$ = .05. Moreover, they
       have required the false negative rate to be no more than 20 percent ($\beta$ =
       0.20) when the actual proportion of contaminated samples is 0.5 percent
       (P1 = .005).
             From Appendix Table A.2, $z_{1-P_0} = z_{.99} = 2.326$; $z_{1-P_1} = z_{.995} = 2.576$;
       $z_{1-\alpha} = z_{.95} = 1.645$; and $z_{1-\beta} = z_{.80} = 0.842$. Using formula (5.38), the required
       sample size for each well is:

                    $n = \left[\frac{.842 + 1.645}{2.326 - 2.576}\right]^2 = \left[\frac{2.487}{-.250}\right]^2 = 98.96 \approx 99$

       where $z_{1-\alpha}$ and $z_{1-\beta}$ are critical values from the normal distribution
       associated with probabilities of $1-\alpha$ and $1-\beta$ (Appendix Table A.2).
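
The arithmetic of Box 5.22 can be sketched in Python as follows. The sketch takes formula (5.38) to be the squared ratio shown in the box and uses the three-decimal critical values quoted there. The function and argument names are hypothetical.

```python
import math


def sample_size_tolerance(z_beta, z_alpha, z_p0, z_p1):
    """Sample size from formula (5.38), before rounding up."""
    return ((z_beta + z_alpha) / (z_p0 - z_p1)) ** 2


# Critical values as quoted in Box 5.22 (rounded to three decimals).
n_raw = sample_size_tolerance(z_beta=0.842, z_alpha=1.645, z_p0=2.326, z_p1=2.576)
n = math.ceil(n_raw)
print(f"n = {n_raw:.2f}, rounded up to {n}")
```

The denominator is a small difference between two critical values, so the result is sensitive to rounding. Using more decimal places for the critical values can move the answer across an integer boundary, which is worth keeping in mind when the computed n sits close to a whole number.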
5.10.3      Sample  Sizes for Estimating Proportions


             The sample size required for estimating a proportion using the procedures of
Section 5.9 depends on the following quantities: P0, P1, $\alpha$, and $\beta$. Given these quantities,
the sample size can be computed from the following formula (e.g., see Neter, Wasserman,
and Whitmore, 1982, page 304):

                    $n = \left[\frac{z_{1-\beta}\sqrt{P_1(1-P_1)} + z_{1-\alpha}\sqrt{P_0(1-P_0)}}{P_0 - P_1}\right]^2$                    (5.39)

                                    Box 5.23
                 Sample Size Determination for Estimating Proportions

       At a site with corrosive residues in the top soil, much of the contaminated
       top soil has been removed. However, it is known that the contaminants
       have leached into the ground water. Wanting to minimize the possibility of
       future health effects, the site manager would like to know if, in the short
       term, she can be 95 percent confident ($\alpha$ = .05) that less than 10 percent (P0
       = .10) of the ground-water samples have concentrations exceeding the
       cleanup standard. The expected proportion of contaminated ground-water
       samples is very low, less than 5 percent. The manager wants to be 80
       percent confident ($\beta$ = 1 - .80 = .20) that the ground water will be declared
       clean if the proportion of contaminated ground-water samples is less than 5
       percent (P1 = .05).

              Using formula (5.39),

                    $n = \left[\frac{.842\sqrt{.05(.95)} + 1.645\sqrt{.10(.90)}}{.10 - .05}\right]^2 = 183.3$

       Rounding up gives a final sample size of 184.
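
As a check on Box 5.23, the Python sketch below evaluates the sample size for proportions. It assumes formula (5.39) has the form used in the box, with the critical values weighted by the binomial standard deviations at P1 and P0 and divided by P0 - P1. The function and argument names are hypothetical.

```python
import math


def sample_size_proportion(p0, p1, z_alpha, z_beta):
    """Sample size from formula (5.39), before rounding up."""
    numerator = (z_beta * math.sqrt(p1 * (1 - p1))
                 + z_alpha * math.sqrt(p0 * (1 - p0)))
    return (numerator / (p0 - p1)) ** 2


# Inputs from Box 5.23, with critical values from Appendix Table A.2.
n_raw = sample_size_proportion(p0=0.10, p1=0.05, z_alpha=1.645, z_beta=0.842)
n = math.ceil(n_raw)
print(f"n = {n_raw:.1f}, rounded up to {n}")
```

The required sample size rises quickly as P1 approaches P0, since the squared difference P0 - P1 appears in the denominator.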
5.10.4      Collecting the Data


             After the sample size and sampling frequency have been specified,
collection of the ground-water samples can begin. In collecting the samples, it is important
to maintain strict quality control standards and to fully document the sampling procedures.
Occasionally, a sample will be lost in the field or the lab. If this happens, it is best to try to
collect another sample to replace the missing observation before reaching the next sampling
period. Any changes in the sampling protocol should be fully documented.
              Data resulting from a sampling program can only be evaluated and
interpreted  with confidence  when adequate quality assurance methods and procedures  have
been incorporated into the program design.  An adequate quality assurance program
requires awareness of the sources  of error or variation associated  with each step of the
sampling  effort.

              If a timely and representative  sample of proper size and content is not
delivered to the analytical lab, the analysis cannot be expected to give meaningful results.
Failing to build in a quality assurance program often results in considerable money spent on
sampling and analysis only to find that the samples were not collected in a manner that
allows valid conclusions to be drawn from the resulting data.  Seen in its broadest sense,
the QA program should address the sample design selected,  the quality of the ground-water
samples,  and the care and skill spent on the preparation and testing of the samples.

              The samples should reflect what is actually present in the ground water.
Improper or careless collection of the samples is likely to increase the magnitude of the
sample collection error. Sample preparation also introduces quality control issues.

              While a full discussion of these topics is beyond the scope of this
document, the implementation of an adequate QA program is important.


5.10.5       Making Adjustments for Values Below the Detection Limit

              Sometimes the reported concentration for a ground-water sample will be
below the detection limit (DL) for the sampling and analytical procedure used.  The rules
outlined in Section 2.3.7 should be used to handle such measurements in the statistical
analysis.


5.11         Summary

              This chapter introduces the reader to some basic statistical procedures that
can be used both to describe (or characterize) a set of data and to test hypotheses and make
inferences from the data. The chapter discusses the calculation of means and proportions.
Hypothesis tests and confidence intervals are discussed for making inferences from the
data. The statistics and inferential procedures presented in this chapter are appropriate
for estimating short-term characteristics of contaminant levels. By "short-term
characteristics" we mean characteristics such as the mean or percentile of contaminant
concentrations during the fixed period of time during which sampling occurs. Procedures
for estimating the long-term mean and for assessing attainment are discussed in Chapters 8
and 9. The procedures discussed in this chapter can be used in any phase of the remedial
effort; however, they will be most useful during treatment.

              This chapter provided procedures for estimating the sample sizes required
for assessing the status of the cleanup effort prior to a final assessment of whether the
remediation effort has been successful. It also discussed briefly issues involved in data
collection.
    6.   DECIDING TO  TERMINATE TREATMENT  USING
                      REGRESSION  ANALYSIS
             The decision to stop treatment is based on many sources of information
including (1) expert knowledge of the ground water system at the site; (2) mathematical
modeling of how treatment affects ground water flows and contamination levels; and (3)
statistical results from the monitoring wells from which levels of contamination can be
modeled and extrapolated. This chapter is concerned with the third source of information.
In particular, it describes how one statistical technique, known as regression analysis,
can be used in conjunction with other sources of information to decide when to terminate
treatment. The methods given here are applicable to analyzing data from the treatment
period indicated by the unshaded portion of Figure 6.1. Methods other than regression
analysis, such as time series analysis (Box and Jenkins, 1970), can also be used.
However, these methods are usually computer intensive and require the assistance of a
statistician familiar with them.
Figure 6.1     Example Scenario for Contaminant Measurements During Successful
             Remedial Action
             [Figure: measured ground water concentration (vertical axis) plotted against
             date (horizontal axis), with the following events marked: Start Treatment;
             End Treatment; Start Sampling; End Sampling; Declare Clean or Contaminated.]
             Section 6.1 provides a brief overview of regression analysis and serves as a
review of the basic concepts for those readers who have had some previous exposure to the
subject. Section 6.2, the major focus of the chapter, provides a discussion of the steps
required to implement a regression analysis of ground water remediation data. Section 6.3
briefly outlines important considerations in combining statistical and nonstatistical
information.


6.1          Introduction to Regression Analysis

                Regression analysis is a statistical technique for fitting a theoretical curve to
  a set of sample data. For example, as a result of site clean-up, it is expected that
  contamination levels will decrease over time. Regression analysis provides a method for modeling
  (i.e., describing) the rate of this decrease. In ground-water monitoring studies, regression
  techniques can be used to (1) detect trends in contaminant concentration levels over time,
  (2) determine variables that influence concentration levels, and (3) predict chemical
  concentrations at future points in time. An example of a situation where a regression analysis
  might be useful is given in Figure 6.2, which shows a plot of chemical concentrations for
  15 monthly samples taken from a hypothetical monitoring well during the period of
  treatment. As seen from the plot, there is a distinct downward trend in the observed chemical
  concentrations as a function of time. Moreover, aside from some "random" fluctuation, it
  appears that the functional relationship between contaminant levels and time can be
  reasonably approximated by a straight line for the time interval shown. This mathematical
  relationship is referred to as the regression "curve" or regression model. The goal of a
  regression analysis is to estimate the underlying functional relationship (i.e., the model), assess
  the fit of the model, and, if appropriate, use the model to make predictions about future
  observations.

               In general,  the underlying regression model need not be linear. However,
 to fix ideas, it is useful to introduce regression methods in the context of the simple
 linear regression model of which the  linear relationship in Figure 6.2 is an example.
 Underlying assumptions,  required notation, and the basic framework for simple linear
 regression analysis are provided in Section 6.1.1.  Section 6.1.2 gives the formulas
 required to fit the regression model. Section 6.1.3 discusses how to evaluate the fit of the
 regression  model using the residuals.   Section  6.1.4 discusses how  some  important
 regression statistics can be used for inferential purposes (i.e., forming statistically
 defensible conclusions from the data).
 Figure 6.2    Example of a Linear Relationship Between Chemical Concentration
              Measurements and Time
              [Figure: chemical concentration measurements plotted against Time (Months).]
6.1.1        Definitions, Notation, and Assumptions
             Assume that a total of N ground water samples have been taken from a
monitoring well over a period of time for chemical measurement. Denote the sample
collection time for the ith sample as $t_i$ and the chemical concentration measurement in the ith
sample as $c_i$, where i = 1, 2, . . ., N. Let $y_i$ denote some function of the ith observed
concentration, for example, the identity function, $y_i = c_i$; the square root, $y_i = \sqrt{c_i}$; or the
log transformation, $y_i = \ln(c_i)$. Let $x_i$ denote time or a function of the time. For example, if
the "time" variable is the original collection time, $x_i = t_i$; if the time variable is the reciprocal
of the collection time, then $x_i = 1/t_i$, etc. If the samples are collected at regular time
intervals, then the time index, i, can be used to measure time in place of the actual collection
time, i.e., $x_i = i$ or $x_i = 1/i$ in the examples above. Note that the notation used in this
section is different from that introduced in Chapter 5.

             The simple linear regression model relating the concentration
measurements to time is defined by equation (6.1) in Box 6.1.
                                   Box 6.1
                         Simple Linear Regression Model

                    $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, 2, \ldots, N$                    (6.1)
             In equation (6.1), $\beta_0$ and $\beta_1$ are constants referred to as the regression
coefficients, or alternatively as the parameters of the model, and $\varepsilon_i$ is a random
error. The term "$y_i$" is often referred to as the dependent, response, or outcome variable.
In this document, the outcome variables of interest are contamination levels or related
measures. The term "xi" is also referred to as an independent or explanatory variable. The
independent variable (for example the collection time) is generally under the control of the
experimenter. The term N represents the number of observations or measurements on
which the regression model is based.

             The regression coefficients are unknown but can be estimated from the
observed data under the assumption that the underlying model is correct. The nonrandom
part of the regression model is the formula for a straight line with y-intercept equal to
$\beta_0$ and slope equal to $\beta_1$. In most regression applications, primary interest centers
on the slope parameter. For example, if $x_i = i$ and the slope is negative, then the model
states that the chemical concentrations decrease linearly with time, and the value of
$\beta_1$ gives the rate at which the chemical concentrations decrease.

             The random error, $\epsilon_i$, represents "random" fluctuations of the observed
chemical measurements around the hypothesized regression line, $y_i = \beta_0 + \beta_1 x_i$. It
reflects the sources of variability not accounted for by the model, e.g., sources of
variability due to unassignable or immeasurable causes. Regression analysis imposes the
following assumptions on the errors:

             (i) The $\epsilon_i$'s are independent;
             (ii) The $\epsilon_i$'s have mean 0 for all values of $x_i$;
             (iii) The $\epsilon_i$'s have constant variance, $\sigma^2$, for all values of $x_i$; and
             (iv) The $\epsilon_i$'s are normally distributed.
              These assumptions are critical for the validity of the statistical tests used in a
 regression analysis. If they do not hold, steps must be taken to accommodate any
 departures from the underlying assumptions. Section 6.2.3 describes some simple graphical
 procedures which can be used to study the aptness of the underlying assumptions and also
 indicates some corrective measures when the above assumptions do not hold.

              Interested readers should refer to Draper and Smith (1966) or Neter,
 Wasserman, and Kutner (1985) for more details on the theoretical  aspects of regression
 analysis.


 6.1.2        Computational Formulas for Simple Linear Regression

              The computational formulas for most of the important quantities needed in a
 simple linear regression analysis are summarized below. These formulas are given
 primarily for completeness, but have been written in sufficient detail so that they can be
 used by persons wishing to carry out a simple regression analysis without the aid of a
 computer, spreadsheet, or scientific calculator. Readers who do not need to know the
 computational details in a regression analysis should skip this section and go directly to
 Sections 6.1.3 and 6.1.4, where specific procedures for assessing the fit of the model and
 making inferences based on the regression model are discussed.

              Estimates of the slope, $\beta_1$, and intercept, $\beta_0$, of the regression line are
 given by the values $b_1$ and $b_0$ in equations (6.2) and (6.3) in Box 6.2. The statistics
 $b_1$ and $b_0$ are referred to as least squares estimates. If the four critical assumptions
 given in Section 6.1.1 hold for the simple linear regression model in Box 6.1, $b_1$ and $b_0$
 will be unbiased estimates of $\beta_1$ and $\beta_0$, and the precision of the estimates can be
 determined.

              The estimated regression line (or, more  generally, the fitted curve)
under the model is represented by  equation (6.4)  in Box 6.3.
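                                   Box 6.2
                        Calculating Least Squares Estimates

$$ b_1 = \frac{\sum_{i=1}^{N} x_i y_i - \frac{1}{N}\left(\sum_{i=1}^{N} x_i\right)\left(\sum_{i=1}^{N} y_i\right)}{\sum_{i=1}^{N} x_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} x_i\right)^2} = \frac{\sum_{i=1}^{N} x_i y_i - N\bar{x}\bar{y}}{\sum_{i=1}^{N} x_i^2 - N\bar{x}^2} \tag{6.2} $$

$$ b_0 = \frac{\sum_{i=1}^{N} y_i}{N} - b_1 \frac{\sum_{i=1}^{N} x_i}{N} = \bar{y} - b_1 \bar{x} \tag{6.3} $$

                                   Box 6.3
                             Estimated Regression Line

$$ \hat{y}_i = b_0 + b_1 x_i \tag{6.4} $$
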
The calculated value of $\hat{y}_i$ is called the predicted value under the model corresponding
to the value of the independent variable, $x_i$. The difference between the observed value,
$y_i$, and the predicted value, $\hat{y}_i$, is called the residual. The equation for calculating the
residuals is shown in Box 6.4. If the model provides a good prediction of the data, we
would expect the predicted values, $\hat{y}_i$, to be close to the observed values, $y_i$. Thus, the
sum of the squared differences $(y_i - \hat{y}_i)^2$ provides a measure of how well the model fits
the data and is a basic quantity necessary for assessing the model.
                                   Box 6.4

$$ e_i = y_i - \hat{y}_i \tag{6.5} $$

             Formally, we define the sum of squares due to error (SSE) and the
 corresponding mean square error (MSE) by formulas (6.6) and (6.7), respectively, in
 Box 6.5.
                                   Box 6.5
               Sum of Squares Due to Error and the Mean Square Error

$$ SSE = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \tag{6.6} $$

$$ MSE = \frac{SSE}{N - 2} \tag{6.7} $$

             As seen in the formulas in Box 6.2, the analysis of a simple linear regres-
sion model requires the computation of certain sums and sums of cross products of the
observed data values. Therefore, it is convenient to define the five basic regression quanti-
ties in Box 6.6.

             The estimated model parameters and SSE can be computed from these terms
using the formulas in Box 6.7.
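                               Box 6.6
       Five Basic Quantities for Use in Simple Linear Regression Analysis

$$ S_x = \sum_{i=1}^{N} x_i \tag{6.8} $$

$$ S_y = \sum_{i=1}^{N} y_i \tag{6.9} $$

$$ S_{xx} = \sum_{i=1}^{N} x_i^2 - \frac{S_x^2}{N} \tag{6.10} $$

$$ S_{yy} = \sum_{i=1}^{N} y_i^2 - \frac{S_y^2}{N} \tag{6.11} $$

$$ S_{yx} = \sum_{i=1}^{N} x_i y_i - \frac{S_x S_y}{N} \tag{6.12} $$

                               Box 6.7
            Calculation of the Estimated Model Parameters and SSE

$$ b_1 = \frac{S_{yx}}{S_{xx}} \tag{6.13} $$

$$ b_0 = \frac{S_y}{N} - b_1 \frac{S_x}{N} \tag{6.14} $$

$$ SSE = S_{yy} - \frac{S_{yx}^2}{S_{xx}} \tag{6.15} $$
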
         An example of these basic regression calculations is presented in Box 6.8.
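
The shortcut formula for SSE in Box 6.7 follows from the definition in equation (6.6). The short derivation below is a sketch that relies on the standard identities $S_{xx} = \sum (x_i - \bar{x})^2$, $S_{yy} = \sum (y_i - \bar{y})^2$, and $S_{yx} = \sum (x_i - \bar{x})(y_i - \bar{y})$, which are algebraically equivalent to the definitions in Box 6.6.

```latex
\begin{aligned}
\hat{y}_i &= b_0 + b_1 x_i = \bar{y} + b_1 (x_i - \bar{x})
  && \text{since } b_0 = \bar{y} - b_1 \bar{x} \\
SSE &= \sum_{i=1}^{N} (y_i - \hat{y}_i)^2
     = \sum_{i=1}^{N} \bigl[(y_i - \bar{y}) - b_1 (x_i - \bar{x})\bigr]^2 \\
    &= S_{yy} - 2 b_1 S_{yx} + b_1^2 S_{xx} \\
    &= S_{yy} - 2\,\frac{S_{yx}^2}{S_{xx}} + \frac{S_{yx}^2}{S_{xx}}
     = S_{yy} - \frac{S_{yx}^2}{S_{xx}}
  && \text{since } b_1 = S_{yx} / S_{xx}
\end{aligned}
```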
                                     Box 6.8
                 Example of Basic Calculations for Linear Regression

       Table 6.1 gives hypothetical water contamination levels for each of 15
       consecutive months. A plot of the data is shown in Figure 6.3. Using the
       formulas in Box 6.6, the following quantities were calculated:

       $S_x = 120$        $S_y = 137.4$        $\bar{x} = 8$        $\bar{y} = 9.16$
       $S_{xx} = 280$       $S_{yy} = 11.801$      $S_{yx} = -51.05$

       The estimated regression coefficients are then calculated as:

       $b_1 = -0.1823$                 $b_0 = 10.62$

       Therefore the fitted model is

             $\hat{y}_i = b_0 + b_1 x_i = 10.62 - .1823\, x_i$

       and the corresponding mean square error is

             $MSE = SSE/(N - 2) = .1918$.

       The straight line in Figure 6.4 is a plot of the fitted model.
Table 6.1      Hypothetical Data for the Regression Example in Figure 6.3

       Time (Month)      Contamination (PPM)
       ------------      -------------------
             1                  10.6
             2                  10.4
             3                   9.5
             4                   9.6
             5                  10.0
             6                   9.5
             7                   8.9
             8                   9.5
             9                   9.6
            10                   9.4
            11                   8.75
            12                   7.8
            13                   7.6
            14                   8.25
            15                   8.0
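
The hand calculations of Boxes 6.6 through 6.8 can be checked with a short script. The sketch below applies equations (6.8) through (6.15) and (6.7) to the Table 6.1 data; the function names `basic_quantities` and `fit_line` are illustrative choices, not names used in the guidance.

```python
def basic_quantities(x, y):
    """Return N and the five basic quantities of Box 6.6."""
    n = len(x)
    s_x = sum(x)                                                   # (6.8)
    s_y = sum(y)                                                   # (6.9)
    s_xx = sum(xi * xi for xi in x) - s_x ** 2 / n                 # (6.10)
    s_yy = sum(yi * yi for yi in y) - s_y ** 2 / n                 # (6.11)
    s_yx = sum(xi * yi for xi, yi in zip(x, y)) - s_x * s_y / n    # (6.12)
    return n, s_x, s_y, s_xx, s_yy, s_yx


def fit_line(x, y):
    """Return (b0, b1, SSE, MSE) from equations (6.13)-(6.15) and (6.7)."""
    n, s_x, s_y, s_xx, s_yy, s_yx = basic_quantities(x, y)
    b1 = s_yx / s_xx
    b0 = s_y / n - b1 * s_x / n
    sse = s_yy - s_yx ** 2 / s_xx
    mse = sse / (n - 2)
    return b0, b1, sse, mse


# Table 6.1: months 1..15 and contamination in ppm.
months = list(range(1, 16))
ppm = [10.6, 10.4, 9.5, 9.6, 10.0, 9.5, 8.9, 9.5, 9.6, 9.4,
       8.75, 7.8, 7.6, 8.25, 8.0]

n, s_x, s_y, s_xx, s_yy, s_yx = basic_quantities(months, ppm)
b0, b1, sse, mse = fit_line(months, ppm)
print(f"Sx={s_x}, Sy={s_y:.1f}, Sxx={s_xx:.0f}, Syy={s_yy:.3f}, Syx={s_yx:.2f}")
print(f"b1={b1:.4f}, b0={b0:.2f}, MSE={mse:.4f}")
```

Carrying full precision through the calculation and rounding only for display avoids small discrepancies that arise when rounded intermediate values are reused.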
Figure 6.3    Plot of data from Table 6.1

              [Scatter plot of the Table 6.1 contamination measurements against Month.]

Figure 6.4    Plot of data and predicted values from Table 6.1

              [Scatter plot of the Table 6.1 contamination measurements against Month,
              with the fitted regression line.]

6.1.3        Assessing the Fit of the Model


              It is important to note that the computational procedures given in Section
6.1.2 can always be applied to a set of data, regardless of whether the assumed model is
true. That is, it is always possible to fit a line (or curve) to a set of data. Whether the fitted
model provides an adequate description of the observed pattern of data is a question that
must be answered through examination of the "residuals." The residuals are the differences
between the observed and predicted values for the dependent variable (see Box 6.4). If the
model does not provide an adequate description of the data, examination of the residuals
can provide clues on how to modify the model.


             In a regression analysis, a residual is the difference between the observed
concentration measurement, $y_i$, and the corresponding fitted (predicted) value, $\hat{y}_i$
(Box 6.3). Recall that $\hat{y}_i = b_0 + b_1 x_i$, where $b_0$ and $b_1$ are the least squares estimates
given by equations (6.3) and (6.2), respectively.


             Since the residuals, $e_i$, estimate the underlying errors, $\epsilon_i$, the patterns
exhibited by the residuals should be consistent with the assumptions given in Section 6.1.1
if the fitted model is correct. This means that the residuals should be randomly and
approximately normally distributed around zero, independent, and have constant variance.
Some graphical checks of these assumptions are indicated below. An example of an
analysis of residuals is presented in Box 6.17.


              1.     To check for model fit, plot the residuals against the time index or
                     the time variable, $x_i$. The appearance of cyclical or curvilinear
                     patterns (see Figure 6.5, plots b and c) indicates lack of fit or
                     inadequacy of the model (see Section 6.2.1 for a discussion of
                     corrective measures).

              2.     To check for constancy of variance, examine the plot of the residuals
                     against $x_i$ and the plot of the residuals against the predicted value,
                     $\hat{y}_i$. For both plots, the residuals should be confined within a
                     horizontal band such as illustrated in Figure 6.5a. If the variability
                     in the residuals increases such as in Figure 6.5d, the assumption of
                     constant variance is violated (see Section 6.2.4 for a discussion of
                     corrective measures in the presence of nonconstant variances).
Figure 6.5    Examples of Residual Plots (source: adapted from figures in Draper and
              Smith, 1966, page 89)

              [Four panels of residuals plotted against time (or transformed time variable):
              a. Residuals indicate a good fit;
              b. Model does not adequately describe the pattern in the data;
              c. Model does not adequately describe the pattern in the data;
              d. Variance is not constant.]

              3.     To check for normality of the residuals, plot the ordered residuals
                     (from smallest to largest) against their expected values under
                     normality, $EV_i$, using the procedures of Section 5.7.2. Note that in
                     this case, the formula for computing $EV_i$ is given by equation (5.24)
                     with the sample standard deviation replaced by $\sqrt{MSE}$.

              4.     To test for independence of the error terms, compute the serial
                     correlation of the residuals and perform the Durbin-Watson test (or
                     the approximate large-sample test) described in Section 5.6.
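
The numerical part of these checks can be sketched as follows for the Table 6.1 data. The lag-1 serial correlation and Durbin-Watson statistic below use common textbook definitions, which are assumed (not confirmed) to match those of Section 5.6; the function names are illustrative.

```python
def fit_line(x, y):
    """Least squares intercept and slope, equations (6.14) and (6.13)."""
    n = len(x)
    s_x, s_y = sum(x), sum(y)
    s_xx = sum(v * v for v in x) - s_x ** 2 / n
    s_yx = sum(a * b for a, b in zip(x, y)) - s_x * s_y / n
    b1 = s_yx / s_xx
    b0 = s_y / n - b1 * s_x / n
    return b0, b1


def residuals(x, y):
    """Residuals e_i = y_i - (b0 + b1 * x_i), equation (6.5)."""
    b0, b1 = fit_line(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def lag1_serial_correlation(e):
    """Assumed definition: sum(e_t * e_{t-1}) / sum(e_t^2)."""
    num = sum(e[t] * e[t - 1] for t in range(1, len(e)))
    return num / sum(v * v for v in e)


def durbin_watson(e):
    """Assumed definition: sum((e_t - e_{t-1})^2) / sum(e_t^2); lies in [0, 4]."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(v * v for v in e)


months = list(range(1, 16))
ppm = [10.6, 10.4, 9.5, 9.6, 10.0, 9.5, 8.9, 9.5, 9.6, 9.4,
       8.75, 7.8, 7.6, 8.25, 8.0]

e = residuals(months, ppm)
r1 = lag1_serial_correlation(e)
dw = durbin_watson(e)
print([round(v, 3) for v in e])
print(f"lag-1 serial correlation = {r1:.3f}, Durbin-Watson = {dw:.3f}")
```

Values of the Durbin-Watson statistic near 2 are consistent with independent errors; the formal decision still requires the tabulated critical values referred to in Section 5.6.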

              It may happen that one or more of the  underlying assumptions  for linear
regression is violated. Corrective measures are discussed in Section 6.2. Figure 6.6
shows the residuals for the analysis discussed in Box 6.8. These residuals can be
compared to the examples in Figure 6.5.
Figure 6.6    Plot of residuals for the data from Table 6.1

              [Plot of residuals (vertical axis from -0.8 to 0.8) against Month.]

6.1.4        Inference in Regression

             As mentioned earlier, two important goals of a regression analysis on
ground water remediation are the determination of significant trends in the concentration
measurements and the prediction of future concentration levels. Assuming that the
hypothesized model is correct, the mean square error (MSE) defined by equation (6.7)
plays an important role in making inferences from regression models. The MSE is an
estimate of that portion of the variance of the concentration measurements that is not
explained by the model. It provides information about the precision of the estimated
regression coefficients and predicted values, as well as the overall fit of the model.


              6.1.4.1      Calculating the Coefficient  of Determination

              The coefficient of determination, denoted by $R^2$, is a descriptive
statistic that provides a measure of the overall fit of the model and is defined in Box 6.9.
                                      Box 6.9
                             Coefficient of Determination

$$ R^2 = 1 - \frac{SSE}{S_{yy}} \tag{6.16} $$

       where SSE is given by equation (6.6) and $S_{yy}$ is given by equation (6.11).
              $R^2$ is always a number between 0 and 1 and can be interpreted as the
proportion of the total variance in the $y_i$'s that is accounted for by the regression model.
If $R^2$ is close to 1, then the regression model provides a much better prediction of
individual observations than does the mean of the observations. If $R^2$ is close to 0, then
using the regression equation to predict future observations is not much better than using
the mean of the $y_i$'s to predict future observations. A perfect fit (i.e., when all of the
observed data points fall on the fitted regression line) would be indicated by an $R^2$ equal
to 1. In practice, a value of $R^2$ of 0.6 or greater is usually considered to be high and thus
an indicator that the model can be reasonably used for predicting future observations;
however, it is not a guarantee. A plot of the predicted values from the model and the
corresponding observed values should be examined to assess the usefulness of the model.
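
As a small numerical illustration of equation (6.16), the sketch below computes $R^2$ for the Table 6.1 data and for a set of points lying exactly on a line. The function name `r_squared` is an illustrative choice.

```python
def r_squared(x, y):
    """Coefficient of determination, R^2 = 1 - SSE / Syy (equation 6.16)."""
    n = len(x)
    s_x, s_y = sum(x), sum(y)
    s_xx = sum(v * v for v in x) - s_x ** 2 / n
    s_yy = sum(v * v for v in y) - s_y ** 2 / n
    s_yx = sum(a * b for a, b in zip(x, y)) - s_x * s_y / n
    sse = s_yy - s_yx ** 2 / s_xx      # equation (6.15)
    return 1.0 - sse / s_yy


# Table 6.1 data: months 1..15 and contamination in ppm.
months = list(range(1, 16))
ppm = [10.6, 10.4, 9.5, 9.6, 10.0, 9.5, 8.9, 9.5, 9.6, 9.4,
       8.75, 7.8, 7.6, 8.25, 8.0]

r2_table = r_squared(months, ppm)

# Points that fall exactly on a straight line give a perfect fit.
r2_perfect = r_squared([1, 2, 3], [2, 4, 6])

print(f"R^2 for Table 6.1 data: {r2_table:.2f}")
print(f"R^2 for exact line:     {r2_perfect:.2f}")
```

Because $S_{yy}$ is the sum of squared errors obtained when every observation is predicted by the mean, $R^2$ compares the regression line directly against that simpler predictor.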

              Figure 6.7 shows the $R^2$ values for several hypothetical data sets. Notice
that the data in the middle of the chart (represented by the symbol "x") exhibit a pronounced
downward linear trend, and this is reflected in a high $R^2$ of .93. On the other hand, the set
of data in the top of the chart (represented by "diamonds") exhibits no trend in
concentrations, and this is reflected in a low $R^2$ of .02. Finally, we note that the $R^2$ for
the set of data at the bottom of the chart is fairly low (about 0.5), even though there appears
to be a fairly strong (nonlinear) trend. This is because $R^2$ measures the linear trend over
time (months). For these data, the trend in the concentrations is not linear, thus the
corresponding $R^2$ is fairly low. If the time axis were transformed to the reciprocal of time,
the resulting $R^2$ for the third data set would be close to 0.90.
Figure 6.7    Examples of R-Square for Selected Data Sets

              [Scatter plot of three hypothetical data sets against Time (Month), 0 to 30;
              the legend lists R-Square = .93 and R-Square = .02.]
              While $R^2$ is a useful indicator of the fit of a model and the usefulness of the
model for predicting individual observations, it is not definitive. If the model is used to
predict the mean concentration rather than an individual observation or if the trend in the
concentrations is of interest, other measures of the model fit are more useful. These are
addressed in the following sections.

               6.1.4.2      Calculating the Standard Error of the Estimated Slope

               In a simple linear regression, the slope of the fitted regression line gives the
 magnitude and direction of the underlying trend (if any). Because different sets of samples
 would provide different estimates of the slope, the estimated slope given by equation (6.2)
 is subject to sampling variability. Even if the form of the assumed model (6.1) were
 known to be true, it would still not be possible to determine the slope of the true
 relationship exactly. However, it is possible to estimate, with a specified degree of
 confidence, a range within which the true slope is expected to fall.

               The standard error of $b_1$ provides a measure of the variability of the
 estimated slope. It is denoted by $s(b_1)$ and is defined in Box 6.10.
                                     Box 6.10
                  Calculating the Standard Error of the Estimated Slope

$$ s(b_1) = \sqrt{\frac{MSE}{S_{xx}}} \tag{6.17} $$

              The standard error can be used to construct a confidence interval around the
 true slope of the regression line. The formula for a $100(1-\alpha)$ percent confidence interval
 is given by equation (6.18) in Box 6.11.
                                     Box 6.11
                   Calculating a Confidence Interval Around the Slope

$$ b_1 \pm t_{1-\alpha/2;\,N-2}\; s(b_1) \tag{6.18} $$

        where $t_{1-\alpha/2;\,N-2}$ is the upper $1 - \alpha/2$ percentage point of a t distribution
        with N-2 degrees of freedom (see Appendix Table A.1).
              The confidence interval provides a measure of reliability for the estimated
 value $b_1$. The narrower the interval, the greater is the precision of the estimate $b_1$.
 Because the confidence interval provides a range of likely values of $\beta_1$ when the model
 holds, it can be used to test hypotheses concerning the significance of the observed trend.


              6.1.4.3      Decision Rule for Identifying Significant Trends

              If the confidence interval given by equation (6.18) contains the value zero,
 there is insufficient evidence (at the $\alpha$ significance level) to conclude that there is a
 trend.

              On  the other hand, if the confidence interval includes  only negative (or  only
 positive) values, we would conclude that there  is a  significant negative (or positive) trend.

              An  example in which the above decision rule is used to identify  a significant
 trend is given in Box 6.12.
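
The decision rule can be sketched in code for the Table 6.1 data. The critical value 2.160 is the tabulated 0.975 point of the t distribution with 13 degrees of freedom, entered by hand as an assumption rather than computed; the function names are illustrative.

```python
import math

# Assumption: 0.975 point of the t distribution with N - 2 = 13 degrees of
# freedom, taken from a standard t table.
T_975_13 = 2.160


def slope_confidence_interval(x, y, t_crit):
    """Return (b1, s(b1), (lower, upper)) using equations (6.13), (6.17), (6.18)."""
    n = len(x)
    s_x, s_y = sum(x), sum(y)
    s_xx = sum(v * v for v in x) - s_x ** 2 / n
    s_yy = sum(v * v for v in y) - s_y ** 2 / n
    s_yx = sum(a * b for a, b in zip(x, y)) - s_x * s_y / n
    b1 = s_yx / s_xx
    mse = (s_yy - s_yx ** 2 / s_xx) / (n - 2)
    se_b1 = math.sqrt(mse / s_xx)
    return b1, se_b1, (b1 - t_crit * se_b1, b1 + t_crit * se_b1)


def trend_decision(interval):
    """Apply the decision rule of Section 6.1.4.3 to a confidence interval."""
    lower, upper = interval
    if lower > 0:
        return "significant positive trend"
    if upper < 0:
        return "significant negative trend"
    return "no significant trend"


months = list(range(1, 16))
ppm = [10.6, 10.4, 9.5, 9.6, 10.0, 9.5, 8.9, 9.5, 9.6, 9.4,
       8.75, 7.8, 7.6, 8.25, 8.0]

b1, se_b1, interval = slope_confidence_interval(months, ppm, T_975_13)
decision = trend_decision(interval)
print(f"b1 = {b1:.4f}, s(b1) = {se_b1:.5f}")
print(f"95% limits: {interval[0]:.4f} to {interval[1]:.4f} -> {decision}")
```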


              6.1.4.4      Predicting Future Observations

              If the fitted model is appropriate, then an unbiased prediction of the
 concentration level at time h is $\hat{y}_h = b_0 + b_1 x_h$, where $x_h$ is the value of the time
 variable at time h. The standard error of the estimate is given by equation (6.19), and the
 corresponding $100(1 - \alpha)$ percent confidence limits around the predicted value at time h
 are given by formula (6.20) in Box 6.13.

              Note that if the fitted regression model is based on data collected during the
 cleanup period, the confidence limits given by formula (6.20) may  not strictly apply after
 treatment is terminated. Consequently, confidence  limits based on data from the treatment
 period which are used to draw inferences about the post-treatment period should be inter-
 preted with caution. Further discussion of the use of predicted values in ground  water
monitoring studies  is given in Section 6.2.
                                 Box 6.12
     Using the Confidence Interval for the  Slope to  Identify a Significant Trend

           For the data in Table 6.1, the estimated regression line was
     determined to be $\hat{y}_i = b_0 + b_1 x_i = 10.62 - .1823\, x_i$.

            The coefficient of determination for the fitted model is
     $R^2 = 1 - SSE/S_{yy} = 1 - (2.49/11.8) = .79$. That is, 79 percent of the variability in
     the contamination measurements is explained by the regression model, provided
     that the model is correct.

            Using equation (6.17), the standard error of the estimated slope is
     $s(b_1) = \sqrt{MSE/S_{xx}} = \sqrt{.1918/280} = .02617$; and the corresponding 95 percent
     confidence limits for $\beta_1$ are given by -.1823 ± (2.160)(.02617) or -.2388 to
     -.1258. (Note that $\alpha = .05$, $1 - \alpha/2 = .975$, N = 15, and N-2 = 13; thus,
     $t_{1-\alpha/2;\,N-2} = t_{.975;\,13} = 2.160$ from Appendix Table A.1.)

            Since the interval (-.2388, -.1258) does not include zero, we can
     conclude that the observed downward trend is significant at the $\alpha = .05$
     level. That is, we have high confidence that the observed downward trend
     is real and not just due to sample variability.
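                                  Box 6.13
        Calculating the Standard Error and Confidence Intervals for Predicted
                                   Values

$$ s(\hat{y}_h) = \sqrt{MSE\left(1 + \frac{1}{N} + \frac{(x_h - \bar{x})^2}{S_{xx}}\right)} \tag{6.19} $$

$$ \hat{y}_h \pm t_{1-\alpha/2;\,N-2}\; s(\hat{y}_h) \tag{6.20} $$
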

              An example in which the regression model is used to predict future values is
 presented in Box 6.14.
                                     Box 6.14
              Using the Simple Regression Model to Predict Future Values

              Continuing the example in Box 6.12, suppose that the site manager
       is interested in predicting the contaminant concentration for month 16*. The
       predicted concentration level for month 16, assuming that the model holds,
       is

                   $\hat{y}_{16} = b_0 + b_1 x_{16} = 10.62 - .1823(16) = 7.703$.

       The standard error of the predicted value is

                   $s(\hat{y}_{16}) = \sqrt{.1918\left(1 + \frac{1}{15} + \frac{(16 - 8)^2}{280}\right)} = .4984$.

              Therefore, if the model holds, 99 percent confidence limits around
       the predicted value [see formula (6.20)] are given by 7.703 ± 3.012 (.4984)
       or from 6.202 to 9.204. (Here $t_{.995;\,13} = 3.012$ for N-2 = 13 degrees of freedom.)
       * Again, it should be emphasized that whenever a regression model is used to make
         predictions about concentrations outside the range of the sampling period, extreme
         caution should be used in interpreting the results. In particular, the regression results
         should not be used alone, but should be combined with other sources of information
         (see discussion in Section 6.3).
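
The month-16 prediction can be reproduced with the sketch below, which applies equations (6.19) and (6.20) to the Table 6.1 data. The critical value 3.012 is the tabulated 0.995 point of the t distribution with 13 degrees of freedom, entered by hand as an assumption; the function name is illustrative. Small differences from hand calculations are expected because the code does not round the coefficients.

```python
import math

# Assumption: 0.995 point of the t distribution with 13 degrees of freedom,
# taken from a standard t table.
T_995_13 = 3.012


def predict_with_limits(x, y, x_h, t_crit):
    """Return (y_hat, s(y_hat), (lower, upper)) for an individual observation at x_h."""
    n = len(x)
    s_x, s_y = sum(x), sum(y)
    x_bar = s_x / n
    s_xx = sum(v * v for v in x) - s_x ** 2 / n
    s_yy = sum(v * v for v in y) - s_y ** 2 / n
    s_yx = sum(a * b for a, b in zip(x, y)) - s_x * s_y / n
    b1 = s_yx / s_xx
    b0 = s_y / n - b1 * x_bar
    mse = (s_yy - s_yx ** 2 / s_xx) / (n - 2)
    y_hat = b0 + b1 * x_h
    # Equation (6.19): standard error of a predicted individual value.
    se = math.sqrt(mse * (1.0 + 1.0 / n + (x_h - x_bar) ** 2 / s_xx))
    # Equation (6.20): confidence limits around the predicted value.
    return y_hat, se, (y_hat - t_crit * se, y_hat + t_crit * se)


months = list(range(1, 16))
ppm = [10.6, 10.4, 9.5, 9.6, 10.0, 9.5, 8.9, 9.5, 9.6, 9.4,
       8.75, 7.8, 7.6, 8.25, 8.0]

y_hat, se, limits = predict_with_limits(months, ppm, 16, T_995_13)
print(f"predicted = {y_hat:.3f}, standard error = {se:.4f}")
print(f"99% limits: {limits[0]:.3f} to {limits[1]:.3f}")
```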
              6.1.4.5      Predicting Future Mean Concentrations


              If the fitted model is appropriate, then an unbiased prediction of the mean
concentration level at time h is $\hat{y}_h = b_0 + b_1 x_h$, where $x_h$ is the value of the time
variable at time h. Although the predicted mean and the predicted value for an individual
observation are the same, the prediction error of the predicted mean is less than that for an
individual predicted value. The standard error of the predicted mean is given by equation
(6.21), and the corresponding $100(1 - \alpha)$ percent confidence limits around the predicted
mean at time h are given by formula (6.22) in Box 6.15.


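                                   Box 6.15
      Calculating the Standard Error and Confidence Interval for a Predicted Mean

$$ s_{\mathrm{mean}}(\hat{y}_h) = \sqrt{MSE\left(\frac{1}{N} + \frac{(x_h - \bar{x})^2}{S_{xx}}\right)} \tag{6.21} $$

$$ \hat{y}_h \pm t_{1-\alpha/2;\,N-2}\; s_{\mathrm{mean}}(\hat{y}_h) \tag{6.22} $$
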
             Note that if the fitted regression model is based on data collected during the
cleanup period, the confidence limits given by formula (6.22) may not strictly apply after
treatment is terminated. Consequently, confidence limits based on data from the treatment
period which are used to draw inferences about the post-treatment period should be
interpreted with caution. Further discussion of the use of predicted values in ground water
monitoring studies is given in Section 6.2.
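
To illustrate why the predicted mean is more precise than an individual predicted value, the sketch below compares the two standard errors at month 16 for the Table 6.1 data. The mean-response formula used here is the standard textbook expression and is assumed to match equation (6.21); the function name is illustrative.

```python
import math


def prediction_standard_errors(x, y, x_h):
    """Return (MSE, s.e. of predicted mean, s.e. of individual prediction) at x_h."""
    n = len(x)
    s_x, s_y = sum(x), sum(y)
    x_bar = s_x / n
    s_xx = sum(v * v for v in x) - s_x ** 2 / n
    s_yy = sum(v * v for v in y) - s_y ** 2 / n
    s_yx = sum(a * b for a, b in zip(x, y)) - s_x * s_y / n
    mse = (s_yy - s_yx ** 2 / s_xx) / (n - 2)
    leverage = 1.0 / n + (x_h - x_bar) ** 2 / s_xx
    se_mean = math.sqrt(mse * leverage)           # predicted mean (assumed form of 6.21)
    se_indiv = math.sqrt(mse * (1.0 + leverage))  # individual value, equation (6.19)
    return mse, se_mean, se_indiv


months = list(range(1, 16))
ppm = [10.6, 10.4, 9.5, 9.6, 10.0, 9.5, 8.9, 9.5, 9.6, 9.4,
       8.75, 7.8, 7.6, 8.25, 8.0]

mse, se_mean, se_indiv = prediction_standard_errors(months, ppm, 16)
print(f"s.e. of predicted mean:       {se_mean:.4f}")
print(f"s.e. of individual prediction: {se_indiv:.4f}")
```

The two squared standard errors differ by exactly the MSE, which is the extra variability contributed by the random error of a single new observation.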


             6.1.4.6      Example  of a "Nonlinear" Regression

             Applying regression analysis is not always as straightforward as the
examples in Boxes 6.8 and 6.12 indicate. To show some of the possible complexities and
to help fix some of the ideas presented, we will do a regression analysis on the data in
Table 6.2. As shown in Figure 6.8, these data are not linear with respect to time and hence
a transformation of the independent variable was employed. (More information about the
use of transformations is given later in Section 6.2.3.) The analysis is summarized in Box
6.16 and the fitted model is plotted in Figure 6.9.

Table 6.2    Hypothetical concentration measurements for mercury (Hg) in ppm for 20
             ground water samples taken at monthly intervals

       Month         Year     Coded         Concentration     Reciprocal
                              month (i)     (ppm)             of month (x)
       ---------     ----     ---------     -------------     ------------
       January       1986         1            0.401             1.0000
       February      1986         2            0.380             0.5000
       March         1986         3            0.352             0.3333
       April         1986         4            0.343             0.2500
       May           1986         5            0.354             0.2000
       June          1986         6            0.350             0.1667
       July          1986         7            0.343             0.1429
       August        1986         8            0.333             0.1250
       September     1986         9            0.325             0.1111
       October       1986        10            0.325             0.1000
       November      1986        11            0.327             0.0909
       December      1986        12            0.329             0.0833
       January       1987        13            0.324             0.0769
       February      1987        14            0.325             0.0714
       March         1987        15            0.319             0.0667
       April         1987        16            0.323             0.0625
       May           1987        17            0.316             0.0588
       June          1987        18            0.318             0.0556
       July          1987        19            0.321             0.0526
       August        1987        20            0.331             0.0500

Figure 6.8    Plot of Mercury Measurements as a Function of Time (See Box 6.16)

              [Scatter plot of the Table 6.2 mercury measurements (vertical axis from 0.30
              to 0.42) against Time (Month).]
                                   Box 6.16
                     Example of Basic Regression Calculations

       Table 6.2 shows mercury concentrations for 20 ground water samples taken
       from January 1986 to August 1987. A plot of the concentration
       measurements as a function of time is shown in Figure 6.8. Because the data
       exhibited a nonlinear trend, it was decided to consider the model
       $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, where $x_i = 1/i$. The values of the reciprocals of time
       are shown in the last column of the table.

       For these data, the following quantities were calculated: $S_x = 3.598$;
       $S_y = 6.739$; $S_{xx} = .949$; $S_{yy} = .00909$; $S_{yx} = .0866$; $\bar{y} = .337$;
       $\bar{x} = .180$.

       The estimated regression coefficients were then calculated as:
       $b_1 = .0866/.949 = .0913$; and $b_0 = .337 - (.0913)(.180) = .321$. The fitted
       model is therefore

                         $\hat{y}_i = b_0 + b_1 x_i = .321 + \frac{.0913}{i}$

       and the associated mean square error is

                   $MSE = \frac{SSE}{N - 2} = \frac{.00909 - \frac{(.0866)^2}{.949}}{18} = .000066$.

       Figure 6.9 shows a plot of the fitted model against the observed
       concentration values.
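
The Box 6.16 calculations can be checked with the sketch below, which fits the mercury data of Table 6.2 using the reciprocal of the coded month as the independent variable. The function name `fit_line` is an illustrative choice.

```python
def fit_line(x, y):
    """Return (b0, b1, MSE) from equations (6.13)-(6.15) and (6.7)."""
    n = len(x)
    s_x, s_y = sum(x), sum(y)
    s_xx = sum(v * v for v in x) - s_x ** 2 / n
    s_yy = sum(v * v for v in y) - s_y ** 2 / n
    s_yx = sum(a * b for a, b in zip(x, y)) - s_x * s_y / n
    b1 = s_yx / s_xx
    b0 = s_y / n - b1 * s_x / n
    mse = (s_yy - s_yx ** 2 / s_xx) / (n - 2)
    return b0, b1, mse


# Table 6.2: mercury concentrations (ppm) for coded months 1..20.
mercury = [0.401, 0.380, 0.352, 0.343, 0.354, 0.350, 0.343, 0.333, 0.325, 0.325,
           0.327, 0.329, 0.324, 0.325, 0.319, 0.323, 0.316, 0.318, 0.321, 0.331]
coded_month = list(range(1, 21))

# Transformed time variable x_i = 1/i.
x = [1.0 / i for i in coded_month]

b0, b1, mse = fit_line(x, mercury)
print(f"b0 = {b0:.3f}, b1 = {b1:.4f}, MSE = {mse:.6f}")

# Fitted values under the model y = b0 + b1 / i.
fitted = [b0 + b1 / i for i in coded_month]
```

Only the independent variable is transformed; the model remains linear in the coefficients, so the simple linear regression formulas still apply.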
Figure 6.9    Comparison of Observed Mercury Measurements and Predicted Values
              under the Fitted Model (See Box 6.16)

              [Plot of the observed mercury measurements and the fitted model,
              y = .321 + 0.0913/i, against Time (Month).]
                                Box 6.17
                  Analysis of Residuals for Mercury Example

    Figure 6.10 shows a plot of the residuals for the mercury data in Table 6.2
    based on the fitted model, $\hat{y}_i = .321 + .0913/i$ (see Box 6.16).  The
    residual plot indicates some lack of fit of the model.  In particular, it appears
    that the fitted model tends to underestimate concentrations at the earlier times
    while overestimating concentrations at the later times.  (Since the residuals
    represent the differences between the actual and predicted values, the
    positive values of the residuals in the earlier months indicate that the actual
    values tend to be larger than the predicted values then.  Hence, the model
    underestimates the earlier concentrations.)

    To see whether the fit could be improved by using a different transformation
    of i, the following alternative model was considered:
    $y_i = \beta_0 + \beta_1/\sqrt{i} + \epsilon_i$.
    For this model, the estimated regression coefficients are $b_0 = .2957$ and
    $b_1 = .1087$, and the coefficient of determination is $R^2 = .927$ (compared to .89
    for the earlier model).  This indicates a somewhat better fit when $1/\sqrt{i}$ is
    used as the independent variable (see Figure 6.11).  The residual plot under
    the new model (see Figure 6.12) seems to support this conclusion.
    Moreover, the standard error of $b_1$ is $s(b_1) = .0072$, and hence 95 percent
    confidence limits around the true slope are given by .1087 ±
    (2.101)(.0072), or .094 to .124.  Since the interval does not include zero,
    we further conclude that the trend is significant.

    Finally, Figure 6.13 shows a normal probability plot of the ordered
    residuals based on the revised model, where the expected values, $EV_i$, were
    computed using formula (5.24) with the standard deviation estimated by
    $\sqrt{MSE}$.  There is a nonlinear pattern in the residuals, which suggests that
    the normality assumption may not be appropriate for this model.  If a formal
    test indicates that the lack of normality is significant, nonlinear regression
    procedures should be considered.
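
    The confidence limits for the slope quoted in this box can be reproduced with
    the short sketch below.  The function name is illustrative, and the t value
    (2.101) is simply the tabulated value used in the box rather than one computed
    by the code.

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Return (lower, upper) confidence limits: b1 -/+ t_crit * se_b1."""
    half_width = t_crit * se_b1
    return b1 - half_width, b1 + half_width


# Values quoted in Box 6.17 for the model with x_i = 1/sqrt(i).
lower, upper = slope_confidence_interval(b1=0.1087, se_b1=0.0072, t_crit=2.101)

# The trend is judged significant when the interval excludes zero.
trend_is_significant = not (lower <= 0.0 <= upper)

print(f"{lower:.3f} to {upper:.3f}; significant: {trend_is_significant}")
```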

Figure 6.10   Plot of Residuals Against Time for Mercury Example (see Box 6.17)
             [Plot not reproduced.  Vertical axis: residuals, -0.02 to 0.02;
             horizontal axis: Time (Month), 0 to 25.]
Figure 6.11   Plot of Mercury Concentrations Against $x = 1/\sqrt{i}$, and Alternative Fitted
            Model (see Box 6.17)
            [Plot not reproduced.  Horizontal axis: X (reciprocal of square root of time),
            0.2 to 1.0.]

Figure 6.12   Plot of Residuals Based on Alternative Model (see Box 6.17)
            [Plot not reproduced.  Vertical axis: residuals, -0.01 to 0.02;
            horizontal axis: Time (Month), 0 to 25.]
Figure 6.13   Plot of Ordered Residuals Versus Expected Values for Alternative Model
             (see Box 6.17)
             [Plot not reproduced.  Vertical axis: ordered residuals, -0.01 to 0.02;
             horizontal axis: Expected value, -0.015 to 0.015.]


              To summarize, if the data are originally linear (such as the data in Table
 6.1), then we may fit the simple linear regression model of Box 6.1.  If the data are more
 complex (e.g., the data in Table 6.2), then a transformation may be used, as was done in
 Box 6.16.  One can transform either the independent (i.e., the explanatory) variable or the
 dependent (i.e., the outcome) variable, or both.  Finding the appropriate transformation is
 as much an art as it is a science.  Consultation with a statistician is recommended in order to
 help identify useful transformations and to help interpret the model based on the
 transformed data.
6.2          Using Regression to Model the  Progress of Ground Water
              Remediation
              As samples are collected and analyzed during the cleanup period, trends or
other patterns in the concentration levels may become evident.  As illustrated in
Figure 6.14, a variety of patterns are possible.  In Situation 1, regression might be used to
determine the slope for observations beyond time 20 to infer whether the treatment is effective.  If
not, a decision might be made to consider a different remedial program.  For Situation 2,
the concentration measurements have decreased below the cleanup standard, and regression
might be used to investigate whether the concentrations can be expected to stay below the
cleanup standard.  For Situation 3 in Figure 6.14, which could arise from factors such as
interruptions or changes in the treatment technology or fluctuating environmental
conditions, regression can be used to assess trends.  However, due to the highly erratic nature of
the data, any predictions of trends or of future concentrations are likely to be very inaccurate.
Additional data collection will be necessary before conclusions can be reached.  Where
appropriate, regression analysis can be useful in estimating and assessing the significance
of observed trends and in predicting expected levels of contaminant concentrations at future
points in time.

              Figure 6.15 summarizes the steps for implementing a simple linear regres-
sion analysis at Superfund sites. These steps are described in detail in the sections that
follow.

Figure 6.14   Examples  of  Contaminant Concentrations that Could Be Observed During
              Cleanup
              [Figure not reproduced.  Three panels plot concentration against time, with
              the start of cleanup at time 0 and the cleanup standard, Cs, marked:
                 Situation 1: Asymptotic situation; cleanup standard potentially unattainable.
                 Situation 2: Concentrations decrease to below the cleanup standard.
                 Situation 3: Highly variable.]


Figure 6.15   Steps for Implementing Regression  Analysis at Superfund Sites
              [Flowchart not reproduced.  Boxes, in reading order:
                 Choose a linear or nonlinear regression (Section 6.2.1)
                 For nonlinear regression: consult a statistician about nonlinear models
                 Estimate model parameters and calculate residuals (Section 6.2.3)
                 Assess fit of model (Section 6.2.3)
                 Is there a good fit to the data? (Section 6.2.3)
                 Is variance of residuals constant? (Section 6.2.3)
                 Transform variables or use weighted regression (Section 6.2.4)
                 Are the errors independent? (Section 6.2.5)
                 Remove autocorrelation from residuals (Section 6.2.5)
                 Test for significant trend and set confidence limits around predicted
                 values (Section 6.1.4)
                 Combine regression results with other inputs (Section 6.3)
                 End]

 6.2.1        Choosing a Linear or Nonlinear Regression

              The first step in a regression analysis  is to decide whether a linear or nonlin-
 ear model is appropriate.  An initial choice can often be made by observing a plot of the
 sample data over time. For example, for the data of Figure 6.2, the relationship between
 concentration measurements and time is apparently linear. In this case, the regression
 model (6.1) with $x_i = i$ would be appropriate.  However, for the data displayed in Figure
 6.16, some sort of nonlinear model would be appropriate.

              Sometimes it is possible to model a nonlinear relationship such as that
 shown in Figure 6.16 with linear regression techniques by transforming either the depen-
 dent or independent variable.  In some cases, theoretical considerations of ground water
 flows and the type of treatment applied may lead to the formulation of a particular nonlinear
 model such as "exponential decay." This, in turn, may lead to consideration of a particular
 type of transformation (e.g., logarithmic or inverse transformations). However, these a
 priori considerations do not preclude testing the model for adequacy of fit.  Choosing the
 appropriate transformation may require the assistance of a statistician; however, if the
 (nonlinear) relationship is  not too complicated,  some relatively  simple transformations may
 be sufficient to "linearize" the model, and the procedures given in Section 6.1  may be used.
 On the other hand, after analysis of the residuals (as described below in Section 6.2.3), if
 none of the  given transformations appears to be adequate, nonlinear regression methods
 should be used (see  Draper and Smith, 1966; Neter, Wasserman,  and Kutner, 1985). A
 statistician should  be  consulted about these methods.

              Figure 6.17 shows examples of two general types of curves that might
reasonably approximate the relationship between observed contaminant levels and time. If
a plot of the concentration measurements versus time exhibits  one of these patterns, the
transformations listed below in Box 6.18 may be helpful in making the model linear. Since
the initial choice of transformation may  not provide a "good" fit,  the process of determining
the appropriate transformation may require several iterations. The procedures described in
 Section 6.2.3 can be used to assess the fit of a particular model.  Box 6.18 contains some
suggested transformations for the two types of curves shown in Figure 6.17 (source:
Neter, Wasserman, and Kutner, 1985).
1 Although a model such as $y = \beta_0 + \beta_1 (1/x)$ is a nonlinear equation, it is called a linear regression model
  because the coefficients, $\beta_0$ and $\beta_1$, occur in a linear form (as opposed to, say, $y = \beta_0 + \ldots$).
Figure 6.16   Example of a Nonlinear Relationship Between Chemical Concentration
            Measurements and Time
            [Plot not reproduced.  Vertical axis: concentration, 2 to 8;
            horizontal axis: Time (Month).]
Figure 6.17   Examples of Nonlinear Relationships
              [Plots not reproduced.  Two panels, labeled Type A and Type B, show
              concentration against Time.]

                                 Box 6.18
                          Suggested Transformations

    Type A: Contaminant concentrations following this pattern decrease
    slowly at first and then more rapidly later on.  A useful transformation to
    consider is
                                   $$x_i = i^p$$

    where p is a constant greater than 1.  If the decline in concentrations is very
    steep, set p = 2 initially, and then try alternative values, if necessary, to
    obtain a good fit.

    Type B: Contaminant concentrations following this pattern decrease
    rapidly at first and then more slowly later on.  Useful transformations to
    consider in this case include
                                  $$x_i = 1/\sqrt{i}$$
    (the transformation used in Box 6.17; Box 6.16 uses the reciprocal, $x_i = 1/i$).

    Alternatively, one can also consider transforming $y_i$; e.g., use the
    transformed variable
                                 $$y_i' = 1/y_i$$

    either in lieu of or together with the transformed time variable, whichever
    appears to be appropriate.

    There is no guarantee that using transformations will help; their effectiveness
    must be determined by checking the fit of the model and examining the
    residuals.  Consultation with a statistician is recommended to help identify
    useful transformations and to interpret the model based on the transformed
    measurements.
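
    One way to compare candidate transformations of the time variable is to fit
    each one and compare the coefficients of determination.  The sketch below does
    this for a hypothetical, noise-free Type B series; the data, the helper
    `fit_line`, and the list of candidates are illustrative assumptions and are not
    taken from this guidance.

```python
import math


def fit_line(x, y):
    """Least squares fit of y = b0 + b1*x; returns (b0, b1, R^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    r2 = s_xy ** 2 / (s_xx * s_yy)
    return b0, b1, r2


# Hypothetical Type B data: a rapid early decline that levels off.
times = list(range(1, 21))
conc = [2.0 + 6.0 / math.sqrt(i) for i in times]

# Candidate transformations of the time index i.
candidates = {
    "i (untransformed)": lambda i: float(i),
    "1/i": lambda i: 1.0 / i,
    "1/sqrt(i)": lambda i: 1.0 / math.sqrt(i),
    "i**2 (Type A, p = 2)": lambda i: float(i) ** 2,
}

r_squared = {}
for name, transform in candidates.items():
    x = [transform(i) for i in times]
    r_squared[name] = fit_line(x, conc)[2]
    print(f"x = {name:22s} R^2 = {r_squared[name]:.3f}")
```

    A higher R^2 alone does not establish that a model is adequate; the residual
    checks described in this chapter still apply.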

 6.2.2        Fitting the Model

        In a regression analysis, the process of "fitting the model" refers to the process of
 estimating the regression parameters and associated sampling errors from the observed
 data.  With these estimates, it is then possible  to (1) determine whether the model provides
 an adequate description of the observed chemical measurements; (2) test whether there is a
 significant trend in the chemical measurements over time; and (3) obtain estimates of
 concentration levels  at future points in time.

               Given a set of concentration measurements, $y_i$, i = 1, 2, ..., N, and
 corresponding time values, $x_i$, the estimated slope and intercept of the fitted regression line can
 be computed from the equations in Section 6.1.2.  For the fitted model, the error sum of
 squares, SSE, and the coefficient of determination should also be computed.
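
               As an illustration, the sketch below fits a simple linear model to the
 benzene data of Table 6.3, using the same sums of squares and cross products that
 appear in Box 6.16.  The variable names are illustrative; the expressions for the
 coefficient of determination and the standard error of the slope are the usual
 textbook forms and are assumed here to match those of Sections 6.1.2 and 6.1.4.

```python
import math

# Benzene concentrations (ppb) for coded quarters i = 1..15, from Table 6.3.
y = [30.02, 29.32, 28.12, 28.32, 27.01, 24.78, 24.00, 23.78,
     24.25, 23.24, 21.98, 25.00, 24.10, 23.75, 23.00]
x = [float(i) for i in range(1, len(y) + 1)]
N = len(y)

x_bar = sum(x) / N
y_bar = sum(y) / N
S_xx = sum((xi - x_bar) ** 2 for xi in x)
S_yy = sum((yi - y_bar) ** 2 for yi in y)
S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = S_xy / S_xx                # estimated slope
b0 = y_bar - b1 * x_bar         # estimated intercept
SSE = S_yy - S_xy ** 2 / S_xx   # error sum of squares
R2 = 1.0 - SSE / S_yy           # coefficient of determination (assumed form)
MSE = SSE / (N - 2)             # mean square error
se_b1 = math.sqrt(MSE / S_xx)   # standard error of the slope (assumed form)

print(f"b0 = {b0:.2f}, b1 = {b1:.3f}, R^2 = {R2:.2f}, s(b1) = {se_b1:.3f}")
```

 The results can be compared with the values reported for the same data in
 Box 6.22.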

              Note that the model fitting will, in general,  be an iterative process. If the
 fitted model is inadequate for any of the reasons indicated  below  in Section 6.2.3, it may be
possible to  obtain a  better fitting model by considering transformations of the  data.


 6.2.3       Regression in the  Presence of Nonconstant  Variances

              If the residuals for a fitted model exhibit a pattern such as that shown in
 Figure 6.14d, the assumption of constant variance is violated, and corrective steps must be
 taken.  The two most common corrective measures are: (1) transform the dependent
 variable to stabilize the variance; or (2) perform a "weighted least squares regression"
 (Neter, Wasserman, and Kutner, 1985).

              Transformations of the dependent variable that are useful for stabilizing
 variances are the square root transformation, the logarithmic transformation, and the
 inverse transformation.  Which transformation to use in a particular situation depends on
 the way the variance increases.  To determine this relationship, it is useful to divide the data
 into four or five groups based on the time at which observations were made.  For example,
 the first group might consist of the first four observations, the second group might consist
 of the next four observations, and so on.  For the g-th group, compute the mean of the
 observed concentrations, $\bar{y}_g$, and the standard deviation of the concentrations, $s_g$ (Section
 5.1).  If a plot of $s_g^2$ versus $\bar{y}_g$ is approximately a straight line, use $\sqrt{y_i}$, the square root
 transformation, in the regression analysis; if a plot of $s_g$ versus $\bar{y}_g$ is approximately a
 straight line, use $\log(y_i)$, the logarithmic transformation, in the analysis; and, finally, if a
 plot of $\sqrt{s_g}$ versus $\bar{y}_g$ is approximately a straight line, use $1/y_i$, the inverse transformation, in
 the analysis (Neter, Wasserman, and Kutner, 1985).
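
              The grouping step can be sketched as follows.  The data are hypothetical,
 the function name `group_summaries` is illustrative, and any observations left
 over after the last full group are simply dropped in this sketch.  The code only
 tabulates the quantities to be plotted; judging which plot is closest to a
 straight line is left to the analyst.

```python
import math
import statistics


def group_summaries(y, group_size=4):
    """Split y (in time order) into consecutive groups of equal size.

    Returns one tuple per group: (mean, variance, std dev, sqrt of std dev).
    """
    rows = []
    for start in range(0, len(y) - group_size + 1, group_size):
        group = y[start:start + group_size]
        mean_g = statistics.mean(group)
        s_g = statistics.stdev(group)  # sample standard deviation
        rows.append((mean_g, s_g ** 2, s_g, math.sqrt(s_g)))
    return rows


# Hypothetical concentrations in time order (20 observations, 5 groups of 4).
y = [41.0, 33.0, 47.0, 36.0, 30.0, 25.0, 34.0, 27.0, 20.0, 24.0,
     18.0, 22.0, 14.0, 16.0, 12.0, 15.0, 9.0, 10.0, 8.0, 9.5]

rows = group_summaries(y)
print("   mean     s^2      s  sqrt(s)")
for mean_g, var_g, s_g, root_s_g in rows:
    print(f"{mean_g:7.2f} {var_g:7.2f} {s_g:6.2f} {root_s_g:7.2f}")
```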

              The other major method for dealing with nonconstant variance is weighted
 least squares regression.  Weighted least squares  analysis provides a formal way of
 accommodating nonconstant variance in regression. To apply this method, the form of the
 underlying variance structure must be known or estimated from the data. This method is
 described elsewhere; e.g., Draper and Smith (1966). A statistician should be consulted
 when applying  these  methods.
6.2.4         Correcting for Serial Correlation

              It is sometimes possible to remove the serial correlation in the residuals by
transforming the dependent and independent variables. Applied Linear Statistical Models
by Neter,  Wasserman, and  Kutner (1985), amplifies the  following iterative procedure.
              6.2.4.1      Fitting the Model

              The four steps for fitting the model to remove serial correlations are
discussed below.
       (1) Calculate the serial correlation of the residuals, $\hat{\phi}_{obs}$, using the formula in Box
5.14.

       (2)  For i = 2, 3, . . . , N, transform both the dependent and independent variables
using  equation (6.23) in Box 6.19. Perform an ordinary least squares regression on the
transformed variables. That is, using the procedures of Section 6.1.2, fit the "new" model
given  by equation (6.24).
                                    Box 6.19
                          Transformation to "New" Model

       Transform both the dependent and independent variables using the
       formulas:
           $$y_i' = y_i - \hat{\phi}_{obs}\, y_{i-1} \quad \text{and} \quad x_i' = x_i - \hat{\phi}_{obs}\, x_{i-1}.$$         (6.23)


       Fit the following model using the transformed variables:


                     $$y_i' = \beta_0' + \beta_1' x_i' + \epsilon_i'.$$                  (6.24)

       Note that one observation is lost in the transformed measurements because
       (6.23) cannot be determined for i = 1.
              Denote the least squares estimates of the parameters of the new
(transformed) model by $b_0'$ and $b_1'$, and denote the fitted model for the transformed
variables by equation (6.25) in Box 6.20.
                                    Box 6.20
                   "New" Fitted Model for Transformed Variables
                      $$\hat{y}_i' = b_0' + b_1' x_i'$$                    (6.25)
             Calculate the residuals for the new model: $e_i' = y_i' - (b_0' + b_1' x_i')$.  Note
that the fitted model (6.25) is expressed in terms of the transformed variables and not the
original variables.


       (3) Perform the Durbin-Watson test (or approximate test if the sample size is large)
on the residuals of the model fitted in step (2).  If the test indicates that the serial
correlation is not significant, go to step (4).  Otherwise, terminate the process and consult a
statistician for alternative methods of correcting for serial correlation.
       (4) In terms of the original variables, the slope and the intercept of the fitted
regression line are provided in Box 6.21.
                                    Box 6.21
       Slope and Intercept of Fitted Regression Line in Terms of Original Variables

           $$b_1 = b_1' \quad \text{and} \quad b_0 = \frac{b_0'}{1 - \hat{\phi}_{obs}}$$                (6.26)

       where $\hat{\phi}_{obs}$ is the estimated autocorrelation determined by using the
       residuals obtained from fitting the untransformed data, and $b_0'$ and $b_1'$ are
       least squares estimates obtained from the transformed data.
             The approach given above has the effect of adjusting the estimates of
variance to account for the presence of autocorrelation. Typically, the variance of the
estimated regression coefficients is larger when the errors are correlated, as compared with
uncorrelated errors.  An example of the use of this technique is given in Box 6.22.
             6.2.4.2      Determining Whether the Slope is Significant

             The standard error of the slope of the original model is simply the standard
error of the slope, $b_1'$, obtained from the regression analysis performed on the transformed
data defined in Box 6.21.  The formulas given in Section 6.1.4 can be used to compute the
standard error of $b_1'$.  The decision rule in Section 6.1.4.3 can be used to identify whether
the trend is statistically significant.  Note that for the transformed data, the total number of
observations is N - 1.
                                 Box 6.22
                       Correcting for Serial Correlation

    Table 6.3 shows the concentration of benzene in 15 quarterly ground water
    samples taken from a monitoring well at a former manufacturing site.  It
    appeared from a plot of the data (see Figure 6.18) that a simple linear model
    of the form $y_i = \beta_0 + \beta_1 i + \epsilon_i$ might be appropriate in describing the
    relationship between concentrations and time.

    A regression analysis was performed on the data with the following results:
    (a) the fitted model was estimated to be $\hat{y}_i = 29.20 - .478i$; (b) $R^2 = 0.73$;
    (c) 95 percent confidence limits around the slope of the line were calculated
    to be -0.478 ± (2.16)(.082), or -0.66 to -0.30; and (d) the Durbin-Watson
    statistic was computed to be D = .795.

    For N = 15 and p - 1 = 1 (there are two parameters in the model), the critical
    value for the Durbin-Watson test is $d_U = 1.36$ at the .05 significance level.
    Since D < 1.36, it was concluded that there was a significant autocorrelation.
    Although the calculated confidence interval for the slope of the line
    apparently indicated that the observed downward trend was significant, it
    was recognized that the presence of autocorrelations could lead to erroneous
    conclusions.  Therefore, the data were re-analyzed using the method of
    transformations described earlier in this section.

    First, the serial correlation was computed from the residuals as $\hat{\phi}_{obs} = .57$.
    Then the observed concentrations and time variable were transformed as
    follows: $y_i' = y_i - .57y_{i-1}$ and $x_i' = i - .57(i-1)$.  A regression of $y_i'$ on $x_i'$
    resulted in least squares estimates of $b_1' = -.34$ and $b_0' = 11.89$ for the
    transformed variables, with $s(b_1') = .17$.  Therefore, using equation (6.26),
    estimates of the slope and intercept for the original data were calculated as

    $b_1 = b_1' = -.34$ and $b_0 = \frac{b_0'}{1 - \hat{\phi}_{obs}} = \frac{11.89}{1 - .57} = 27.65$.  Note that the revised
    estimates are close to the original estimates, except that now the standard
    error of $b_1$ is much larger than it was before the effect of the autocorrelations
    was taken into account in the analysis (.17 vs. .082).  Because of this
    increase in variance, 95 percent confidence limits around the true slope are
    now given by -.34 ± (2.179)(.17), or -.71 to .03.  In this case, the interval
    includes zero, and therefore at the five percent significance level, we cannot
    conclude that the observed trend is significant.
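
    The steps of this example can be traced with the sketch below, which uses the
    benzene data of Table 6.3.  Two assumptions are made because the corresponding
    formulas are not reproduced in this section: the serial correlation is taken
    to be the lag-1 estimate with the sum of squared lagged residuals in the
    denominator, and the Durbin-Watson statistic is taken in its usual form.  The
    helper `ols` and all variable names are illustrative.  Because the code carries
    the unrounded serial correlation through the calculation, its results differ
    slightly from the rounded values quoted in the box.

```python
import math


def ols(x, y):
    """Simple linear regression; returns (b0, b1, se_b1, residuals)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in resid) / (n - 2)
    return b0, b1, math.sqrt(mse / s_xx), resid


# Benzene concentrations (ppb) for coded quarters i = 1..15, from Table 6.3.
y = [30.02, 29.32, 28.12, 28.32, 27.01, 24.78, 24.00, 23.78,
     24.25, 23.24, 21.98, 25.00, 24.10, 23.75, 23.00]
x = [float(i) for i in range(1, len(y) + 1)]

# Step 1: fit the original data, then compute the Durbin-Watson statistic
# and the lag-1 serial correlation of the residuals (assumed formulas).
b0, b1, se_b1, e = ols(x, y)
D = (sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
     / sum(ei ** 2 for ei in e))
phi = (sum(e[i] * e[i - 1] for i in range(1, len(e)))
       / sum(ei ** 2 for ei in e[:-1]))

# Step 2: transform both variables (equation 6.23) and refit (equation 6.24).
# The first observation is lost because it has no predecessor.
y_t = [y[i] - phi * y[i - 1] for i in range(1, len(y))]
x_t = [x[i] - phi * x[i - 1] for i in range(1, len(x))]
b0_t, b1_t, se_b1_t, e_t = ols(x_t, y_t)

# Step 4: express the fitted line in the original variables (equation 6.26).
b1_new = b1_t
b0_new = b0_t / (1.0 - phi)

print(f"D = {D:.3f}, phi = {phi:.2f}")
print(f"b1' = {b1_t:.2f}, b0' = {b0_t:.2f}, s(b1') = {se_b1_t:.2f}")
print(f"b1 = {b1_new:.2f}, b0 = {b0_new:.2f}")
```

    Step 3 (the Durbin-Watson test on the residuals of the transformed fit) is
    omitted from the sketch; it would be applied to `e_t` in the same way that D
    is computed from `e`.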

Table 6.3     Benzene concentrations in 15 quarterly samples (see Box 6.22)

   Year     Quarter     Coded quarter (i)     Concentration in ppb (y)
   1985     First               1                     30.02
            Second              2                     29.32
            Third               3                     28.12
            Fourth              4                     28.32
   1986     First               5                     27.01
            Second              6                     24.78
            Third               7                     24.00
            Fourth              8                     23.78
   1987     First               9                     24.25
            Second             10                     23.24
            Third              11                     21.98
            Fourth             12                     25.00
   1988     First              13                     24.10
            Second             14                     23.75
            Third              15                     23.00

Figure 6.18  Plot of Benzene Data and Fitted Model (see Box 6.22)
             [Plot not reproduced.  Vertical axis: concentration, 20 to 32;
             horizontal axis: Coded quarter (i), 0 to 20.
             Fitted model: $\hat{y} = 29.2 - .478i$.]

              6.2.4.3      Calculating the Confidence Interval for a Predicted Value

              The general procedures in Section 6.1.4 can also be used to develop
confidence limits for the predicted concentration at an arbitrary time h (as shown in Box 6.23).
                                    Box 6.23
        Constructing Confidence Limits around an Expected Transformed Value

       Referring to the fitted model (6.25), use equation (6.19) to construct
       confidence limits around the expected transformed value at time h:

           $$\hat{y}_{h,upper}' = \hat{y}_h' + t_{1-\alpha/2,\,N-3}\; s(\hat{y}_h')$$                (6.27)

       and

           $$\hat{y}_{h,lower}' = \hat{y}_h' - t_{1-\alpha/2,\,N-3}\; s(\hat{y}_h')$$                (6.28)

       where $\hat{y}_h' = b_0' + b_1' x_h'$; $x_h'$ is the transformed value of the time variable at time h;
       and $s(\hat{y}_h')$ is the standard error of $\hat{y}_h'$ as computed from equation (6.18)
       using the transformed data.  Note that the "t value" used in the confidence
       interval is based on N-3 (instead of N-2) degrees of freedom because we are
       estimating an additional parameter (the serial correlation) from the data.

       Since the limits given in equations (6.27) and (6.28) are in the transformed
       scale, the upper and lower confidence limits in the original scale are given
       by:
           $$\hat{y}_{h,upper} = \hat{y}_{h,upper}' + \hat{\phi}_{obs}\, y_{h-1}$$                  (6.29)

       and

           $$\hat{y}_{h,lower} = \hat{y}_{h,lower}' + \hat{\phi}_{obs}\, y_{h-1}.$$                  (6.30)
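
       The calculation can be sketched as below.  The sketch assumes that the limits
       in the transformed scale are the predicted transformed value plus or minus
       the t value times its standard error, and that they are returned to the
       original scale by adding the estimated serial correlation times the
       concentration at time h - 1.  The function name and the numerical inputs are
       hypothetical; in practice the standard error and the t value would come from
       the transformed regression and a t table with N - 3 degrees of freedom.

```python
def prediction_limits(y_hat_t, se_t, t_crit, phi_obs, y_prev):
    """Confidence limits at time h, returned in the original scale.

    y_hat_t : predicted transformed value, b0' + b1' * x_h'
    se_t    : standard error of y_hat_t from the transformed regression
    t_crit  : tabulated t value with N - 3 degrees of freedom
    phi_obs : estimated serial correlation of the residuals
    y_prev  : concentration at time h - 1
    """
    lower_t = y_hat_t - t_crit * se_t   # lower limit, transformed scale
    upper_t = y_hat_t + t_crit * se_t   # upper limit, transformed scale
    # Undo the transformation y' = y_h - phi_obs * y_{h-1}.
    return lower_t + phi_obs * y_prev, upper_t + phi_obs * y_prev


# Hypothetical inputs, for illustration only.
lower, upper = prediction_limits(y_hat_t=9.0, se_t=0.5, t_crit=2.179,
                                 phi_obs=0.57, y_prev=23.0)
print(f"{lower:.2f} to {upper:.2f}")
```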
6.3          Combining Statistical Information with Other Inputs to the
             Decision Process
             The statistical techniques presented in this chapter can be used to (1)
determine whether contaminant concentrations are decreasing over time, and/or (2) predict future
concentrations if present trends continue.  Other factors must be used in combination with
these statistical results to decide whether the remedial effort has been successful, and when
treatment should be terminated.  Several factors to consider are:



                     Expert knowledge of the ground water at this site and experience
                     with other remedial efforts at similar sites;

                     The results of mathematical models of ground water flow and
                     chemistry with sensitivity analysis and assessment of the accuracy
                     of the modeling results; and

                     Cost and scheduling considerations.


              The sources of information above can be used to answer the following

questions:

                     How long will it take for the ground water system to reach steady
                     state before the sampling  for the attainment decision can begin?

                     What is the chance that the ground water concentrations will
                     substantially exceed the cleanup standard before the ground water
                     reaches  steady state?

                     What are the chances that the final assessment will conclude that the
                     site attains the cleanup standard?

                     What are the costs of (1) continuing treatment, (2) performing the
                     assessment, and (3) planning for  and initiating additional treatment  if
                     it is  decided that  the site does not attain the cleanup standard?


              The answers to these questions should be developed in consultation with both
statistical and ground water experts, managers of the remediation effort, and the regulatory
agencies.
6.4          Summary


              This chapter discussed the use of regression methods for helping to decide
when to stop treatment.  In particular, procedures were given for estimating the trend in
contamination levels and predicting contamination levels at future points in time.  General
methods for fitting simple linear models and assessing the adequacy of the model were also
discussed.


              In deciding when to terminate treatment, the chapter emphasized that:


                    Interpreting the data is usually a multiple-step process of refining the
                    model and understanding the data;

                    Models are a useful but imperfect description of the data.  The
                    usefulness of a model can be evaluated by examining how well the
                    assumptions fit the data, including an analysis of the residuals;

                    Correlation between observations collected over time can be
                    important and must be considered in the model;

                    Changes in treatment over time can result in changes in variation
                    and correlation, and can produce anomalous behavior which must be
                    understood to draw correct conclusions from the data; and

                    Consultation with a ground water expert is advisable to help
                    interpret the results and to decide when to terminate treatment.
             Deciding when to terminate treatment should be based on a combination of
statistical results, expert knowledge, and policy decisions.  Note that regression is only one
of various statistical methods that may be used to decide when treatment should be
terminated.  Regression analysis was discussed in this document because of its relative
simplicity and wide range of applicability; however, this does not constitute an endorsement of
regression as a method of choice.
   7.  ISSUES TO BE  CONSIDERED  BEFORE  STARTING
                      ATTAINMENT  SAMPLING
             After terminating treatment and before collecting water samples to assess
attainment, a period of time must pass to ensure that any transient effects of treatment on
the ground water system have sufficiently decayed. This period is represented by the
unshaded portion in the figure below.  This chapter discusses considerations for deciding
when the sampling for the attainment decision can begin and provides statistical tests,
which can be easily applied, to guide this decision. The decision on whether the ground
water has reached steady state is based on a combination of statistical calculations, ground
water modeling,  and expert advice from hydrogeologists familiar with the site.
Figure 7.1    Example Scenario for Contaminant Measurements During Successful Remedial Action
             [Figure not reproduced.  Vertical axis: Measured Ground Water Concentration;
             horizontal axis: Date.  The start of treatment is marked.]
The degree to which remediation efforts affect the ground water system at a site is difficult
to determine and depends on the physical conditions of the site and the treatment technolo-
gies used. As previously discussed, the ground water can only be judged to attain the
cleanup standard if both present and future contaminant concentrations are acceptable.
Changes in the ground water system due to treatment will affect the contaminant  concentra-
tions in the sampling wells. For example, while remediation is in progress pumping can
alter water levels,  water flow, and thus the level of contamination being measured at
monitoring wells. To  adequately determine whether the cleanup standard has been attained,
the ground water conditions for sampling must approximate the expected conditions in the
future. Consequently, it is important to establish when the residual effects of the treatment
process (or any other temporary intervention) on the ground water appear to be negligible.

When this point is reached, sampling to assess attainment can be started and inferences on

attainment can be drawn. We will  define the state of the ground water when temporary

influences no longer affect it as a "steady state." "Steady state," although sometimes

defined in the precise technical sense, is used here in a less formal manner as indicated in

Section 7.1.
7.1           The Notion of "Steady State"
              The notion of "steady state" may be characterized by the following
components:



              1.a.   After treatment, the water levels and water flow, and the
                    corresponding variability associated with these parameters (e.g.,
                    seasonal patterns), should be essentially the same as for those from
                    comparable periods of time prior to the remediation effort.

                    or

              1.b.   In cases where the treatment technology has resulted in permanent
                    changes in the ground water system, such as the placement of slurry
                    walls, the hydrologic conditions may not return to their previous
                    state. Nevertheless, they should achieve a state of stability which is
                    likely to reflect future conditions expected at  the site. For this steady
                    state, the residual effects of the treatment will be small compared to
                    seasonal changes.

              2.     The pollutant levels should have statistical characteristics (e.g., a
                    mean and standard deviation) which will be similar to  those of future
                    periods.


              The first component implies that it is important to establish estimates of the

ground water levels  and flows prior to remediation or to predictively model the effect of

structures or other features  which may have permanently affected the ground  water.

Variables such as the level of ground water  should be measured at the monitoring wells for

a reasonable period of time prior to remediation,  so that the general behavior and character-

istics of the ground water at the site are understood.


              The second component is more judgmental. Projections must be made as

to the future characteristics of the ground water and the source(s) of contamination, based
on available, current information. Of course, such projections cannot be made with cer-
tainty, but reasonable estimates  about the likelihood of events may be established.

              The importance  of identifying when ground water has reached a steady state
is related to the need to make inferences about the future.  Conclusions drawn from tests
assessing the attainment of cleanup standards assume that the current state of the ground
water will persist into the future. There must be confidence that once a site is judged clean,
it will remain clean. Achieving a steady state gives credence to future projections derived
from current data.
7.2           Decisions to be Made in Determining When a Steady State is
              Reached

               Immediately after remediation efforts have ended, the major concern is
determining when ground water achieves steady state. In order to keep expenditures of
time  and money to  a minimum, it is desirable to begin collecting data to assess attainment as
soon as  one is confident that the ground water has reached a steady state.

              When sampling to determine whether the ground water system is at steady
state, three decisions are possible:

                     The ground water has reached steady state and sampling for assess-
                     ing attainment can  begin;
                     The measurements of contaminant concentrations during this period
                     indicate that the contaminant(s) are unlikely to attain the cleanup
                     standard and  further treatment must be considered;  or
                     More time and sampling must occur before it can be confidently
                     assumed that  the ground water has reached steady state.

              Next, various criteria will be considered that can be used in determining
whether a steady state has been reached.
7.3           Determining When a Steady State Has  Been Achieved

              In the following sections, qualitative and quantitative criteria involved in
making the decision as to whether the ground water has returned to a steady state following
remediation are  discussed.  Some of these criteria are based on a comparison of present
ground water levels with comparable levels before treatment. Others are based solely on
measurements and conditions after treatment has terminated. To a certain extent, the
decision as to  when steady  state has been reached is judgmental. It is not possible to prove
that a ground water system has achieved steady state. Thus, it is important to examine data
obtained from the ground water system to see if there are patterns which suggest that steady
state has not been achieved. If there are no such patterns (e.g., in the water level or speed
and direction of water flow), it may be reasonable to conclude that a steady state has been
reached.

              Any data on the behavior of the ground water prior  to the undertaking of
remediation may  serve as a useful baseline, indicating what "steady  state" for that system
had been and, thus, to  what it might return. However, the actions of remediation and the
resulting physical  changes in the area may  change the characteristics of steady state. In this
case, such a comparison may be less useful. When it seems clear that steady state
characteristics have changed after remediation efforts, it is usually prudent to allow more
time for remediation effects to decay.

              Collection of data to determine whether steady state has been achieved
should begin at the various monitoring wells at the site after remediation has been termi-
nated. The variables for which data will be obtained should include measures related to the
contaminant levels, the ground water levels, the speed and direction of the flow, and any
other measures that will aid in determining if the ground water has returned to a steady
state.  The frequency  of data collection will depend  on the correlation among consecutively
obtained values (it is  desirable to have a low correlation). A period of three months
between data collection activities at the wells may be appropriate if there appears to be some
correlation between observations. With little or no correlation,  monthly observations may
prove useful. If the serial  correlation seems to be high, the time interval between data
collection efforts should be lengthened.  With little or no information about seasonal
patterns or serial correlations in the data, at least six observations per year are recom-
mended.  After several  years of data collection, this number of observations will allow an
assessment of seasonal patterns, trends,  and serial correlation. It may be useful to consult
with a statistician if there is some concern  about the appropriate sampling frequency.
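
              As a rough aid in judging the correlation among consecutively obtained
values, the sample lag-1 serial correlation of an equally spaced series can be computed.
The sketch below is illustrative only: the function name is hypothetical, the observations
are assumed to be equally spaced with no missing values, and no numerical cutoff for
"high" or "low" correlation is implied by this document.

```python
def lag1_autocorrelation(values):
    """Sample lag-1 serial correlation of an equally spaced series.

    Assumes no missing values; the name and interface are illustrative.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("series is constant; correlation is undefined")
    # Sum of products of each deviation with the next one in time.
    numerator = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical water-level readings, for illustration only.
    readings = [4.1, 4.3, 4.2, 4.6, 4.8, 4.7, 5.0, 5.1]
    print(round(lag1_autocorrelation(readings), 3))
```

A value near zero is consistent with little serial correlation; a large positive value
suggests that consecutive observations carry overlapping information and that a longer
interval between collection efforts may be worth considering.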

              All data collected should be plotted over time in order to permit a visual
analysis of the extent to which a steady state exists for the ground water. In Section 7.4,
the charting of data and the construction of plots are discussed.  Section 7.4.3 provides
illustrations of such plots and their interpretation. In Section 7.4.4, statistical tests that can
be employed for identifying departures from randomness (e.g., trends) in the data are
indicated.  Suggestions for seasonally adjusting data prior to plotting are provided, and
graphical methods are discussed.


7.3.1        Rough Adjustment of Data for Seasonal Effects

              One concern in applying graphical techniques is that the data points being
plotted are assumed to be independent of each  other.  Even  if the serial correlation between
observations is low, there may be a seasonal effect on the observations. For example,
concentrations may be typically higher than the overall average in the spring and lower in
the fall. To  adjust for seasonal effects, one may subtract a measure of the "seasonal"
average from each data value and then add back the  overall average (Box 7.1). The addi-
tion of the overall average will bring the adjusted values back to the original levels of the
variable to maintain the same reference frame as the original data.
                                     Box 7.1
                           Adjusting  for Seasonal  Effects

       Suppose we let $x_{jk}$ be the jth individual data observation in year k, $\bar{x}_j$
       be the average for period j obtained from the baseline period prior to
       treatment, and $\bar{x}$ be the overall average for all data collected for the
       baseline period.  For example, if six data values per year have been collected
       bimonthly for each of three years during the baseline period, six $\bar{x}_j$ values
       would be computed, each based on three data points taken from the three
       different years for which data were collected.  The value $\bar{x}$ would be
       computed over all 18 data values. The adjusted jth data observation in year
       k, $x'_{jk}$, can then be computed from:

                       $x'_{jk} = x_{jk} - \bar{x}_j + \bar{x}$                      (7.1)

       If there are missing values, calculate $\bar{x}_j$ as in Box 5.4.

              Plot the adjusted values $x'_{jk}$ versus time.  In examining these plots,
checks for runs and trends can be made for the adjusted values.
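
              The adjustment in Box 7.1 can be sketched in a few lines of Python. This is
a minimal illustration under stated assumptions: the function name and the nested-list
data layout (one inner list per year, one entry per period) are hypothetical, every year
is assumed to have the same number of periods, and missing values (Box 5.4) are not
handled.

```python
from statistics import mean


def seasonal_adjust(baseline, observations):
    """Apply x'_jk = x_jk - xbar_j + xbar (equation 7.1).

    baseline:     list of years of baseline data, each a list of per-period values
    observations: list of years of data to adjust, in the same layout
    Assumes complete data with the same number of periods in every year.
    """
    n_periods = len(baseline[0])
    # xbar_j: average of period j across the baseline years.
    period_means = [mean(year[j] for year in baseline) for j in range(n_periods)]
    # xbar: overall average of all baseline values.
    overall_mean = mean(v for year in baseline for v in year)
    return [
        [year[j] - period_means[j] + overall_mean for j in range(n_periods)]
        for year in observations
    ]


if __name__ == "__main__":
    # Hypothetical bimonthly baseline data for three years (18 values).
    baseline = [
        [5.0, 6.0, 7.0, 6.0, 5.0, 4.0],
        [5.5, 6.5, 7.5, 6.5, 5.5, 4.5],
        [4.5, 5.5, 6.5, 5.5, 4.5, 3.5],
    ]
    post_treatment = [[3.0, 4.2, 5.1, 3.9, 3.1, 2.2]]
    print(seasonal_adjust(baseline, post_treatment))
```

Because the overall average is added back, the adjusted values remain on the same scale
as the original measurements and can be plotted on the same chart as the prior average.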
7.4           Charting the Data
              In general, it is useful to plot the data collected from a monitoring program.
Such plots are similar to "control charts" often used to monitor industrial processes, except
control limits will not appear on the charts discussed here.  Use the horizontal, or X-axis,
to indicate the time at which the observation was taken; and use the vertical, or Y-axis, to
indicate the value of the variable of interest  (e.g.,  the contaminant level or water table  level
or the value of other variables after adjustment for seasonal effects). Figure 7.2 gives an
example of a plot which may  be used to assess  stability during the period immediately
following treatment.

              Notice that in Figure 7.2, the "prior average" has also been placed on the
plot. This line represents the average of the baseline data collected before remediation
efforts began. For example, this value could be the average of eight points collected
quarterly over a two-year  period.  It may also be useful to plot separately the individual
observations gathered to serve  as the baseline data, so that information reflecting seasonal
variability and the degree  of serial correlation associated with the baseline period can be
readily examined.

Figure 7.2     Example of Time Chart for Use in Assessing Stability
              [Plot not reproduced: value of the monitored variable (vertical axis, scaled
              0 to 1) versus time, recorded quarterly (horizontal axis), with a horizontal
              line marking the prior average.]

7.4.1        A Test for Change of Levels Based on Charts

              If the ground water conditions after remediation are expected to be
comparable to the prior conditions, we would expect the behavior of water levels and flows to
resemble that of those same variables prior to the remediation effort in terms of average and
variability. One indication that a steady state may not have been reached is the presence of
a string of measurements from the post-treatment period which are consistently above or
below the average prior to beginning remediation. A common rule of thumb used in indus-
trial Statistical Process  Control (SPC) is that if eight consecutive points  are above or below
the average (often called a  "run" in SPC terminology), the data are likely to come from a
different process than that from which the average was obtained (Grant and Leavenworth,
1980). This rule is based on the assumption that the observations are independent.  This
assumption is not strictly  applicable in ground water  studies since there  is likely to be  serial
correlation between observations as well  as seasonal variability. Assuming independent
observations, an eight-point run is associated with a 1 in 128 chance of concluding that the
mean of the variable of interest has changed when, in fact, there has been no change in the
mean.
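
              The run rule described above can be checked mechanically. The sketch
below is illustrative only: the function names are hypothetical, and the probability
calculation assumes, as the rule itself does, that observations are independent and that
each is equally likely to fall above or below the prior average.

```python
def longest_run_one_side(values, prior_average):
    """Length of the longest string of consecutive points that are all
    strictly above, or all strictly below, the prior average.
    A point exactly equal to the prior average ends the current run."""
    longest = current = 0
    previous_side = 0
    for v in values:
        side = (v > prior_average) - (v < prior_average)  # +1, -1, or 0
        if side != 0 and side == previous_side:
            current += 1
        elif side != 0:
            current = 1
        else:
            current = 0
        previous_side = side
        longest = max(longest, current)
    return longest


def false_alarm_probability(run_length):
    """Chance that run_length independent points all fall on the same side
    of the average when nothing has changed: 2 * (1/2) ** run_length."""
    return 2 * 0.5 ** run_length


if __name__ == "__main__":
    # Hypothetical post-treatment water levels and baseline average.
    levels = [10.2, 10.4, 10.1, 10.3, 10.6, 10.5, 10.2, 10.4, 9.8]
    print(longest_run_one_side(levels, prior_average=10.0))
    print(false_alarm_probability(8))  # 1 in 128
```

Under these assumptions a run of eight gives 2 x (1/2)^8 = 1/128, which is the figure
quoted in the text. With serially correlated data the true false-alarm rate will be higher.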

              The above discussion suggests that for the purpose of deciding whether the
ground water has achieved steady state, a string of 7 to 10 consecutive points above or
below the prior average might serve as evidence  indicating that the state  of the ground  water
is  different from that in the baseline period. If  it is suspected that a high degree of serial
correlation exists, it would be appropriate to require a larger number of consecutive points.


7.4.2        A Test for Trends Based on Charts

              The charts described here provide a simple way of identifying trends.  If six
consecutive data points are increasing (or decreasing) [1], sometimes stated as "5
consecutive intervals of data" so that it is understood that the first point in the string is to be
counted, then there is evidence that the variable being monitored (e.g., water levels or
flows, or contaminant concentrations) has changed (exhibits a trend). Again, independence
of the observations is assumed. A group of consecutive points that increase in value is
sometimes referred to as a "run up," while a group of consecutive points that decrease in
value is referred to as a "run down."

[1] This rule of 6 is based on the assumption that all 720 orderings of the points are equally likely. This is
  not always true. Hence such rules are to be considered only as quick but reasonable approximations.

              With the rule of six consecutive data points described above, the chance of
erroneously concluding that a trend exists is only 1 in 360, or about 0.3 percent. In
contrast, a rule based on five consecutive points has a 1 in 60 chance (about 1.7 percent) of
erroneously concluding that there is a trend, while a rule based on seven consecutive points
would have a corresponding 1 in 2,520 chance (0.04 percent) of erroneously concluding
that there is a trend. Thus, depending on the degree of serial correlation expected, a "run"
of 5 to 7 points may suggest that the ground water levels and flows are not at steady state.
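
              These chances follow from counting orderings: of the n! equally likely
orderings of n distinct, independent values, exactly two are monotone (one increasing,
one decreasing), so the chance is 2/n!. A short sketch, with a hypothetical function name,
reproduces the figures quoted above.

```python
from fractions import Fraction
from math import factorial


def monotone_run_probability(n_points):
    """Chance that n_points independent, distinct values happen to be
    strictly increasing or strictly decreasing: 2 / n_points!"""
    return Fraction(2, factorial(n_points))


if __name__ == "__main__":
    for n in (5, 6, 7):
        p = monotone_run_probability(n)
        # 5 points -> 1/60, 6 points -> 1/360, 7 points -> 1/2520
        print(n, p, f"{float(p) * 100:.2f} percent")
```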

              In  practice, data  for many ground water samples may be collected before
any significant runs are identified. For example, in a set of 30 monthly ground water flow
rate measurements, there may be a run up of seven points and several shorter runs.  Such
patterns of runs can be analyzed by examining the length or number of runs in the series.
Formal  statistical procedures for analyzing trends in a time series are  given by Gilbert
(1987).

              A quick check for a general trend over a long period of time can be
accomplished as follows. Divide the total number of data points available, N, by 6. Take the
closest integer smaller than N/6 and call it I. Then select the Ith data value over time, the
(2I)th, the (3I)th, etc. For example, if N = 65, then I = 10, and we would select the 10th,
20th, etc., points over time.  If the six selected points are consistently increasing or
decreasing over time, there is evidence of a trend. This test will partially compensate for
serial correlation.
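
              The quick check can be sketched as follows. This is an illustration under
stated assumptions: the function name is hypothetical, the data are assumed to be in time
order, and "the closest integer smaller than N/6" is read as integer (floor) division,
which is an interpretation when N is an exact multiple of 6.

```python
def quick_trend_check(values, n_points=6):
    """Select every I-th observation (I = N // n_points) and report whether
    the selected points are strictly increasing or strictly decreasing.

    Returns (direction, selected), where direction is "increasing",
    "decreasing", or None when the selected points are not monotone.
    """
    step = len(values) // n_points
    if step < 1:
        raise ValueError("need at least n_points observations")
    # The I-th, 2I-th, ..., observations, counting from 1.
    selected = [values[step * k - 1] for k in range(1, n_points + 1)]
    differences = [later - earlier for earlier, later in zip(selected, selected[1:])]
    if all(d > 0 for d in differences):
        return "increasing", selected
    if all(d < 0 for d in differences):
        return "decreasing", selected
    return None, selected


if __name__ == "__main__":
    # With N = 65 hypothetical values, I = 10: the 10th, 20th, ..., 60th are used.
    series = [0.1 * t for t in range(65)]
    direction, picked = quick_trend_check(series)
    print(direction, picked)
```

Spacing the selected points I observations apart is what reduces, though does not remove,
the influence of serial correlation between neighboring measurements.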
7.4.3        Illustrations and Interpretation

              Once the plotting of data has begun, there are various patterns that may
appear. Figures 7.3 through 7.8 represent six charts which indicate possible patterns that
may be encountered.  Evidence of departures from stability is being sought. The first five
charts, except Figure 7.4, indicate evidence of instability  (or in the cases of Figures 7.5  and
7.6, suspicions of possible instability), i.e.,  changes in characteristics over time.
Figure 7.3 shows "sudden" apparent outliers or spikes that indicate unexpected variability
in the variable being monitored. Figure 7.4 illustrates a six-point trend in the variable being
monitored. Figures 7.5 and 7.6 suggest that a trend may exist but there is insufficient
evidence to substantiate it.  Attention should be paid to the behavior of subsequent data in
these cases.  (In particular, the data in Figure 7.5 could indicate a general trend using the
"quick check" discussed in the previous section depending on the randomly selected set of
points included in the test.) Figure 7.7 reflects a change (around observation 15) in both
variability (the spread of the data becomes much greater) and average (the average appears
to have  increased). Figure 7.8 indicates a variable that appears to be stable.

              In interpreting the plots, the return to a steady state will generally be indi-
cated by a random scattering of data points about the prior average. The existence of
patterns such as runs or trends suggests instability. Patterns associated with seasonality
and serial correlation should be consistent with those seen prior to remediation.   At the
very least, the average value for levels of contaminants after remediation should be lower
than that prior to remediation.  A run below the prior average for contaminant level
measures would certainly not be evidence that the ground water is not at steady state, since
the whole point of the remediation effort is to reduce the level of contamination.  A trend
downwards in contamination levels may be an indication that a steady state has not been
reached. Nevertheless,  if substantial  evidence suggests that this decline or an eventual
leveling  off will be the future state of that  contaminant  on the site, tests for attainment  of the
cleanup  standards would  be  appropriate.

              On the other hand, if it seems that the average contamination level after
remediation  will be above the prior average or that there is a consistent trend upwards in
contamination levels, it may be decided that the previous remediation efforts were not
totally successful, and further remediation efforts must be undertaken.  This may be done
with a minimal amount of data if, based on the data available, it appears unlikely that the
cleanup standard will be met. However, what should  be taken into account is the relative
cost of making the wrong decision. Two  costs should be weighed against each other:  the
cost of obtaining further observations from the monitoring wells if it turns out that the
decision to resume remediation is made at a later date (the loss here is in terms of time and
the cost of monitoring up to the time that remediation actually is resumed) against the cost
of resuming  remediation when in fact a steady state would eventually have been achieved
(the loss here is in terms of the cost of unnecessary cleanup effort and time).  In addition,
the likelihood of making  each  of these  wrong decisions, as estimated  based on the available
information,  should be incorporated  into the decision process.
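
              One simple way to combine the two costs with their likelihoods is to
compare expected losses. The sketch below is only an illustration of that arithmetic:
the function name, the probability, and the cost figures are all hypothetical, and in
practice the inputs would come from site-specific estimates and expert judgment.

```python
def expected_losses(p_remediation_needed, extra_monitoring_cost,
                    unnecessary_remediation_cost):
    """Expected loss of each choice, weighting each wrong-decision cost
    by the estimated chance that the decision turns out to be wrong.

    p_remediation_needed: estimated chance that remediation will have to
        be resumed eventually (so continued monitoring was wasted).
    """
    if not 0.0 <= p_remediation_needed <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return {
        # Wrong if remediation is needed after all.
        "continue_monitoring": p_remediation_needed * extra_monitoring_cost,
        # Wrong if steady state would eventually have been achieved.
        "resume_remediation": (1.0 - p_remediation_needed)
        * unnecessary_remediation_cost,
    }


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(expected_losses(0.25, 100.0, 1000.0))
```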

Figure 7.3    Example of Apparent Outliers
              [Plot not reproduced.]
Figure 7.4    Example of a Six-point Upward Trend in the Data
              [Plot not reproduced.]

 7.4.4        Assessing Trends via Statistical Tests

              The discussion in Section 7.4.3 considered graphical techniques for
exploring the possible existence of trends in the data.  Regression techniques discussed in
Chapter 6 provide a formal statistical procedure for considering possible trends in the
data.

              Other formal procedures for testing for trends also exist. Gilbert (1987)
discusses several of them, such as the Seasonal Kendall Test, Sen's Test for Trend, and a
Test for Global Trends (the original articles in which these tests are described were: Hirsch
and Slack, 1984; Hirsch, Slack, and Smith, 1982; Farrell, 1980; and van Belle and
Hughes, 1984).

              The Seasonal Kendall Test provides a test for trends that removes seasonal
effects. It has been shown to be applicable in cases where monthly observations have been
gathered for at least three years. The degree to which critical values obtained from a normal
table approximate the true critical values apparently has not been established for other time
intervals of data collection (e.g., quarterly or semi-annually). This test would have to be
carried out for each monitoring well separately at a site.  Sen's Test for Trend is a more
sensitive test for detecting monotonic trends if seasonal effects exist, but requires more
complicated computations if there are missing data.  The Test for Global Trends provides
the capability for looking at differences between seasons and between monitoring wells, at
season-well interactions, and also provides an overall trend test.  All three of these tests
(the Seasonal Kendall, Sen's, and the Global tests) require the assumption of independent
observations. (Extensions of these tests allowing for serial correlations require that much
more data be collected; for example, roughly 10 years' worth of monthly data for the
Seasonal Kendall test extension.) If this assumption is violated, these tests tend to indicate
that a trend exists, when it actually does not, at a higher rate than specified by the chosen
α level.  Thus, these tests may provide useful tools for detecting trends, but the finding of a
trend via such a test may not necessarily represent conclusive evidence that a trend exists.
Gilbert provides a detailed discussion of all three tests as well as computer code that can be
used for implementing the tests. However, this discussion does not consider the power of
these trend tests, i.e., the likelihood that such tests identify a trend when a trend actually
exists.  If the power of these tests is low, existing trends may not be
detected in a timely fashion.
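
              For orientation, the core of the Seasonal Kendall statistic can be sketched
briefly. This is not Gilbert's code: it is a simplified illustration with hypothetical
function names that follows the usual formulation (a Mann-Kendall statistic computed
within each season and summed, with a normal approximation), and it assumes independent
observations, no missing values, and no correction for tied values.

```python
from math import erf, sqrt


def mann_kendall_s(values):
    """Sum of the signs of all later-minus-earlier differences."""
    s = 0
    for i in range(len(values) - 1):
        for j in range(i + 1, len(values)):
            difference = values[j] - values[i]
            s += (difference > 0) - (difference < 0)
    return s


def seasonal_kendall(seasons):
    """seasons: one list per season (e.g., per month), each holding that
    season's values in year order. Returns (S, Z, two_sided_p).
    Simplified: no tie correction and no handling of missing data."""
    s_total = 0
    variance = 0.0
    for values in seasons:
        n = len(values)
        s_total += mann_kendall_s(values)
        variance += n * (n - 1) * (2 * n + 5) / 18.0
    if variance == 0:
        raise ValueError("each season needs at least two observations")
    # Continuity correction of one unit toward zero.
    if s_total > 0:
        z = (s_total - 1) / sqrt(variance)
    elif s_total < 0:
        z = (s_total + 1) / sqrt(variance)
    else:
        z = 0.0
    p_two_sided = 1.0 - erf(abs(z) / sqrt(2.0))
    return s_total, z, p_two_sided


if __name__ == "__main__":
    # Hypothetical data: 12 months, 3 years each, rising slightly every year.
    data = [[1.0 + 0.1 * m, 1.2 + 0.1 * m, 1.5 + 0.1 * m] for m in range(12)]
    print(seasonal_kendall(data))
```

Comparing values only within the same season is what removes the seasonal effect. As
noted above, a small p-value from such a test is not conclusive when the observations
are serially correlated.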


7.4.5        Considering the Location of Wells

              In addition to assessing the achievement of steady state in a well over time,
it is also useful to consider the comparison of water and contamination levels across wells
at given points in time. This can readily be done by constructing either (1) a scatter plot
with water or contamination levels on the vertical axis and the various monitoring wells
indicated on the horizontal axis, or (2) a contour plot of concentrations or
water levels across the site and surrounding area. Commercial computer programs are
available for preparing contour plots.  In particular, see the discussion in Volume 1 (Chapter
10) on kriging. If there are large, unexpected differences in water or contamination levels
between wells, this may suggest that steady state has not yet been reached.


7.5           Summary

              Finding that the ground water has returned to a steady state after terminating
remediation efforts is an essential step in the establishment of a meaningful test of whether
or not the cleanup standard has been attained. There are uncertainties in the process, and
to some extent it is judgmental.  However, if an adequate amount of data is carefully
gathered prior to beginning remediation and after ceasing remediation, reasonable decisions
can be made as to whether or not the ground water can be considered to have reached a
state of stability.

              The decision on whether the ground water has reached steady state will be
based on a combination of statistical calculations, plots  of data, ground water modeling,
use of predictive models, and expert advice from hydrogeologists familiar with the site.

Figure 7.5    Example of a Pattern in the Data that May Indicate an Upward Trend
              [Plot not reproduced; a horizontal line marks the prior average.]
Figure 7.6    Example of a Pattern in the Data that May Indicate a Downward Trend
              [Plot not reproduced.]
Figure 7.7    Example of Changing Variability in the Data Over Time
              [Plot not reproduced.]
Figure 7.8    Example of a Stable Situation with Constant Average and Variation
              [Plot not reproduced.]

-------
   8.  ASSESSING ATTAINMENT USING FIXED  SAMPLE
                               SIZE  TESTS
             After the remediation effort and after the ground water has achieved steady
state, water samples can be collected to determine whether the contaminant concentrations
attain the relevant cleanup standards. The sampling and evaluation period for making this
attainment decision is represented by the unshaded portion in the figure below.
Figure 8.1    Example Scenario for Contaminant Measurements During Successful
             Remedial Action
             [Plot not reproduced: measured ground water concentration (vertical axis),
             with the start of treatment marked.]
             In this chapter statistical procedures are presented for assessing the
attainment of cleanup standards for ground water at Superfund sites. As discussed previously,
the procedures presented are suitable for assessing the time series of chemical
concentrations measured in individual wells relative to a cleanup standard. Note that attainment
objectives, as discussed in Chapter 3, must be specified by those managing the site
remediation before the sampling for assessing attainment begins.

             The collection of samples for assessing attainment of the cleanup standards
will occur after the remedial action at the site has been completed and after a subsequent
period has passed to allow transient effects due to the remediation to dissipate. This will
allow the ground water concentrations, flows, and water table levels to reach equilibrium
with the surrounding environment. It will be important to continue to chart the ground
water data to monitor the possibility of unexpected departures from an apparent steady
state.  Some such departures are illustrated in Figures 7.3 through 7.7.

            The attainment decision is an assessment of whether the post-cleanup
 contaminant concentrations are acceptable compared to the cleanup standard and whether
 they are likely to remain acceptable. To assess whether the contaminant concentrations are
 likely to remain acceptable, the statistical procedures provide methods for determining
whether or not a long-term average concentration or a long-term percentage of the well
concentrations are below the established cleanup standards.
              It is assumed in this chapter that the periodic or seasonal patterns in the data
repeat on a yearly  cycle. It may be that another, perhaps shorter, period of time would be
appropriate. In such a case, the reference to "yearly" averages may be adjusted by the
reader to reflect the appropriate period of time for the site under consideration. In the text,
mention of alternative "seasonal cycles or periods" indicates where such adjustments  may
be  appropriate.

              This chapter presents statistical procedures for  determining whether:
                     The mean concentration is below the cleanup standard; or
                     A selected percentile of all samples is below the cleanup standard
                     (e.g., does the 90th percentile of the distribution of concentrations
                    fall below  the  cleanup standard?).

              Many different statistical procedures can be used to assess the attainment of
the cleanup standard. The procedures presented here have been selected to provide
reasonable results with a small sample size in the presence of correlated data. They require
minimal statistical background and expertise. If other procedures are considered,
consultation with a statistician is recommended. In particular, in the unlikely event that the
measurements are not serially correlated, the methods presented in Chapter 5 which assume
a random sample can be used.

              The procedures presented are of two types:  fixed sample size tests are
discussed in this chapter, and sequential tests are discussed in Chapter 9.  Figure 8.2 is a
flow chart outlining the steps involved in the cleanup process when using a fixed sample
size test.   Section 8.6 discusses testing for trends if the levels of contaminants are
acceptable.

    Figure 8.2    Steps in the Cleanup Process When Using a Fixed Sample Size Test
                  [Flow chart not reproduced. The recoverable boxes read, in order: Start;
                  Wait for Ground Water to Reach Steady State; Specify Sample Design;
                  Collect the Data; Determine If the Ground Water Attains the Cleanup
                  Standard. A separate box reads "Reassess Cleanup Technology."]

 8.1           Fixed Sample Size Tests

               This chapter discusses assessing the attainment  of cleanup standards using a
 test based on a predetermined sample size. For a fixed sample size test, the ground water
 samples are collected on a regular schedule, such as every two months, for a predetermined
 number of years. After all the data have been collected, the data are analyzed to determine
 whether the concentrations in the  ground water attain the cleanup standard. Even if the
initial measurements suggest that the ground water may attain the cleanup standard, all
 samples must be collected before the statistical  test can be performed.  An advantage of this
 approach is that the number of samples required to perform the statistical test will be known
 before the sampling begins, making some budgeting and planning tasks easier than when
 using a sequential test (Chapter 9).

              Three procedures are presented for testing the mean when using fixed
 sample size tests.  The first and second procedures use yearly average concentrations.
 The first method, based  on the assumption that the yearly means have a normal distribu-
 tion, is recommended when there are missing values in the data and the missing values are
 not distributed  evenly throughout the year. The second procedure assumes that the distri-
 bution of the yearly average is skewed, similar to a lognormal distribution, rather than
 symmetric.  If there are few or no missing values, the second method using the log trans-
 formed yearly averages is recommended even if the data are not highly skewed. The third
 method requires calculation of seasonal effects and serial correlations to determine the
 variance of the mean.  Because the third method is sensitive to the skewness of the data, it
 is recommended  only if the distribution of the residuals is reasonably symmetric.
 Regardless of the procedure used, the sample size for assessing the mean should be deter-
 mined using the steps described in Section 8.2.1.
8.2          Determining Sample Size and Sampling Frequency

              Whether the  calculation procedures  used for assessing  attainment use yearly
averages or individual measurements, the formulas presented below for determining the
required sample size use the characteristics of the individual observations. In the unlikely
event that many years of observations are available for estimating the variance of the yearly
averages, the number of years of sampling (using the same sample frequency as in the
available data) can also be determined from the yearly averages using equation (5.35). The
following sections discuss the calculation of sample size for testing the mean and testing
proportions.


8.2.1      Sample Size for Testing Means

              The equations for determining sample size require the specification of the
following quantities: Cs, μ1, α, and β (see Sections 3.6 and 3.7) for each chemical under
investigation.  In addition, estimates of the serial correlation φ between monthly observations
and the standard deviation σ of the measurements are required.
                                    Box 8.1
                Steps for Determining Sample Size for Testing the Mean

       (1)    Determine the estimates of σ and φ which describe the data. Denote
              these estimates by σ̂ and φ̂.

       (2)    Estimate the ratio of the annual overhead cost of maintaining
              sampling operations at the site to the unit cost of collecting,
              processing, and analyzing one ground water sample.  Call this ratio $R.

       (3)    Based on the values of $R and φ̂, use Appendix Table A.4 to
              determine the approximate number, n_p, of samples to collect per year or
              seasonal period.  The value n_p may be modified based on
              site-specific considerations, as discussed in the text.

       (4)    The sampling frequency (i.e., the number of samples to be taken per
              year) is n_p or 4, whichever is larger.  Denote this sampling
              frequency as n.  Note that, under this rule, at least four samples per
              year per sampling well will be collected.

       (5)    For given values of n and φ̂, determine a "variance factor" from
              Appendix Table A.5.  Denote this factor by F.  For example, for
              φ̂ = 0.4 and n = 12, the factor is F = 5.23.

       (6)    A preliminary estimate of the required number of years to sample,
              m_d, is

              \[ m_d = \frac{\hat{\sigma}^2}{F} \left( \frac{z_{1-\beta} + z_{1-\alpha}}{Cs - \mu_1} \right)^2 + 2 \tag{8.1} \]

              where z_{1-β} and z_{1-α} are the critical values from the normal
              distribution with probabilities of 1-β and 1-α (Table A.2).

       (7)    The number of years of data will be denoted by m and will be
              determined by rounding m_d up to the next highest integer.  The total
              number of samples per well will be N = nm.

              Appendix Table A.4 shows the approximate number of observations per
year (or period) which will result in the minimum overall cost for the assessment (see
Appendix F for the basis for Table A.4). Note that the sampling frequencies given in Table
A.4 are approximate and are based on numerous assumptions which may only approximate
the situation and costs at a particular Superfund site. Using the table requires knowledge of
the serial correlations between observations separated by one month (or one-twelfth of the
seasonal cycle) and the cost of extending the sampling period for one more year relative to
taking an additional ground water sample.

             Find the column in Table A.4 that is closest to the estimate of $R being
used. Find the row which most closely corresponds to φ̂.  Denote the tabulated value by
n_p. For example, suppose that the cost ratio is estimated to be 25 and φ̂ = 0.3. Then from
Table A.4 under the fifth column (ratio = 20), n_p = 9. Since the costs and serial correlations
will not be known exactly, the sample frequencies in Table A.4 should be considered
as suggested frequencies. They should be modified to a sampling frequency which can be
reasonably implemented in the field. For example, if collecting a sample every month and a
half (n_p = 8) will allow easy coordination of schedules, n_p can be changed from 9 to 8.

             For determination of sample frequency, the estimates of $R and φ̂ need not be
precise. If there are several compounds to be measured in each sample, calculate the sample
frequency for each compound.  Use the average sample frequency for the various
compounds.

             It is recommended that at least four samples per year (or seasonal period) be
collected to reasonably reflect the variability in the measured concentration within the year.
Therefore, the sampling  frequency (i.e., number of samples to be taken per year) is the
maximum of four and n_p. Denote the sampling frequency by n. Note that, under this rule,
at least four samples per year per sampling well will be collected.

             As more observations per year are collected, the number of years of
sampling required for assessing attainment can be reduced. However, there are limits to
how much the sampling time can be reduced by increasing the number of observations per
year.  If the cost of collecting, processing, and analyzing the ground water samples is very
small compared to the cost of maintaining the overall sampling effort, many samples can be
collected each year and the primary cost of the assessment sampling  will be associated with
maintaining the  assessment effort until a decision is reached. On the other hand, if the cost
of each sample is very large and a monitoring effort is to be maintained at the site regardless
of the attainment decision, the costs of waiting for a decision may be minimal  and the
sampling frequency should be  specified so  as to minimize the sample  collection, handling,
and analysis costs. It  should be noted that  it is assumed that the ground water remains in
steady state throughout the period of data collection.

              The frequency of sampling discussed in this document is the simplest and
 most straightforward to implement: determine a single time interval between samples and
 select a sample at all wells of interest after that period of time has elapsed (e.g., once every
 month, once every six weeks, once a quarter, etc.). However, there are other approaches
 to determining sampling frequency; for example, site-specific data may suggest that time
 intervals should vary among wells or groups of wells in order to achieve approximately the
 same precision for each well.  Considering such approaches is beyond the scope of this
 document, but the interested reader may consult such articles as Ward, Loftis, Nielsen,
 and Anderson (1979), and Sanders and Adrian (1978). It should be noted that these articles
 are oriented around issues related to sampling surface water rather than ground water, but
 many of the general principles apply to both. In general, consultation with a statistician is
 recommended when establishing sampling procedures.

              Use the sample frequency per year, the estimated serial correlation between
 monthly observations, and Appendix Table A.5 to determine a "variance factor" for estimating
 the required sample size. For the given values of n and φ̂, determine the variance
 factor in Table A.5. Denote this factor by F. For example, for φ̂ = 0.4 and n = 12, the
 factor is F = 5.23.  For values of φ̂ and n not listed in Table A.5, interpolation between
 listed values may be used to determine F.  Alternatively, if a conservative approach is
 desired (i.e., to take a larger sample of data), take the smaller value of F associated with
 listed values of φ̂ and n.  For values outside the range of values covered in Table A.5, see
 Appendix F.

              A preliminary estimate of the required number of years of sampling, m_d, is
 given by equation (8.1).  The first ratio in this equation is the estimated variance of the
 yearly average, σ̂²/F. The final addition of 2 to the sample size estimate improves the
estimate with small sample sizes (see Appendix F).

              Because the statistical tests require a full year's worth of data, the number of
years of data collection, m_d, is rounded up to the next highest integer, m.  Thus, n samples
will be collected in each of m years, for a total number of samples per well of N, where N is
the product m × n. An example of using these procedures to calculate sample size for testing
the mean is provided in Box 8.2.
                                    Box 8.2
              Example of Sample Size Calculation for Testing the Mean

       Suppose that, for α = .01, it is desired to detect a difference of .2 ppm
       from the cleanup standard of .5 ppm (for example: Cs = .5, μ1 = .3) with a
       power of .80 (i.e., β = .20). Also suppose that the ratio of annual overhead
       costs to per-unit sampling and analysis costs ($R) is close to 10. Further, it
       is estimated that σ̂ = .43 and φ̂ = .20. Then for φ̂ = .20 and cost ($R) = 10,
       Table A.4 gives n_p = 9. For n_p = 9 and φ̂ = .20, F = 7.17 from Table A.5.
       Further, using equation (8.1) to determine the number of years, m_d, to
       collect data, we find

       \[ m_d = \frac{(.43)^2}{7.17} \left( \frac{.842 + 2.326}{.5 - .3} \right)^2 + 2 \approx 8.47 \]

       where z_{1-β} = .842 and z_{1-α} = 2.326, as can be found from Table A.2 or any
       normal probability table.

       Rounding up gives a sampling duration of nine years and a total sample size
       of 9 × 9 = 81 samples.
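
The arithmetic in Box 8.2 can be checked with a short script. The sketch below is illustrative only and is not part of the guidance. It assumes that equation (8.1) has the form m_d = (σ̂²/F) × ((z_{1-β} + z_{1-α}) / (Cs − μ1))² + 2, which reproduces the nine years quoted in the box. The function name years_of_sampling and its argument names are hypothetical.

```python
import math


def years_of_sampling(sigma_hat, F, z_beta, z_alpha, Cs, mu1):
    """Return (m_d, m): the preliminary and rounded-up number of years.

    Assumes equation (8.1) is
        m_d = (sigma_hat^2 / F) * ((z_beta + z_alpha) / (Cs - mu1))^2 + 2
    where F is the variance factor read from Table A.5.
    """
    if Cs == mu1:
        raise ValueError("Cs and mu1 must differ")
    m_d = (sigma_hat ** 2 / F) * ((z_beta + z_alpha) / (Cs - mu1)) ** 2 + 2
    return m_d, math.ceil(m_d)


# Values taken from Box 8.2.
n_p = 9                  # samples per year suggested by Table A.4
n = max(n_p, 4)          # at least four samples per year
m_d, m = years_of_sampling(
    sigma_hat=0.43, F=7.17, z_beta=0.842, z_alpha=2.326, Cs=0.5, mu1=0.3
)
N = n * m                # total samples per well

print(f"m_d = {m_d:.2f}, m = {m} years, N = {N} samples")
```

The variance factor F and the critical values still have to be read from Tables A.5 and A.2; the script only carries out the final calculation and the rounding.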
8.2.2        Sample Size for Testing Proportions


              The testing of proportions is similar to the testing of means in that the
average coded observation (e.g., the proportion of samples for which the cleanup standard
has been exceeded) is compared to a specified proportion. The method for determining
sample size described below works well when there is a low correlation between observations
and no or small seasonal patterns in the data.  If the correlation between monthly
observations is high or there are large seasonal changes in the measurements, then consultation
with a statistician is recommended.  If the parameter to be tested is the proportion of
contaminated samples from either one well or an array of wells, one can determine the
sample size for a fixed sample size test using the procedures in Box 8.3. These procedures
for determining sample size require the specification of the following quantities: α, β, P0,
and P1 (see Section 3.7 and Section 5.4.1). In general, many samples are required
when testing small proportions.
                                 Box 8.3
                Determining Sample Size for Testing Proportions

    (1)    Compute the estimates of σ and φ which describe the measurements
           (not the coded values).  Denote these estimates by σ̂ and φ̂_m.

           Let φ̂ = [expression illegible in the source] (φ̂ is the estimated
           correlation between the coded observations).

    (2)    Estimate the ratio of the annual overhead cost of maintaining
           sampling operations at the site to the unit cost of collecting,
           processing, and analyzing one ground water sample. Call this ratio $R.

    (3)    Based on the values of $R and φ̂, use Table A.4 to determine the
           approximate number, n_p, of samples to collect per year or seasonal
           period. Based on site-specific considerations, the value n_p may be
           modified to a number which is administratively convenient.

    (4)    The sampling frequency (i.e., the number of samples to be taken per
           year) is n_p or 4, whichever is larger. Denote this sampling
           frequency as n. Note that, under this rule, at least four samples per
           year per sampling well will be collected.

    (5)    For given values of n and φ̂, determine a "variance factor" from
           Table A.5.  Denote this factor by F.

    (6)    For given values of F, α, β, P0, and P1, a preliminary
           estimate of the number of years to sample, m_d, is given by
           equation (8.2) [equation illegible in the source; only its
           denominator, P0 - P1, survives],
           where z_{1-β} and z_{1-α} are critical values from the normal distribution
           associated with probabilities of 1-β and 1-α (Appendix Table A.2).
           If m_d is less than [value illegible in the source], use
           m_d = [value illegible in the source] instead. Equation (8.2) is an
           adaptation of (8.1), using equation (5.25) of Chapter 5.

    (7)    The number of years of data will be denoted by m, and will be
           determined by rounding m_d up to the next highest integer. The total
           number N of samples per well will be N = nm.

8.2.3        An Alternative Method for Determining Maximum  Sampling
              Frequency

              The maximum sampling frequency can be determined using the hydrogeo-
logic parameters of ground water wells. The Darcy equation (Box 8.4) using the hydraulic
conductivity, hydraulic gradient,  and effective porosity of the aquifer, can be used to
determine the horizontal component of the average linear velocity of ground water. This
method is useful for determining the sampling frequency that allows sufficient time to pass
between sampling events to  ensure,  to the greatest extent technically feasible, that there is a
complete exchange of the water in  the sampling well between collection of water samples.
Although samples collected at the maximum sampling frequency  may be  independent  in the
physical sense, statistical independence is unlikely. Other factors such as the effect of
contamination history, remediation, and seasonal influences can also result in correlations
over time periods  greater than that required to flush the well. As a result, we recommend
that the sampling frequency be less than the maximum frequency based on Darcy's
equation. Use of the maximum frequency can be approached only if estimated correlations
based on ground-water samples are close to zero and the cost ratio, $R, is high. A detailed
discussion of the hydrogeologic components of this procedure  is beyond the scope of this
document. For further information refer to Practical Guide for Ground-Water Sampling
(Barcelona et al., 1985) or Statistical Analysis of Ground-Water Monitoring Data at RCRA
Facilities (U.S. EPA, 1989b).
                                     Box 8.4
                Choosing a Sampling Interval Using the Darcy Equation
       The sampling frequency can be based on estimates using the average linear
       velocity of ground water. The Darcy equation relates ground water velocity
       (V) to effective porosity (Ne), hydraulic gradient (i), and hydraulic
       conductivity (k):

       \[ V = \frac{k \, i}{N_e} \tag{8.3} \]

       The values for k, i, and Ne can be determined from a well's hydrogeologic
       characteristics. The time required for ground water to pass through the well
       diameter can be determined by dividing the monitoring well diameter by the
       average linear velocity of ground water (V). This value represents the
       minimum time interval required between sampling events which might yield
       an independent ground water sample.
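
The calculation in Box 8.4 can be sketched in a few lines of code. The example below is illustrative only: it assumes the Darcy relation V = k·i/Ne, and the numerical values (conductivity, gradient, porosity, and a 4-inch well) are hypothetical and do not come from this document. The function name min_sampling_interval_days is likewise hypothetical. Units must be consistent; here k is in feet per day and the well diameter is in feet.

```python
def min_sampling_interval_days(k, i, n_e, well_diameter):
    """Minimum time (days) for ground water to pass through the well diameter.

    k             hydraulic conductivity (ft/day)
    i             hydraulic gradient (dimensionless)
    n_e           effective porosity (fraction between 0 and 1)
    well_diameter monitoring well diameter (ft)
    """
    if not 0 < n_e <= 1:
        raise ValueError("effective porosity must be in (0, 1]")
    velocity = k * i / n_e          # average linear velocity V (ft/day)
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return well_diameter / velocity


# Hypothetical site values, for illustration only.
interval = min_sampling_interval_days(k=10.0, i=0.005, n_e=0.25,
                                      well_diameter=4.0 / 12.0)
print(f"Minimum interval between samples: {interval:.2f} days")
```

As the text notes, the interval obtained this way is only a lower bound on the time between samples; serial correlation usually persists over much longer periods, so the sampling frequency actually used should be lower than this maximum.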


 8.3          Assessing Attainment of the Mean Using Yearly Averages

               When using yearly averages  for the analysis, the effects  of serial  correlation
can generally be ignored (except for extreme conditions unlikely to be encountered in
 ground water).  For the procedures  discussed in this section, the variance of the observed
 yearly averages is used to estimate the variance of the overall average concentration.  First,
 data are collected using the guidelines indicated in Chapter 4. Values recorded below the
 detection limit should be recorded according to the procedures  in Section 2.3.7.  Wells  can
 be tested individually or a group of wells can be tested jointly.  In the latter case, the data
 for the individual wells at each point in time are used to produce a summary measure (e.g.,
 the mean or  maximum) for the group as a whole.

              Two calculation procedures for assessing attainment are described below.
 Both procedures use the yearly average concentrations. The first is based on the assump-
 tion that the yearly averages can be described by a (symmetric) normal distribution. This is
 based on a standard t-test described in many  statistics books. The second procedure uses
 the log transformed yearly averages and is based on the assumption that the distribution of
 the yearly averages can be described by  a (skewed) lognormal distribution. Because the
 second procedure performs well  even when the data  have a symmetric distribution,  the
 second method is recommended in most situations.  Only when there are missing data
 values for which the sampling dates are not evenly distributed throughout the year and there
 is also an apparent seasonal pattern in the data is the first procedure recommended.

              The calculations and procedures when using the untransformed yearly
 averages are described below and summarized in Box 8.5. This procedure is appropriate in
 all situations but is not preferred, particularly if the data are highly skewed. The calculations
 can be used (with some minor loss in efficiency) if some observations are missing.
 If the proportion of missing observations varies considerably from season to season and
 there are differences in the average measurements among seasons, consultation with a
 statistician is recommended. If there are few missing values and the data are highly
 skewed, the procedures described in Box 8.12 which use the log transformed yearly
 averages are recommended.
                                     Box 8.5
                Steps for Assessing Attainment Using Yearly Averages
       (1)    Calculate the yearly averages (see Box 8.6).
       (2)    Calculate the mean, x̄_m, and variance, s²_x̄, of the yearly averages
              (see Box 8.7).
       (3)    If there are no missing observations, set

              \[ \bar{x} = \bar{x}_m \tag{8.4} \]

              Otherwise, if there are missing observations, calculate the seasonal
              averages and the mean of the seasonal averages, x̄_ms (Box 8.8),
              and set

              \[ \bar{x} = \bar{x}_{ms} \tag{8.5} \]

              where x̄ is the mean to be compared to the cleanup standard.
       (4)    Calculate the upper 1-α percent one-sided confidence interval for the
              mean, x̄ (Box 8.9).
       (5)    Decide whether the ground water attains the cleanup standards
              (Box 8.10).
              Use the formulas in Box 8.6 for calculating the yearly averages.  If there are
missing observations within a year, average the non-missing observations. Using the
yearly averages for the  statistical analysis, calculate the mean and variance of the yearly
averages using the equations in Box 8.7. The variance will have degrees of freedom equal
to one less than the number of years over which the data was collected.

              If there are no missing observations, the mean of the yearly averages, x̄_m,
will be compared to the cleanup standard for assessing attainment. If, however, there are
missing observations, the mean of the yearly averages may provide a biased estimate of the
average concentration during the sample period. This will be true if the missing observations
occur mostly at times when the concentrations are generally higher or lower than
throughout most of the year. To correct for this bias, the average of the seasonal averages
will be compared to the cleanup standard when there are missing observations. Box 8.8
provides equations for calculating the seasonal averages and x̄_ms, the mean of the seasonal
averages. Using x̄ to designate the mean which is to be compared to the cleanup standard,
set x̄ = x̄_m if there are no missing observations; otherwise set x̄ = x̄_ms.
                                 Box 8.6
                      Calculation of the Yearly Averages

    Let x_jk = the measurements from an individual well or a combined measure
    from a group of wells obtained for testing whether the mean attains the
    cleanup standard; x_jk represents the concentration for season j (the jth
    sample collection time out of n) in year k (where data are collected for m
    years).

    For each year, the yearly average is the average of all of the observations
    taken within the year. If the results for one or more sample times within a
    year are missing, calculate the average of the non-missing observations.
    If there are n_k (n_k ≤ n) non-missing observations in year k, the yearly
    average, x̄_k, is:

    \[ \bar{x}_k = \frac{1}{n_k} \sum_{j} x_{jk} \tag{8.6} \]

    where the summation is over all non-missing observations within the year.
                                 Box 8.7
           Calculation of the Mean and Variance of the Yearly Averages

    The mean of the yearly averages, x̄_m, is:

    \[ \bar{x}_m = \frac{1}{m} \sum_{k=1}^{m} \bar{x}_k \tag{8.7} \]

    where x̄_k is the yearly average for year k and the summation covers m years.
    The variance of the yearly averages, s²_x̄, can be calculated using either of the
    two equivalent equations below:

    \[ s_{\bar{x}}^2 = \frac{\sum_{k=1}^{m} \bar{x}_k^2 - m \bar{x}_m^2}{m-1} = \frac{\sum_{k=1}^{m} (\bar{x}_k - \bar{x}_m)^2}{m-1} \tag{8.8} \]

    This variance estimate has m-1 degrees of freedom.
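
The calculations in Boxes 8.6 and 8.7 can be sketched as follows. This is an illustrative example, not part of the guidance: the data are hypothetical quarterly measurements for three years, with None marking a missing observation, and the function name yearly_averages is an assumption. The yearly average uses only the non-missing values, and the variance uses the divisor m-1.

```python
from statistics import mean, variance


def yearly_averages(data):
    """Average the non-missing measurements within each year.

    data is a list of years; each year is a list of n measurements,
    with None marking a missing observation.
    """
    averages = []
    for year in data:
        observed = [x for x in year if x is not None]
        if not observed:
            raise ValueError("a year has no observations")
        averages.append(mean(observed))
    return averages


# Hypothetical quarterly data (ppm) for three years; one value is missing.
data = [
    [0.30, 0.32, 0.31, 0.31],
    [0.33, None, 0.31, 0.32],
    [0.34, 0.35, 0.33, 0.34],
]

xbar_k = yearly_averages(data)   # yearly averages
xbar_m = mean(xbar_k)            # mean of the yearly averages
s2 = variance(xbar_k)            # sample variance, divisor m - 1
df = len(xbar_k) - 1             # degrees of freedom

print(xbar_k, round(xbar_m, 4), round(s2, 6), df)
```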
                                     Box 8.8
        Calculation of Seasonal Averages and the Mean of the Seasonal Averages
       For the n sample collection times within the year, the jth seasonal average is
       the average of all the measurements taken at the jth collection time. If there
       is a missing observation, the measurement from the jth sample collection
       time may be different from the jth sequential measurement within the year.
       Note that observations below the detection limits should be replaced by the
       detection limit and are not counted as missing observations.
       For all collection times j, from 1 to n, within each year, calculate the
       seasonal average, x̄_j, where the number of observations at the jth collection
       time is m_j ≤ m. If there are missing observations, sum over the m_j
       non-missing observations.

       \[ \bar{x}_j = \frac{1}{m_j} \sum_{k} x_{jk} \tag{8.9} \]

       The mean of n seasonal averages is:

       \[ \bar{x}_{ms} = \frac{1}{n} \sum_{j=1}^{n} \bar{x}_j \tag{8.10} \]
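
The following sketch illustrates why the mean of the seasonal averages is used when observations are missing. It is illustrative only: the data are hypothetical, with a strong seasonal pattern and one missing value in the high season, and the function name seasonal_averages is an assumption. Each seasonal average is taken over the non-missing observations at that collection time, and the seasonal averages are then averaged.

```python
from statistics import mean


def seasonal_averages(data):
    """Average the non-missing measurements at each collection time j.

    data[k][j] is the measurement for collection time j in year k;
    None marks a missing observation.
    """
    n = len(data[0])
    averages = []
    for j in range(n):
        observed = [year[j] for year in data if year[j] is not None]
        if not observed:
            raise ValueError(f"no observations at collection time {j + 1}")
        averages.append(mean(observed))
    return averages


# Hypothetical data (ppm): the third collection time is consistently high,
# and its value is missing in the second year.
data = [
    [0.2, 0.4, 0.6, 0.4],
    [0.2, 0.4, None, 0.4],
    [0.2, 0.4, 0.6, 0.4],
]

xbar_j = seasonal_averages(data)
xbar_ms = mean(xbar_j)           # mean of the seasonal averages

# For comparison: mean of the yearly averages of the non-missing values.
naive = mean(mean(x for x in year if x is not None) for year in data)

print(f"mean of seasonal averages = {xbar_ms:.4f}")
print(f"mean of yearly averages   = {naive:.4f}")
```

In this hypothetical case the mean of the yearly averages is pulled downward because the missing value falls in the season with the highest concentrations, while the mean of the seasonal averages is not.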
              Using the mean which is to be compared to the cleanup standard, x̄, and the
standard deviation of the mean calculated from the yearly averages, calculate the upper
one-sided 1-α percent confidence interval for the mean using equation (8.11) in Box 8.9. The
standard deviation is the square root of the variance calculated in Box 8.7.
Calculation of the upper confidence interval requires use of α, specified in the attainment
objectives, and the degrees of freedom for the standard deviation (the number of years of
data minus one) to determine the relevant t-statistic from Table A.1 in Appendix A. If the
lower one-sided confidence limit is desired, replace the plus sign in equation (8.11) with a
minus sign.

              Finally, if the upper one-sided confidence interval is less than  the cleanup
standard and if the  concentrations  are not increasing over time,  decide that the tested ground
water attains the cleanup standard. If the ground water from all wells or groups of wells
attains the cleanup standard then conclude that the ground water at the site attains the
cleanup standard. The steps in deciding attainment of the cleanup standard are shown in
Box 8.10.
                                     Box 8.9
             Calculation of Upper One-sided Confidence Limit for the Mean

       The upper one-sided confidence limit is:

       \[ \mu_{U\alpha} = \bar{x} + t_{1-\alpha,\,m-1} \frac{s_{\bar{x}}}{\sqrt{m}} \tag{8.11} \]

       where x̄ is the mean level of contamination, and s_x̄ is the square root of the
       variance of the yearly means. The degrees of freedom associated with s_x̄ is
       m-1, and the appropriate value of t_{1-α,m-1} can be obtained from Table A.1.
                                    Box 8.10
           Deciding if the Tested Ground Water Attains the Cleanup Standard
       If μ_Uα < Cs, conclude that the average ground water concentration in the
       well (or group of wells) attains the cleanup standard.

       If the average ground water concentration in the wells is less than the
       cleanup standard, perform a trend test using the regression techniques
       described in Chapter 6 to determine if there is a statistically significant
       increasing trend to the yearly averages over the sampling period (also see
       Section 8.6).  Note that at least 3 years' worth of data are required to identify
       a trend. If there is not a statistically significant increasing trend,
       conclude that the ground water attains the cleanup standard (and possibly
       initiate a follow-up monitoring program).  If a significant trend does exist,
       resume sampling or reconsider treatment effectiveness.

       If μ_Uα ≥ Cs, conclude that the average ground water concentration in the
       wells does not attain the cleanup standard.
              When the data are noticeably skewed, the calculation procedures in Box
8.12 (using the log transformed yearly averages) are recommended over those in Box 8.5.
Because the procedures in Box 8.12 also perform well when the data have a symmetric
distribution, these procedures are generally recommended in all cases where there are no
missing data.  There is no easy adjustment for missing data when using the log transformed
yearly averages.  Therefore, if the number of observations per season (month etc.) is not
the same for all seasons and if there is any seasonal pattern in the data, use of the procedures
in Box 8.5 is recommended.
                                    Box 8.11
         Example of Assessing Attainment of the Mean Using Yearly Averages

       To test whether the cleanup standard (Cs = 0.50) has been attained for a
       particular chemical, 48 ground water samples were collected for four years
       at monthly intervals.  All 48 ground water samples were collected and
       analyzed, and three values which were below the detection level were
       replaced in the analysis by the detection limit. Based on the sample data, the
       overall mean concentration was determined to be .330 ppm.  The corresponding
       yearly means were computed as: x̄_1 = .31; x̄_2 = .32; x̄_3 = .34;
       and x̄_4 = .35.  The variance of the yearly means is s²_x̄ = .000333.

       The one-sided 99 percent confidence interval extends from zero to

       \[ \bar{x} + t_{1-\alpha,\,m-1} \frac{s_{\bar{x}}}{\sqrt{m}} = .330 + 4.541 \sqrt{\frac{.000333}{4}} = .371 \]

       Since the cleanup standard is Cs = 0.5 ppm, the average is significantly less
       than the cleanup standard. However, the yearly averages are consistently
       increasing and regression analysis indicates that the trend is statistically
       significant at the 5 percent level (p = .0101). Therefore, it cannot be
       concluded that the attainment objectives have been achieved. If the present
       trend continues, the concentrations would exceed the cleanup standard in
       about 10 years. Possible options include continued monitoring to determine
       if the trend will continue, or reassessing the treatment effectiveness and why
       the upward trend exists.
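
The figures in Box 8.11 can be reproduced with the short script below. It is a sketch under stated assumptions rather than a prescribed implementation: the upper limit is taken to be x̄ + t·sqrt(s²/m) with s² the variance of the yearly averages, the value 4.541 is the standard tabulated Student's t for a one-sided 99 percent limit with 3 degrees of freedom, and the trend test is an ordinary least-squares regression of the yearly averages on the year number. The closed-form p-value used here is valid only for 2 degrees of freedom (four yearly averages); for other cases a t table or statistical software would be needed.

```python
import math
from statistics import mean, variance

yearly = [0.31, 0.32, 0.34, 0.35]   # yearly averages from Box 8.11 (ppm)
Cs = 0.50                           # cleanup standard (ppm)
t_crit = 4.541                      # t for 1 - alpha = .99 and 3 degrees of freedom

m = len(yearly)
xbar = mean(yearly)
s2 = variance(yearly)               # variance of yearly averages, divisor m - 1
upper = xbar + t_crit * math.sqrt(s2 / m)

# Simple linear regression of the yearly averages on the year number.
years = list(range(1, m + 1))
tbar = mean(years)
sxx = sum((t - tbar) ** 2 for t in years)
slope = sum((t - tbar) * (x - xbar) for t, x in zip(years, yearly)) / sxx
intercept = xbar - slope * tbar
sse = sum((x - (intercept + slope * t)) ** 2 for t, x in zip(years, yearly))
t_stat = slope / math.sqrt(sse / (m - 2) / sxx)

# Two-sided p-value for a t statistic with exactly 2 degrees of freedom.
p_value = 1 - abs(t_stat) / math.sqrt(2 + t_stat ** 2)

increasing_trend = slope > 0 and p_value < 0.05
attains = upper < Cs and not increasing_trend

print(f"upper limit = {upper:.3f}, slope = {slope:.3f}, p = {p_value:.4f}")
print("attains cleanup standard:", attains)
```

The script gives an upper limit of about .371 and a trend p-value of about .0101, so the mean is below the standard but attainment cannot be concluded because of the significant upward trend.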
              The calculations when using the log transformed yearly averages are slightly
more difficult than when using the untransformed yearly averages. After calculating the
yearly averages, the natural log is used to transform the data. The transformed averages are
then used in the subsequent analysis. The upper confidence interval for the mean concentration
is based on the mean and variance of the log transformed yearly averages. The
formulas are based on the assumption that the yearly averages have a lognormal
distribution.
                                    Box 8.12
           Steps for Assessing Attainment Using the Log Transformed Yearly
                                    Averages
       (1)     Calculate the yearly averages (see Box 8.6)
       (2)     Calculate the natural log of the yearly averages (see Box 8.13)
       (3)     Calculate the mean, $\bar{y}_m$, and variance, $s_y^2$, of the log transformed
               yearly averages (see Box 8.14)
       (4)     Calculate the upper $1-\alpha$ percent one-sided confidence interval for the
               overall mean (see Box 8.15)
       (5)     Decide whether the ground water attains the cleanup standards
               (see Box 8.10).
             Use the formulas in Box 8.6 for calculating the yearly averages. If there are
missing observations within a year, average the non-missing observations. Calculate the
log transformed yearly averages using equation (8.12) in Box 8.13. The natural log
transformation is available on many calculators and computers, usually designated as "LN",
"ln", or "log_e". Although the equations could be changed to use the base 10 logarithms,
use only the base e logarithms when using the equations in Boxes 8.13 through 8.15.
Calculate the mean and variance of the log transformed yearly averages using the equations
in Box 8.14. The variance will have degrees of freedom equal to one less than the number
of years over which the data were collected.
                                    Box 8.13
                Calculation of the Natural Logs of the Yearly Averages
       The natural log of the yearly average is:

       $$y_k = \ln(\bar{x}_k) \qquad (8.12)$$
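                                    Box 8.14
         Calculation of Mean and Variance of the Natural Logs of the Yearly
                                    Averages
       The average of the m log transformed yearly averages, $\bar{y}_m$:

       $$\bar{y}_m = \frac{1}{m}\sum_{k=1}^{m} y_k \qquad (8.13)$$

       The variance of the log transformed yearly averages, $s_y^2$:

       $$s_y^2 = \frac{\sum_{k=1}^{m} y_k^2 - \frac{1}{m}\left(\sum_{k=1}^{m} y_k\right)^2}{m-1} = \frac{\sum_{k=1}^{m}\left(y_k - \bar{y}_m\right)^2}{m-1} \qquad (8.14)$$

       This variance estimate has m-1 degrees of freedom.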
              Calculate the upper one-sided $1-\alpha$ percent confidence interval for the mean
using equation (8.15) in Box 8.15.  Calculation of the upper confidence interval requires use
of $\alpha$, specified in the attainment objectives, and the degrees of freedom for the standard
deviation, the number of years of data minus one, to determine the relevant t-statistic from
Table A.2 in Appendix A.  If the lower one-sided confidence limit is desired, replace the
second plus sign in equation (8.15) with a minus sign.

              Finally, if the upper one-sided confidence interval is less than the cleanup
standard and if the log transformed concentrations are not increasing over time, decide that
the tested ground water attains the cleanup standard. If the ground water from all wells or
groups of wells attains the cleanup standard, then conclude that the ground water at the site
attains the cleanup standard. The steps in deciding attainment of the cleanup standard are
shown in Box 8.10.
                                    Box 8.15
         Calculation of the Upper Confidence Limit for the Mean Based on Log
                           Transformed Yearly Averages
       The upper one-sided confidence limit for the mean is:

       $$\mu_{U\alpha} = \exp\left(\bar{y}_m + \frac{s_y^2}{2} + t_{1-\alpha,m-1}\sqrt{\frac{s_y^2}{m} + \frac{s_y^4}{2\,Df}}\right) \qquad (8.15)$$

       where the degrees of freedom (Df) associated with $s_y^2$ is m-1, and the
       appropriate value of $t_{1-\alpha,Df}$ can be obtained from Table A.1.  The term
       under the square root is the variance of $\bar{y}_m + \frac{s_y^2}{2}$ and was calculated from the
       variance of the two terms, which are independent if the data have a lognormal
       distribution.
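
The steps of Boxes 8.12 through 8.15 can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the limit is taken to be the exponential of the log-scale mean plus half the log-scale variance plus t times the square root of s_y^2/m + s_y^4/(2 Df), the function name is hypothetical, the yearly means are borrowed from Box 8.11 purely as sample input, and the t value 4.541 is an assumed tabled value for alpha = 0.01 with 3 degrees of freedom.

```python
import math

def upper_limit_log_yearly(yearly_means, t_value):
    """Upper one-sided confidence limit for the mean from log transformed yearly averages."""
    m = len(yearly_means)
    df = m - 1
    logs = [math.log(x) for x in yearly_means]            # natural logs, eq. (8.12)
    y_bar = sum(logs) / m                                  # mean of the logs, eq. (8.13)
    s2_y = sum((y - y_bar) ** 2 for y in logs) / df        # variance of the logs, eq. (8.14)
    # Variance of (y_bar + s2_y / 2), assuming lognormal yearly averages.
    var_term = s2_y / m + s2_y ** 2 / (2 * df)
    # Back-transform to the original concentration scale (assumed form of eq. 8.15).
    return math.exp(y_bar + s2_y / 2 + t_value * math.sqrt(var_term))

yearly_means = [0.31, 0.32, 0.34, 0.35]  # sample input only
T_99_3DF = 4.541                         # assumed tabled t: alpha = 0.01, 3 df

ucl_log = upper_limit_log_yearly(yearly_means, T_99_3DF)
print(round(ucl_log, 3))
```

Only natural logarithms (`math.log`) are used, in line with the instruction to use base e with the equations in Boxes 8.13 through 8.15.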
8.4          Assessing Attainment of the Mean After Adjusting for Seasonal
             Variation
             This section provides an alternative procedure for testing the mean
concentration.  It is expected to provide more accurate results with large sample sizes,
correlated data, and data that are not skewed.  Because this procedure is sensitive to skewed
data, it is recommended only if the distribution of the residuals is reasonably symmetric.

             After the data have been collected using the guidelines indicated in
Chapter 4, wells can be tested individually or a group of wells can be tested jointly. In
the latter case, the data for the individual wells at each point in time are used to produce a
summary measure for the group as a whole.  This summary measure may be an average,
maximum, or some other measure (see Section 2.3.5). These summary measures will be
averaged over the entire sampling period. The tests for attainment and the corresponding
calculations required when removing seasonal averages are described next.

             The calculations and procedures when using the mean adjusted for seasonal
variation are described below and summarized in Box 8.16. This procedure is not
recommended if the data are noticeably skewed. The following calculations and procedures are
appropriate if the number of observations per year is the same for all years. However, they
can still be used (with some minor loss in efficiency) if a few observations are lost, as long
as the loss is not concentrated in a particular season (note example in Section 8.3). If the
proportion of observations varies considerably from season to season, consultation
with a statistician is recommended.  If the data are obviously skewed, the procedures
described in Box 8.15, which use the log transformed yearly averages, are recommended.
                               Box 8.16
    Steps for Assessing Attainment Using the Mean After Adjusting for
                           Seasonal Variation

       (1)    Calculate the seasonal averages and the mean of the seasonal
              averages, $\bar{x}$ (see Box 8.8)

       (2)    Calculate the deviations from the seasonal averages (residuals)
              (see Box 8.17)

       (3)    Calculate the variance, $s^2$, of the residuals (see Box 8.18)

       (4)    Calculate the lag 1 serial correlation of the residuals using equation
              (8.18) in Box 8.19. Denote the computed serial correlation by
              $\hat{\phi}_{obs}$.

       (5)    Calculate the upper $1-\alpha$ percent one-sided confidence interval for the
              mean, $\bar{x}$ (see Box 8.20)

       (6)    Decide whether the ground water attains the cleanup standards
              (see Box 8.10).
              Use the formulas in Box 8.8 for calculating the seasonal averages and the
mean of the seasonal averages. If there are missing observations within a season, average
the non-missing observations. Calculate the residuals, the deviations of the measurements
from the respective seasonal means, using equation (8.16) in Box 8.17.  Box 8.18 shows
how to calculate the variance of the residuals. The variance will have degrees of freedom
equal to the number of measurements less the number of seasons.  Calculate the serial
correlation of the residuals using equation (8.18) in Box 8.19.  If the serial correlation is
less than zero, use zero when calculating the confidence interval.
                                 Box 8.17
                         Calculation of the Residuals

    From each sample observation, subtract the corresponding seasonal mean.
    That is, compute the $e_{jk}$, the deviation from the mean:

    $$e_{jk} = x_{jk} - \bar{x}_{j} \qquad (8.16)$$
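                                 Box 8.18
                  Calculation of the Variance of the Residuals

    Calculate the variance of the residuals $e_{jk}$ after adjustments for possible
    seasonal differences:

    $$s^2 = \frac{1}{N-n}\sum_{k}\sum_{j} e_{jk}^2 \qquad (8.17)$$

    where N is the number of measurements and n is the number of seasons.
    Alternatively, the ANOVA approach described in Appendix D can be used
    to compute the required variance.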
                                 Box 8.19
       Calculating the Serial Correlation from the Residuals After Removing
                             Seasonal Averages

    The sample estimate of the serial correlation of the residuals is:

                                                                (8.18)
    where $e_i$, i = 1, 2, ..., N, are the residuals after removing seasonal averages,
    in the time order in which the samples were collected.
             Using the mean of the seasonal averages and the standard deviation of the
mean, calculated from the residuals, calculate the upper one-sided $1-\alpha$ percent confidence
interval for the mean using equation (8.19) in Box 8.20.  The standard deviation is the
square root of the variance calculated from equation (8.17). If the observed serial
correlation is less than zero, use zero in equation (8.19). Calculation of the upper confidence
interval requires use of $\alpha$, specified in the attainment objectives, and the degrees of
freedom for the standard deviation, the number of years of data minus one, to determine the
relevant t-statistic from Table A.2 in Appendix A. If the lower one-sided confidence limit
is desired, replace the plus sign in equation (8.19) with a minus sign.
                                   Box 8.20
       Calculation of the Upper Confidence Limit for the Mean After Adjusting for Seasonal Variation

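       Calculation of the Upper One-Sided Confidence Limit

       $$\mu_{U\alpha} = \bar{x} + t_{1-\alpha,Df}\sqrt{\frac{s^2}{N}\cdot\frac{1+\hat{\phi}_{obs}}{1-\hat{\phi}_{obs}}} \qquad (8.19)$$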
       where $\bar{x}$ is the mean level of contamination computed from
       equation (8.8), and s is the square root of the variance of the observations,
       taking into account possible seasonal variation, as computed from equation
       (8.17).  The degrees of freedom, Df, associated with s is Df = (N-n)/3, and the
       appropriate value of $t_{1-\alpha,Df}$ can be obtained from Table A.1.  If $\hat{\phi}_{obs}$ is less
       than zero, set $\hat{\phi}_{obs}$ to zero. For the derivation of the term under the square
       root, see Appendix F.
                                    Box 8.21
                     Example Calculation of Confidence Intervals
       Table 8.1 and Figure 8.3 show hypothetical arsenic measurements for
       ground water samples taken at quarterly intervals for four years. For these
       data, the four seasonal (quarterly) means are: $\bar{x}_1 = 6.688$; $\bar{x}_2 = 6.013$;
       $\bar{x}_3 = 5.078$; and $\bar{x}_4 = 5.878$, and the overall mean is $\bar{x} = 5.914$ ppb. The
       adjusted arsenic measurements labeled "residuals," shown in the last
       column of the table, are obtained by subtracting the seasonal means from the
       original observations.
       The estimated variance of the data, taking into account possible seasonal
       differences, is $s^2 = .163$ (equation (8.17)) with 4 (i.e., (16-4)/3)
       degrees of freedom, and the corresponding autocorrelation is $\hat{\phi}_{obs} = .37$
       (equation (8.18)).
       The upper one-sided 90 percent confidence interval extends from zero to
       $5.914 + 1.533\sqrt{\frac{.163}{16}\cdot\frac{1+.37}{1-.37}} = 6.142$ ppb.
       If the cleanup standard were 6 ppb, it would be concluded that the ground
       water has not attained the cleanup standard.
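
The calculation in Box 8.21 can be reproduced from the data in Table 8.1 with a short Python sketch. Several things here are assumptions rather than the guidance's own formulas: the residual variance is taken as the sum of squared residuals divided by N - n, the confidence limit is taken as the mean plus t times the square root of (s^2/N)(1 + phi)/(1 - phi), and the function name is hypothetical. The serial correlation (.37) and the t value (1.533) are taken directly from the box rather than computed.

```python
import math

def seasonal_upper_limit(data_by_year, t_value, phi_obs):
    """Mean, residual variance, and upper limit after removing seasonal averages."""
    years = len(data_by_year)
    n = len(data_by_year[0])          # seasons per year
    total = n * years                 # N, the number of measurements
    # Seasonal means and the mean of the seasonal means.
    seasonal_means = [sum(year[j] for year in data_by_year) / years for j in range(n)]
    overall_mean = sum(seasonal_means) / n
    # Residuals in time order: each measurement minus its seasonal mean.
    residuals = [year[j] - seasonal_means[j] for year in data_by_year for j in range(n)]
    s2 = sum(e * e for e in residuals) / (total - n)
    phi = max(phi_obs, 0.0)           # negative serial correlations are set to zero
    half_width = t_value * math.sqrt(s2 / total * (1 + phi) / (1 - phi))
    return overall_mean, s2, overall_mean + half_width

arsenic = [                           # Table 8.1, quarters 1 to 4 of each year
    [6.40, 5.91, 4.51, 5.57],         # 1984
    [7.21, 6.19, 4.89, 5.51],         # 1985
    [6.57, 5.70, 5.32, 5.87],         # 1986
    [6.57, 6.25, 5.59, 6.56],         # 1987
]
mean, s2, ucl = seasonal_upper_limit(arsenic, t_value=1.533, phi_obs=0.37)
print(round(mean, 3), round(s2, 3), round(ucl, 3))
```

Because the limit exceeds a hypothetical standard of 6 ppb, the sketch reaches the same conclusion as the box: attainment cannot be declared.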
Figure 8.3    Plot of Arsenic Measurements for 16 Ground Water Samples (see Box
              8.21)
              [Figure: "Arsenic Measurements: 1984-1987"; vertical axis: arsenic, 0.00 to 8.00;
              horizontal axis: time in quarters, 1 to 16.]
Table 8.1     Arsenic measurements (ppb) for 16 ground water samples (see Box 8.21)

       Year      Quarter      Arsenic Measurement      Residual
       1984         1                6.40                -.288
       1984         2                5.91                -.103
       1984         3                4.51                -.568
       1984         4                5.57                -.308
       1985         1                7.21                 .522
       1985         2                6.19                 .177
       1985         3                4.89                -.188
       1985         4                5.51                -.368
       1986         1                6.57                -.118
       1986         2                5.70                -.313
       1986         3                5.32                 .242
       1986         4                5.87                -.008
       1987         1                6.57                -.118
       1987         2                6.25                 .237
       1987         3                5.59                 .512
       1987         4                6.56                 .682

8.5           Fixed Sample Size Tests for Proportions


              If the parameter to be tested is the proportion of contaminated samples from
either one well or an array of wells, the sample collection and analysis procedures are the
same as those outlined above for testing the mean, with the following changes:

                     To apply this nonparametric test, each measurement is coded either
                     "1" (the actual measurement was equal to or above the relevant
                     cleanup standard Cs) or "0" (below Cs). The statistical analysis is
                     based on the resulting coded variable of 0's and 1's.

                     Only the analysis procedure that uses yearly averages (Box 8.6)
                     is appropriate for the calculations. Do not use the calculation
                     procedures which correct for the seasonal pattern in the data and the
                     serial correlation of the residuals or which use the log transformed data.

                     See Section 8.2.2 for procedures for estimating the sample size.
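
The coding step described in the list above can be sketched as follows. The function names are hypothetical, the data are the arsenic measurements of Table 8.1 reused purely as sample input, and the cleanup standard of 6.0 ppb is the hypothetical value mentioned in Box 8.21. The yearly averages of the coded values are the yearly proportions of contaminated samples, which then feed the yearly-average procedure.

```python
def code_exceedances(measurements, cleanup_standard):
    """Code each measurement as 1 (equal to or above Cs) or 0 (below Cs)."""
    return [1 if x >= cleanup_standard else 0 for x in measurements]

def yearly_proportions(coded, samples_per_year):
    """Average the coded values within each complete year of sampling."""
    return [
        sum(coded[start:start + samples_per_year]) / samples_per_year
        for start in range(0, len(coded), samples_per_year)
    ]

measurements = [6.40, 5.91, 4.51, 5.57, 7.21, 6.19, 4.89, 5.51,
                6.57, 5.70, 5.32, 5.87, 6.57, 6.25, 5.59, 6.56]
coded = code_exceedances(measurements, cleanup_standard=6.0)  # hypothetical Cs
proportions = yearly_proportions(coded, samples_per_year=4)
print(coded)
print(proportions)
```

The sketch assumes the number of measurements is a whole multiple of the samples per year; incomplete years would need separate handling.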

8.6          Checking for Trends in Contaminant Levels After Attaining the
              Cleanup  Standard

              Once a fixed sample size statistical test indicates that the cleanup standard
for the site has been met, there remains one final concern. The model  we have used
assumes that ground water at the site has reached a steady state and that there is no reason
to believe that contaminant levels will rise above the cleanup standard in the future.  We
need to check this assumption. Regression models, as discussed in Chapter 6, can be used
to do so. By establishing a simple regression model with the contaminant  measure as the
dependent variable and time as the independent variable, a test of significance can be made
as to whether or not the estimated  slope of the resulting linear model is positive (see  Section
6.1.3).   Scatter plots of the data will prove useful in assessing the model. When using the
yearly  averages,  the regression can be  performed  without adjusting for serial correlation.

              To minimize the  chance of incorrectly concluding that the concentrations  are
increasing over time,  we recommend that the  alpha level for testing the  slope (and selecting
the t statistic in Box 6.11) be set at a small value, such as 0.01 (one percent). If,  on the
basis of the test, there is not  significant  evidence that the slope is positive, then the evidence
is  consistent with the preliminary  conclusion that the ground water in the well(s)  attains  the
cleanup standard.  If the slope is significantly greater than zero, then the concern that
contaminant levels may later exceed the cleanup standard still exists and the assumption of a
steady state is called into question. In this case, further consideration must be given to the
reasons  for this  apparent  increase  and, perhaps, to additional remediation efforts.
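
The trend check can be illustrated with a small Python sketch that fits a straight line to yearly averages and forms the t statistic for the slope. It is a sketch under assumptions, not the procedure of Chapter 6: the function name is hypothetical, time is coded as 1, 2, ..., m, the yearly averages are those quoted in Box 8.11, and the critical value 6.965 is an assumed tabled t value for a one-sided test at alpha = 0.01 with m - 2 = 2 degrees of freedom.

```python
import math

def slope_t_statistic(values):
    """Least-squares slope of values against time 1..m, with its t statistic."""
    m = len(values)
    times = list(range(1, m + 1))
    t_bar = sum(times) / m
    v_bar = sum(values) / m
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_bar - slope * t_bar
    # Residual sum of squares about the fitted line (m - 2 degrees of freedom).
    sse = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    se_slope = math.sqrt(sse / (m - 2) / sxx)
    return slope, slope / se_slope

yearly_means = [0.31, 0.32, 0.34, 0.35]   # yearly averages from Box 8.11
T_CRITICAL = 6.965                        # assumed tabled t: alpha = 0.01, 2 df

slope, t_stat = slope_t_statistic(yearly_means)
increasing = t_stat > T_CRITICAL          # True means a significant positive slope
print(round(slope, 3), round(t_stat, 2), increasing)
```

With only four yearly averages the test has two degrees of freedom, so a scatter plot remains an important companion to the numerical result.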

8.7          Summary

              This chapter presented the procedures for assessing attainment of the
cleanup standards for ground water measurements using a fixed  sample size test. The
testing procedures can be applied to samples from either individual wells or wells tested as
a group. These procedures are used after the ground water has achieved steady state. Both
parametric and nonparametric methods for evaluating attainment are discussed. If the
ground water at  the site is judged to attain the cleanup standards because the concentrations
are not increasing and the long-term average is significantly less than the cleanup standard,
follow-up monitoring is recommended to check that the steady state  assumption holds.

        9. ASSESSING ATTAINMENT USING SEQUENTIAL TESTS

             After the remediation effort has been terminated and the ground water has
achieved steady state, ground water samples can be collected to determine whether the
resulting concentrations of contaminants attain the relevant cleanup standard. The
sampling and evaluation period for making this attainment decision is represented by the
unshaded portion in the figure below.
Figure 9.1     Example for Contaminant Measurements During Successful
               [Figure: measured ground water concentration (vertical axis, 0 to 1.2) plotted
               against date (horizontal axis), with the start of treatment marked.]
             In this chapter, statistical procedures are presented for assessing the
attainment of cleanup standards for ground water at Superfund sites using sequential statistical
tests. Note that attainment objectives, as discussed in Chapter 3, must be specified before
the sampling for assessing attainment begins.

             The collection of samples for assessing attainment of the cleanup standards
will occur after the remedial action at the site has been completed and after a subsequent
period has passed to allow transient effects due to the remediation to dissipate. The
attainment decision is an assessment of whether the remaining contaminant concentrations are
acceptable compared to the cleanup standard and whether they are likely to remain
acceptable. To assess whether the contaminant concentrations are likely to remain acceptable, the
statistical procedures provide methods for determining whether or not a long-term average
concentration or a long-term percentage of the well water concentration measurements are
below the established cleanup standards. In particular, in the unlikely event that the
measurements are not serially correlated, the methods presented in Chapter 5, which assume
a random sample, can be used and consultation with a statistician is recommended.  If
sequential tests are being considered, note that on the average, the sequential tests will
require fewer samples than the fixed sample size tests in Chapter 8 or, if applicable, those
in Chapter 5.

              This chapter discusses assessing the  attainment of cleanup  standards  using  a
sequential statistical test. For a sequential test, the ground water samples are collected on a
regular schedule, such as every two months.   Starting after the collection of three years of
data, a statistical test is performed every year to determine whether (1) the ground water
being sampled attains the cleanup standard, or (2) the ground water does not attain the
cleanup standard, or  (3) more data are required to make a decision.  If more data are
required, another year's worth of data is collected before the next statistical test is per-
formed. Figure 9.2 is a flow chart outlining the steps involved in the cleanup process
when using a sequential statistical test.

              Unlike  the fixed  sample size test, the number of samples required  to reach a
decision using the sequential test is not known at the beginning of the sampling period. On
the average, the sequential tests will require fewer samples and a corresponding shorter
time to  make the attainment decision than for the tests in Chapter 8. If the ground water
clearly attains the cleanup standard, the sequential test will almost always require fewer
samples than a fixed sample size test. Only when the contaminant concentrations are less
than the cleanup standard and greater than the mean for the alternate hypothesis might the
sequential test be likely to require more samples than the fixed sample size test.

              This chapter presents statistical procedures  for determining whether:
                     The mean concentration is below the cleanup standard;  or
                     A  selected percentile of all samples is below the cleanup standard
                     (e.g.,  does the 90th percentile of the distribution of concentrations
                     fall below the cleanup  standard?).
Figure 9.2     Steps in the Cleanup Process When Using a Sequential Statistical Test
               [Flow chart. Box labels recoverable from the figure: Objectives; Treat the
               ground water; Wait for ground water to reach steady state; Specify Sample Design
               and Analysis Plan; Collect the Data for Two Years; Collect the Data for an
               Additional Year; Determine If the Ground Water in Wells Attains the Cleanup
               Standard; Is the Cleanup Standard Attained?; Do Concentrations Increase Over
               Time?; More Data is Required; Reassess Cleanup Technology.]

              The measured ground water concentrations may fluctuate over time due to
many factors including:

                     Seasonal and short-term weather patterns affecting the ground water
                     levels and flows;
                     Variation in ground water concentrations due to historical
                     fluctuations in the contamination introduced into the ground water; and
                     Sampling errors and laboratory measurement errors and
                     fluctuations.

              The effects of periodic seasonal fluctuations in concentration can be
eliminated from the analysis, resulting in a more precise statistical test, by either averaging the
measurements over a year or correcting for any seasonal patterns found in the data. These
two statistical analysis procedures are presented in Sections 9.3 and 9.4, respectively.  The
method of using yearly averages is, in general, easier to implement and preferred.
Correcting for the seasonal pattern may provide more precise statistical tests in situations
where large correlations exist between measurements and when the measurement errors
have a symmetric distribution.

              Three procedures are presented  for testing the mean when using sequential
tests. The first and second procedures use yearly average concentrations. The first
method, based on the assumption that the yearly means have a normal distribution, is
recommended when there are missing values  in the data and the missing values are not
distributed evenly throughout the year. The second procedure assumes that the distribution
of the yearly average is skewed, similar to a lognormal distribution, rather than symmetric.
If there are no missing values, the second method using the log transformed yearly
averages is recommended even if the data are not highly skewed. The third method
requires calculation of seasonal effects and serial correlations to determine the variance of
the mean.  Because the third method is  sensitive to the skewness of the data, it is recom-
mended only if the distribution of the residuals  is reasonably symmetric. Regardless of the
procedure  used,  the sample frequency for assessing the mean should be determined using
the steps described in Section 9.1.

             These sequential procedures are an adaptation of Wald's sequential
probability ratio test, specifically a version of the sequential t-test. They assume that the data
are normally distributed or can be made so by a log transformation.  See Hall (1962), Hayre
(1983), and Appendix F for details.

9.1          Determining Sampling Frequency for Sequential  Tests


              The ground water samples will be collected at regular intervals using a
systematic sample with a random start as described in Chapter 4. An important part of
determining the sample collection procedures is to select the time interval between samples
or the number of samples to collect per seasonal period, usually per year. As discussed in
Chapter 8, the term "year" will be used to mean a full seasonal cycle, which in most cases
can be considered a calendar year.

             The steps for determining sample frequency when testing the mean are
provided in Box 9.1 and are discussed in Section 8.2 in more detail. The procedures for
determining sample frequency require the specification of the serial correlation, $\phi$, and the
measurement error, $\sigma$, for the chemical under investigation. The procedures described in
Section 5.3 may be used to obtain rough estimates of the serial correlation. Denote these
estimates by $\hat{\phi}$. An example of calculating sample frequency is presented in Box 9.3.
                                    Box 9.1
             Steps for Determining Sample Frequency for Testing the Mean

       (1)    Determine the estimates of $\sigma$ and $\phi$ which describe the data. Denote
              these estimates by $\hat{\sigma}$ and $\hat{\phi}$.

       (2)    Estimate the ratio of the annual overhead cost of maintaining
              sampling operations at the site to the unit cost of collecting,
              processing, and analyzing one ground water sample. Call this ratio $\$R$.

       (3)    Based on the values of $\$R$ and $\hat{\phi}$, use Appendix Table A.4 to
              determine the approximate number, $n_p$, of samples to collect per year or
              seasonal period. The value $n_p$ may be modified based on
              site-specific considerations, as discussed in the text.

       (4)    The sampling frequency (i.e., the number of samples to be taken per
              year) is $n_p$ or 4, whichever is larger.  Denote this sampling
              frequency as n. Note that, under this rule, at least four samples per
              year per sampling well will be collected.
             The steps for determining sample frequency when testing a proportion are

provided in Box 9.2 and are discussed in Section 8.2 in more detail.
                                Box 9.2
       Steps for Determining Sample Frequency for Testing a Proportion

   (1)    Compute the estimates of $\sigma$ and $\phi$ which describe the measurements
          (not the coded values).  Denote these estimates by $\hat{\sigma}$ and $\hat{\phi}_m$.

          Let $\hat{\phi} = \hat{\phi}_m / 2.5$ ($\hat{\phi}$ is the estimated correlation between the coded
          observations; the constant 2.5 was determined from simulations).

   (2)    Estimate the ratio of the annual overhead cost of maintaining
          sampling operations at the site to the unit cost of collecting,
          processing, and analyzing one ground water sample. Call this ratio $\$R$.

   (3)    Based on the values of $\$R$ and $\hat{\phi}$, use Appendix Table A.4 to
          determine the approximate number, $n_p$, of samples to collect per year or
          seasonal period. The value $n_p$ may be modified based on
          site-specific considerations, as discussed in the text.

   (4)    The sampling frequency (i.e., the number of samples to be taken per
          year) is $n_p$ or 4, whichever is larger. Denote this sampling
          frequency as n. Note that, under this rule, at least four samples per
          year per sampling well will be collected.
                                Box 9.3
                 Example of Sample Frequency Calculations

   In Box 8.2, an example of determining the sample frequency is provided for
   a fixed sample size test. The determination of the number of samples to be
   taken per year is required for sequential sampling also. In that example, it
   was found that $n_p = 9$, so that 9 samples per year (practically speaking,
   once every 1.5 months) should be collected. This is all that is needed for
   sequential sampling. Samples will then be collected until a decision can be
   made. Note that in Box 8.2, a further calculation was done
   to determine the number of years for which data are to be collected for the
   fixed sample size approach. After this period of time (eight years in the
   example) a statistical test would be made to determine whether the ground
   water could be considered clean or not. On average, a sequential test will
   require a shorter time period to reach a decision than a fixed sample size
   test, but this is not guaranteed.
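
The final step of Boxes 9.1 and 9.2, and the correlation adjustment for coded data in Box 9.2, can be sketched as follows. The function names are hypothetical. The value n_p must still be read from Appendix Table A.4, which is not reproduced here, so it is passed in as an input; the serial correlation of 0.5 in the usage lines is an arbitrary illustration.

```python
def sampling_frequency(n_p, minimum_per_year=4):
    """Samples per year: the table value n_p or 4, whichever is larger."""
    return max(n_p, minimum_per_year)

def coded_serial_correlation(phi_m, divisor=2.5):
    """Approximate serial correlation of the 0/1 coded observations.

    phi_m is the estimated serial correlation of the measurements themselves;
    the divisor 2.5 is the simulation-based constant quoted in Box 9.2.
    """
    return phi_m / divisor

n = sampling_frequency(9)            # n_p = 9, as in the Box 9.3 example
n_small = sampling_frequency(3)      # a table value below 4 is raised to 4
phi_coded = coded_serial_correlation(0.5)
print(n, n_small, phi_coded)
```

For a test of proportions, the adjusted correlation rather than the raw one would be used when consulting Table A.4.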

 9.2          Sequential  Procedures for  Sample  Collection  and  Data
              Handling

              The samples are assumed to be collected using a systematic sample as
 discussed in Chapter 4.

              The  sample collection and  analysis procedures require the following limita-
 tions on the quantity and frequency of data collected:

              •      To provide the minimal amount of data required for the statistical
                     tests, at least three years of data must be collected before any
                     statistical test can be performed.
              •      It is strongly recommended that at least four samples be collected in
                     each period or year to capture any seasonal differences or variation
                     within a year or period.
              •      The statistical tests are performed only on data representing a
                     complete year of data collection. Thus, the first statistical test would
                     be performed after three full years of data collection, the second
                     after four full years of data collection, and so on.
              •      If the proportion of contaminated samples is required to be below a
                     specified value of Pa collect at least a number of samples N' such
                     that N1*?^  before doing the first sequential test
              Handling of outliers and measurements below the detection limit is
discussed in Section 2.3.7.
9.3           Assessing Attainment of the Mean Using Yearly Averages

              As noted in Chapter 8, the approach of using yearly averages substantially
reduces the effects of any serial correlation in the measurements. For the procedures
discussed in this section, the variance of the observed yearly averages is used to estimate
the variance of the overall average concentration.  Wells can be tested individually or a
group of wells can be tested jointly.  In the latter case, the data for the individual wells at
each point in time are used to produce a summary measure for the group as a whole. This
may  be an average, a maximum, or some other measure for all data values collected at a
particular point in time (see Section 2.3.5). These summary measures will be averaged
over  the yearly  period.

              Two calculation procedures for assessing attainment are described in this
section. Both procedures use the yearly average concentrations. The first is based on the
assumption that the yearly averages can be described by a symmetric normal distribution.
The second procedure uses the log transformed yearly averages and is based on the
assumption that the distribution of the yearly averages can be described by a (skewed)
lognormal  distribution. Because the second procedure performs well even when the data
have a symmetric distribution, the second method is recommended in most situations.
The first procedure is recommended only when missing data values are unevenly distributed
throughout the year and the data also show an apparent seasonal pattern.

              The calculations and procedures for the untransformed yearly averages are
described below and summarized in Box 9.4. This procedure is appropriate in most
situations but is not preferred if the data are highly skewed. The calculations can be
used (with some minor loss in efficiency) if some observations are missing. If the
proportion of missing observations varies considerably from season to season and there
are differences in the average measurements among seasons, consultation with a statistician
is recommended. If the data are highly skewed, the procedures described in Box 9.12,
which use the log transformed yearly averages, are recommended unless the data exhibit
both a seasonal pattern and missing observations.

              Use the formulas in Box 9.5 to calculate the yearly averages for the m
years of data collected so far. If there are missing observations within a year, average the
non-missing observations. Calculate the mean and variance of the yearly averages using
the equations in Box 9.6. The variance will have m-1 degrees of freedom, one
less than the number of years over which the data were collected.

              If there are no missing observations, the mean of the yearly averages, $\bar{x}_m$,
will be compared to the cleanup standard for assessing attainment. If, however, there are
missing observations, the mean of the yearly averages may provide a biased estimate of the
average concentration during the sample period. This will be true if the missing
observations occur mostly at times when the concentrations are generally higher or lower than the
mean concentration. To correct for this bias, the mean of the seasonal averages will be
compared to the cleanup standard when there are missing observations. Box 9.7 provides
equations for calculating the seasonal averages and $\bar{x}_{ms}$, the mean of the seasonal averages.
Using $\bar{x}$ to designate the mean value which is to be compared to the cleanup standard, set
$\bar{x} = \bar{x}_m$ if there are no missing observations; otherwise set $\bar{x} = \bar{x}_{ms}$.
                                    Box 9.4
               Steps for Assessing Attainment Using Yearly Averages

      (1)    Calculate the yearly averages for the m years of data collected so far
             (see Box 9.5).

      (2)    Calculate the mean, $\bar{x}_m$, and variance, $s_{\bar{x}}^2$, of the yearly averages
             (see Box 9.6).

      (3)    If there are no missing observations, set

                 $$\bar{x} = \bar{x}_m \qquad (9.1)$$

             Otherwise, if there are missing observations, calculate the seasonal
             averages and the mean of the seasonal averages, $\bar{x}_{ms}$ (Box 9.7),
             and set

                 $$\bar{x} = \bar{x}_{ms} \qquad (9.2)$$

             where $\bar{x}$ is the mean to be compared to the cleanup standard.

      (4)    Calculate t and $\delta$ for the likelihood ratio (Box 9.8).

      (5)    Calculate the likelihood ratio for the statistical test (Box 9.9).

      (6)    Decide whether the ground water attains the cleanup standards
             (Box 9.10).

      (7)    If more data are required, collect an additional year's samples and
             repeat the procedures in this Box.
                                 Box 9.5
                      Calculation of the Yearly Averages

   Let $x_{jk}$ be the measurements from an individual well or a combined measure
   from a group of wells obtained for testing whether the mean attains the
   cleanup standard; $x_{jk}$ represents the concentration for season j (the jth
   sample collection time out of n) in year k (where data have been collected for
   m years).

   The yearly average is the average of all of the observations taken within the
   year. If the results for one or more sample times within a year are missing,
   calculate the average of the non-missing observations. If there are
   $n_k$ non-missing observations in year k, the yearly average, $\bar{x}_k$, is:

       $$\bar{x}_k = \frac{1}{n_k} \sum_{j} x_{jk} \qquad (9.3)$$

   where the summation is over all non-missing observations within the year.
   Calculate the yearly average for all m years.
                                 Box 9.6
          Calculation of the Mean and Variance of the Yearly Averages

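   The mean of the m yearly averages, $\bar{x}_m$, is:

       $$\bar{x}_m = \frac{1}{m} \sum_{k=1}^{m} \bar{x}_k \qquad (9.4)$$

   where $\bar{x}_k$ is the yearly average for year k.

   The variance of the yearly averages, $s_{\bar{x}}^2$, can be calculated using either of the
   two equivalent formulas below:

       $$s_{\bar{x}}^2 = \frac{\sum_{k=1}^{m} (\bar{x}_k - \bar{x}_m)^2}{m-1} = \frac{\sum_{k=1}^{m} \bar{x}_k^2 - m\,\bar{x}_m^2}{m-1} \qquad (9.5)$$

   This variance estimate has m-1 degrees of freedom.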
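
   The calculations in Boxes 9.5 and 9.6 can be sketched in Python as follows. This is
   a minimal illustration only: the function names and the sample concentrations are
   hypothetical, and a missing observation is assumed to be represented by None.

```python
def yearly_averages(data):
    """Average the non-missing observations within each year (Box 9.5).

    data is a list of years; each year is a list of measurements in
    collection order, with None marking a missing observation.
    """
    averages = []
    for year in data:
        observed = [x for x in year if x is not None]
        averages.append(sum(observed) / len(observed))
    return averages


def mean_and_variance(yearly):
    """Mean and variance (m-1 degrees of freedom) of the yearly averages (Box 9.6)."""
    m = len(yearly)
    x_bar_m = sum(yearly) / m
    s2 = sum((xk - x_bar_m) ** 2 for xk in yearly) / (m - 1)
    return x_bar_m, s2


# Hypothetical data: three years, four collection times, one missing value.
data = [
    [5.0, 6.0, 7.0, 6.0],
    [4.0, None, 6.0, 5.0],
    [5.0, 5.0, 5.0, 5.0],
]
averages = yearly_averages(data)
x_bar_m, s2 = mean_and_variance(averages)
print(averages, x_bar_m, s2)
```

   At least three years of data are needed before the variance is used in a test, so
   the sketch assumes m is 3 or more.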
                                     Box 9.7
        Calculation of Seasonal Averages and the Mean of the seasonal Averages
       For the n sample collection times within the year, the jth seasonal average is
       the average of all the measurements taken at the jth collection time. Note
       that if there is a missing observation at one collection time, the measurement
       from the jth sample collection time may be different from the jth sequential
       measurement within the year.
       For all collection times j, from 1 to n, within each year, calculate the
       seasonal average, $\bar{x}_j$. The number of observations at the jth collection time
       is $m_j \le m$. If there are missing observations, sum over the $m_j$ non-missing
       observations:

           $$\bar{x}_j = \frac{1}{m_j} \sum_{k} x_{jk} \qquad (9.6)$$

       The mean of the n seasonal averages is:

           $$\bar{x}_{ms} = \frac{1}{n} \sum_{j=1}^{n} \bar{x}_j \qquad (9.7)$$

       The total number of observations is:

           $$N = \sum_{j=1}^{n} m_j \qquad (9.8)$$
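
       The seasonal averages of Box 9.7 can be sketched in Python as shown below. The
       function name and data are hypothetical. Each year is stored with one slot per
       collection time, and None marks a missing observation, so that the jth slot is
       always the jth collection time rather than the jth sequential measurement.

```python
def seasonal_averages(data):
    """Return the seasonal averages and the count of observations per season.

    data[k][j] is the measurement for collection time j in year k,
    or None if that observation is missing.
    """
    n = len(data[0])
    averages, counts = [], []
    for j in range(n):
        observed = [year[j] for year in data if year[j] is not None]
        counts.append(len(observed))
        averages.append(sum(observed) / len(observed))
    return averages, counts


# Hypothetical data: three years, four collection times, one missing value.
data = [
    [5.0, 6.0, 7.0, 6.0],
    [4.0, None, 6.0, 5.0],
    [5.0, 5.0, 5.0, 5.0],
]
x_j, m_j = seasonal_averages(data)
x_bar_ms = sum(x_j) / len(x_j)   # mean of the n seasonal averages
N = sum(m_j)                     # total number of observations
print(x_j, m_j, x_bar_ms, N)
```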
              Using the mean $\bar{x}$ and the standard deviation of the mean calculated from
the yearly averages, $s_{\bar{x}}$, calculate t and $\delta$ using equations (9.9) and (9.10) in Box 9.8.
These values are used in the calculation of the likelihood ratio. The standard deviation is
the square root of the variance calculated from equation (9.5). The t-statistic used here is
slightly different from that used in the standard t-test. Use of this definition of t makes
calculation of the likelihood ratio easier.

              Use equation (9.11) in Box 9.9 to calculate the likelihood ratio for the
sequential test. This equation provides a good approximation to the actual likelihood ratio,
which is difficult to calculate exactly. For references and more details about this
approximation, see Appendix F.
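                                    Box 9.8
         Calculation of t and $\delta$ When Using the Untransformed Yearly Averages

           $$t = \frac{\bar{x} - \dfrac{\mathrm{Cs} + \mu_1}{2}}{\sqrt{s_{\bar{x}}^2 / m}} \qquad (9.9)$$

           $$\delta = \frac{\mu_1 - \mathrm{Cs}}{\sqrt{s_{\bar{x}}^2 / m}} \qquad (9.10)$$

       where $\bar{x}$ is the mean level of contamination, and $s_{\bar{x}}$ is the square root of the
       variance of the yearly means. The degrees of freedom associated with $s_{\bar{x}}$ is
       m-1.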
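                                    Box 9.9
               Calculation of the Likelihood Ratio for the Sequential Test
       The likelihood ratio is:

           $$LR = \exp\left( \delta \, \frac{m-2}{m} \, t \, \sqrt{\frac{m}{m - 1 + t^2}} \right) \qquad (9.11)$$

       where m is the number of years of data collected so far and t and $\delta$ are
       calculated from the m years of data.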
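
       The following Python sketch evaluates t, $\delta$, and the likelihood ratio. It
       assumes that equations (9.9) through (9.11) take the form coded below, with the
       standard error of the mean equal to the square root of the variance of the yearly
       averages divided by m; under that assumption it reproduces the values reported in
       Box 9.11. The function names are hypothetical.

```python
import math


def t_and_delta(x_bar, s2, m, cs, mu1):
    """t and delta for the sequential test on untransformed yearly averages.

    x_bar: mean compared to the cleanup standard
    s2:    variance of the yearly averages
    m:     number of years of data collected so far
    cs:    cleanup standard
    mu1:   mean under the alternative hypothesis
    """
    se = math.sqrt(s2 / m)                  # standard error of the mean
    t = (x_bar - (cs + mu1) / 2.0) / se     # assumed form of equation (9.9)
    delta = (mu1 - cs) / se                 # assumed form of equation (9.10)
    return t, delta


def likelihood_ratio(t, delta, m):
    """Approximate likelihood ratio, assumed form of equation (9.11)."""
    return math.exp(delta * (m - 2) / m * t * math.sqrt(m / (m - 1 + t * t)))


# Values taken from the two stages of the example in Box 9.11.
t4, d4 = t_and_delta(5.914, 0.0706, 4, 6.0, 5.72)
lr4 = likelihood_ratio(t4, d4, 4)
t11, d11 = t_and_delta(5.77, 0.1024, 11, 6.0, 5.72)
lr11 = likelihood_ratio(t11, d11, 11)
print(round(lr4, 3), round(lr11, 2))
```

       With unrounded intermediate values the second likelihood ratio is about 9.28;
       rounding t and $\delta$ to three decimals first gives the 9.29 shown in the example.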
              Finally, the likelihood ratio, $\alpha$, and $\beta$ are used to decide if the average
concentration is less than the cleanup standard.  If the average is less than the cleanup
standard and if the concentrations are not increasing over time (see Section 9.7), conclude
that the tested ground water attains the cleanup standard. If the ground water from all wells
or groups of wells attains the cleanup standard then conclude that the ground water at the
site attains the cleanup standard. If the average concentration is not less than the cleanup
standard or if the concentrations are increasing over time, conclude that the  ground water in
the well does not attain the cleanup standard. The steps in deciding attainment of the
cleanup standard are shown in Box 9.10.
                                Box 9.10
     Deciding if the Tested Ground Water Attains the Cleanup Standard
   Calculate:

       $$A = \frac{\beta}{1 - \alpha}, \qquad B = \frac{1 - \beta}{\alpha} \qquad (9.12)$$

   If LR ≤ A, conclude that the ground water in the wells does not attain the
   cleanup standard.

   If LR > B, conclude that the average ground water concentration in the well
   (or group of wells) is less than the cleanup standard. Perform a trend test
   using the regression techniques described in Chapter 6 to determine if there
   is a statistically significant increasing trend in the yearly averages over the
   sampling period (also see Section 9.7).

   If there is not a statistically significant increasing trend, conclude that the
   ground water attains the cleanup standard (and possibly initiate a follow-up
   monitoring program). If a significant trend does exist, conclude that the
   ground water in the wells does not attain the cleanup standard and resume
   sampling or reconsider treatment effectiveness.

   If A < LR ≤ B, then collect an additional year's worth of data before
   performing the hypothesis test again.
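
   The decision rule in this box can be sketched in Python as follows. The sketch
   assumes the thresholds of equation (9.12) take the usual sequential-test form
   A = beta / (1 - alpha) and B = (1 - beta) / alpha, which gives A = .111 and B = 9.0
   for alpha = beta = .1 as in Box 9.11. The function name and the returned labels are
   hypothetical.

```python
def sequential_decision(lr, alpha, beta):
    """Classify a likelihood ratio against the thresholds A and B."""
    a = beta / (1.0 - alpha)      # lower threshold (assumed form of 9.12)
    b = (1.0 - beta) / alpha      # upper threshold (assumed form of 9.12)
    if lr <= a:
        return "does not attain"
    if lr > b:
        # The mean is judged below the standard; a trend test is still required.
        return "mean below standard, perform trend test"
    return "collect another year of data"


first = sequential_decision(0.618, 0.1, 0.1)
later = sequential_decision(9.29, 0.1, 0.1)
low = sequential_decision(0.05, 0.1, 0.1)
print(first, later, low, sep="\n")
```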
                                Box 9.11
           Example Attainment Decision Based on a Sequential Test

   In this example we will use the arsenic measurements appearing in Table
   8.1. Suppose we wish to compare the cleanup standard (Cs = 6) with a
   targeted cleanup average ($\mu_1$) of 5.72 ($\mu_1$ is the value for which the false
   negative rate $\beta$ is to be controlled). Box 8.21 indicates the four yearly
   means $\bar{x}_k$ and the overall average $\bar{x}_m$ = 5.914. Using equation (9.5), the
   value of $s_{\bar{x}}^2$ = .0706 for m = 4. Thus,

       $$t = \frac{5.914 - \dfrac{6 + 5.72}{2}}{\sqrt{\dfrac{.0706}{4}}} = .406 \quad \text{and} \quad \delta = \frac{5.72 - 6}{\sqrt{\dfrac{.0706}{4}}} = -2.108$$

       $$LR = \exp\left[ -2.108 \cdot \frac{2}{4} \cdot (.406) \sqrt{\frac{4}{4 - 1 + (.406)^2}} \right] = 0.618$$

   With $\alpha$ = .1 and $\beta$ = .1, then A = .111, B = 9.0. Since 0.618 is neither
   less than A nor greater than B, we have insufficient data to conclude that the
   cleanup standard has been either attained or not attained. Thus, more data
   must be gathered.

   Suppose data continue to be collected for seven more years without a
   decision being reached. At that time, the overall average $\bar{x}_m$ = 5.77 and $s_{\bar{x}}^2$
   = .1024 for m = 11. Thus,

       $$t = \frac{5.77 - \dfrac{6 + 5.72}{2}}{\sqrt{\dfrac{.1024}{11}}} = -.933 \quad \text{and} \quad \delta = \frac{5.72 - 6}{\sqrt{\dfrac{.1024}{11}}} = -2.902$$

       $$LR = \exp\left[ -2.902 \cdot \frac{9}{11} \cdot (-.933) \sqrt{\frac{11}{11 - 1 + (-.933)^2}} \right] = 9.29$$

   Since LR = 9.29 > 9.0, we conclude that the mean ground water concentrations
   are less than the cleanup standard.

             When the data are noticeably skewed, the calculation procedures using the
log transformed yearly averages (Box 9.12) are recommended over those in Box 9.4.
Because the procedures in Box 9.12 also perform well when the data have a symmetric
distribution, these procedures are generally recommended in all cases where there are no
missing data. There is no easy adjustment for missing data when using the log transformed
yearly averages. Therefore, if the number of observations per season (month, etc.) is not
the same for all seasons and if there is any seasonal pattern in the data, use of the
procedures in Box 9.4 is recommended.

             The calculation procedure when using the log transformed yearly averages
is described below and summarized in Box 9.12. The calculations are slightly more
difficult than when using the untransformed yearly averages. After calculating the yearly
averages, take the natural log to transform the data. The transformed averages are
then used in the subsequent analysis. The upper confidence interval for the mean
concentration is based on the mean and variance of the log transformed yearly averages. The
formulas are based on the assumption that the yearly averages have a log normal
distribution.
                                    Box 9.12
           Steps for Assessing Attainment Using the Log Transformed Yearly Averages

       (1)    Calculate the yearly averages (see Box 9.5).
       (2)    Calculate the natural log of the yearly averages (see Box 9.13).
       (3)    Calculate the mean, $\bar{y}_m$, and variance, $s_y^2$, of the log transformed
              yearly averages (see Box 9.14).
       (4)    Calculate t and $\delta$ for the likelihood ratio (Box 9.15).
       (5)    Calculate the likelihood ratio (Box 9.9).
       (6)    Decide whether the ground water attains the cleanup standards
              (Box 9.10).
       (7)    If more data are required, collect an additional year's samples and
              repeat the procedures in this Box.

             Use the formulas in Box 9.5 for calculating the yearly averages.  If there are
missing observations within a year, average the non-missing observations. Calculate the
log transformed yearly averages using equation (9.13) in Box 9.13. The natural  log
transformation is available on many calculators and computers, usually designated as
"LN", "ln", or "loge." Although the equations could be changed to use the base 10
logarithms, use only the base e logarithms when using the equations in Boxes 9.13 through
9.15. Calculate the mean and variance of the log transformed yearly averages using the
equations in Box 9.14. The variance will have degrees of freedom equal to one less than
the number of years over which the data were collected.
                                   Box 9.13
                Calculation of the Natural Logs of the Yearly Averages
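      The natural log of the yearly average is:

          $$y_k = \ln(\bar{x}_k) \qquad (9.13)$$
                                   Box 9.14
         Calculation of the Mean and Variance of the Natural Logs of the Yearly
                                   Averages

      The average of the m log transformed yearly averages, $\bar{y}_m$, is:

          $$\bar{y}_m = \frac{1}{m} \sum_{k=1}^{m} y_k \qquad (9.14)$$

      The variance of the log transformed yearly averages, $s_y^2$, is:

          $$s_y^2 = \frac{\sum_{k=1}^{m} (y_k - \bar{y}_m)^2}{m-1} = \frac{\sum_{k=1}^{m} y_k^2 - m\,\bar{y}_m^2}{m-1} \qquad (9.15)$$

       This variance estimate has m-1 degrees of freedom.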


              Using the mean $\bar{y}_m$ and the variance of the mean calculated from the log
transformed yearly averages, $s_y^2$, calculate t and $\delta$ using equations (9.16) and (9.17) in Box
9.15. These values are used in the calculation of the likelihood ratio.
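                                    Box 9.15
        Calculation of t and $\delta$ When Using the Log Transformed Yearly Averages

           $$t = \frac{\bar{y}_m + \dfrac{s_y^2}{2} - \dfrac{\ln(\mathrm{Cs}) + \ln(\mu_1)}{2}}{\sqrt{s_y^2 / m}} \qquad (9.16)$$

           $$\delta = \frac{\ln(\mu_1) - \ln(\mathrm{Cs})}{\sqrt{s_y^2 / m}} \qquad (9.17)$$

       where the degrees of freedom (Df) associated with $s_y^2$ is m-1.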
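
       A minimal Python sketch of the calculations in Boxes 9.13 through 9.15 follows.
       It assumes that equations (9.16) and (9.17) use the standard error
       sqrt(s_y^2 / m) in the denominator, by analogy with Box 9.8. The function name
       and the sample yearly averages are hypothetical.

```python
import math


def log_t_and_delta(yearly_averages, cs, mu1):
    """t, delta, and degrees of freedom from log transformed yearly averages."""
    m = len(yearly_averages)
    y = [math.log(x) for x in yearly_averages]            # Box 9.13
    y_bar = sum(y) / m                                    # Box 9.14, mean
    s2_y = sum((yk - y_bar) ** 2 for yk in y) / (m - 1)   # Box 9.14, variance
    se = math.sqrt(s2_y / m)                              # assumed standard error
    # Assumed form of equation (9.16): the s2_y / 2 term adjusts the log-scale
    # mean toward the mean of a lognormal distribution.
    t = (y_bar + s2_y / 2.0 - (math.log(cs) + math.log(mu1)) / 2.0) / se
    # Assumed form of equation (9.17).
    delta = (math.log(mu1) - math.log(cs)) / se
    return t, delta, m - 1


# Hypothetical yearly averages for four years of data.
t, delta, df = log_t_and_delta([5.5, 6.2, 5.8, 6.1], cs=6.0, mu1=5.72)
print(t, delta, df)
```

       The resulting t and $\delta$ are then used in the likelihood ratio of Box 9.9 with m
       equal to the number of years.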
             Use equation (9.11) in Box 9.9 to calculate the likelihood ratio for the
sequential test. Finally, the likelihood ratio, $\alpha$, and $\beta$ are used to decide if the average
concentration is less than the cleanup standard. If the average is less than the cleanup
standard and if the concentrations are not increasing over time, conclude that the tested
ground water attains the cleanup standard. If the ground water from all wells or groups of
wells attains the cleanup standard, then conclude that the ground water at the site attains the
cleanup standard. If the average concentration is not less than the cleanup standard or if the
concentrations are increasing over time, conclude that the ground water in the well does not
attain the cleanup standard. The steps in deciding attainment of the cleanup standard are
shown in Box 9.10.

9.4          Assessing Attainment of the Mean After Adjusting for Seasonal
             Variation

             This section provides an alternative procedure for testing if the mean
concentration is less than the cleanup standard. It is expected to provide more accurate
results when there are many samples per year, the data are serially correlated, and the
distribution of the data is not skewed. Because this procedure is sensitive to skewness in
the data, it is recommended only if the distribution of the measurement errors is reasonably
symmetric.

             After the data have been collected using the guidelines indicated in
Chapter 4, wells can be tested individually or a group of wells can be tested jointly. In
the latter case, the data for the individual wells at each point in time are used to produce a
summary measure for the group as a whole. This summary measure may be an average,
maximum, or some other measure (see Chapter 2). These summary measures will be
averaged over the entire sampling period. The steps involved in incorporating seasonal
adjustments and serial correlations into the calculations associated with the statistical tests
are discussed below.

             The calculations and procedures for assessing the mean after adjusting for
seasonal variation are described below and summarized in Box 9.16. An example is
provided in Box 9.22. The calculations can be used (with some minor loss in efficiency) if
some observations are missing. With a large proportion of missing observations in any
season, consultation with a statistician is recommended. If the data are obviously skewed,
the procedures described in Box 9.12, which use the log transformed yearly averages, are
recommended.

             Use the formulas in Box 9.7 for calculating the seasonal averages and the
mean of the seasonal averages. If there are missing observations within a season, average
the non-missing observations.  Calculate the residuals, the  deviations of the measurements
from the respective seasonal means, using equation (9.18) in Box 9.17. Box 9.18 shows
how to calculate the variance of the residuals. The variance will have degrees of freedom
equal to the number  of measurements  less the number of seasons.
                                    Box 9.16
       Steps for Assessing Attainment Using the Mean After Adjusting for Seasonal
                                    Variation

       (1)    Calculate the seasonal averages and the mean of the seasonal
              averages, $\bar{x}_{ms}$ (Box 9.7).

       (2)    Calculate the residuals, the differences between the observations and
              the corresponding seasonal averages (Box 9.17).

       (3)    Calculate the variance, $s^2$, of the residuals (see Box 9.18).

       (4)    Calculate the lag 1 serial correlation of the residuals using equation
              (9.20) in Box 9.19. Denote the computed serial correlation by
              $\phi_{obs}$.

       (5)    Calculate the t statistic based on the mean, $\bar{x}_{ms}$, the standard
              deviation s, and $\phi_{obs}$ (Box 9.20).

       (6)    Calculate the likelihood ratio (Box 9.21).

       (7)    Decide whether the ground water attains the cleanup standards
              (Box 9.10).
                                   Box 9.17
                            Calculation of the Residuals

       From each sample observation, subtract the corresponding seasonal mean.
       That is, compute $e_{jk}$, the deviation from the seasonal mean:

           $$e_{jk} = x_{jk} - \bar{x}_j \qquad (9.18)$$
             Using the mean of the seasonal averages and the variance of the residuals,
$s^2$, calculate t and $\delta$ using equations (9.21) and (9.22) in Box 9.20. These values are used
in the calculation of the likelihood ratio.
                                Box 9.18
                 Calculation of the Variance of the Residuals

   Calculate the variance of the residuals $e_{jk}$, reflecting adjustments for
   possible seasonal differences, using the equation in Box 8.12:

       $$s^2 = \frac{1}{N - n} \sum_{j=1}^{n} \sum_{k} e_{jk}^2 \qquad (9.19)$$

   Alternatively, the ANOVA approach described in Appendix D can be used
   to compute the required variance.
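
   The residual and variance calculations of Boxes 9.17 and 9.18 can be sketched in
   Python as shown below. The sketch assumes the variance divides the sum of squared
   residuals by N - n, the number of measurements less the number of seasons, as
   stated in the text of this section. The function name and data are hypothetical.

```python
def residuals_and_variance(data):
    """Residuals from the seasonal means and their variance.

    data[k][j] is the measurement for collection time j in year k,
    or None if that observation is missing.
    """
    n = len(data[0])
    season_means = []
    for j in range(n):
        observed = [year[j] for year in data if year[j] is not None]
        season_means.append(sum(observed) / len(observed))
    # Residual = observation minus the mean of its own season.
    residuals = [
        [None if year[j] is None else year[j] - season_means[j] for j in range(n)]
        for year in data
    ]
    flat = [e for year in residuals for e in year if e is not None]
    total = len(flat)                                  # N, total observations
    s2 = sum(e * e for e in flat) / (total - n)        # N - n degrees of freedom
    return residuals, s2


# Hypothetical data: two years, two collection times per year.
residuals, s2 = residuals_and_variance([[1.0, 4.0], [3.0, 8.0]])
print(residuals, s2)
```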
                                Box 9.19
    Calculating the Serial Correlation from the Residuals After Removing
                            Seasonal Averages

   The sample estimate of the serial correlation of the residuals is:

       $$\phi_{obs} = \frac{\sum_{i=2}^{N} e_i \, e_{i-1}}{\sum_{i=1}^{N} e_i^2} \qquad (8.18)$$

   where $e_i$, i = 1, 2, ..., N, are the residuals after removing seasonal averages,
   in the time order in which the samples were collected.
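
   A short Python sketch of this estimate follows. It assumes the lag 1 form in which
   the sum of products of successive residuals is divided by the sum of squared
   residuals. The function name and the residual values are hypothetical.

```python
def lag1_serial_correlation(residuals):
    """Lag 1 serial correlation of residuals listed in time order."""
    numerator = sum(
        residuals[i] * residuals[i - 1] for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residuals in the order the samples were collected.
phi_obs = lag1_serial_correlation([-1.0, -2.0, 1.0, 2.0])
print(phi_obs)
```

   The residuals must be supplied in collection order; sorting them by season would
   change the estimate.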
                                   Box 9.20
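          Calculation of t and $\delta$ When Using the Mean Corrected for Seasonal
                                   Variation

           $$t = \frac{\bar{x}_{ms} - \dfrac{\mathrm{Cs} + \mu_1}{2}}{\sqrt{\dfrac{s^2}{N} \cdot \dfrac{1 + \phi_{obs}}{1 - \phi_{obs}}}} \qquad (9.20)$$

           $$\delta = \frac{\mu_1 - \mathrm{Cs}}{\sqrt{\dfrac{s^2}{N} \cdot \dfrac{1 + \phi_{obs}}{1 - \phi_{obs}}}} \qquad (9.21)$$

      where $\bar{x}_{ms}$ is the mean level of contamination computed from equation
      (9.7), and $s^2$ is the variance of the observations computed from equation
      (9.19). The degrees of freedom, Df, associated with these estimates is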
             Use the formula in Box 9.21 to calculate the likelihood ratio for the
sequential test. Although this formula for calculating the likelihood ratio looks different from
the one used with the yearly averages (see Box 9.9), the two formulas are equivalent.
                                   Box 9.21
       Calculation of the Likelihood Ratio for the Sequential Test When Adjusting
                              for Serial Correlation

      The likelihood ratio is:

                                                                  (9.22)
      where Df is the degrees of freedom for $s^2$.
                                    Box 9.22
         Example Calculation of Sequential Test Statistics after Adjustments for
                       Seasonal Effects and Serial Correlation

       In Box 8.21, a test was performed for a fixed sample size after adjusting for
       seasonal effects and serial correlation. We will use the same data (from
       Table 8.1) to conduct the corresponding sequential test after four years of
       data collection. From Box 8.21 we have $\bar{x}$ = 5.914, $s^2$ = .163, $\phi_{obs}$ = .37,
       Cs = 6.0, m = 4, and N = 16. We will stipulate that $\alpha$ = .1, $\beta$ = .1, and $\mu_1$
       = 5.72. Thus,

           $$t = \frac{5.914 - \dfrac{6 + 5.72}{2}}{\sqrt{\dfrac{.0706}{16} \cdot \dfrac{1 + .37}{1 - .37}}} = 0.551$$

           $$\delta = \frac{5.72 - 6}{\sqrt{\dfrac{.0706}{16} \cdot \dfrac{1 + .37}{1 - .37}}} = -2.858$$

       and Df = 4. The likelihood ratio calculated from Box 9.21 is LR = 0.746.

       With $\alpha$ = .1 and $\beta$ = .1, then A = .111, B = 9.0. Since 0.746 is neither
       less than A nor greater than B, we have insufficient data to conclude that the
       cleanup standard has been either attained or not attained. Thus, more data
       must be gathered.
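
       The t and $\delta$ values in this example can be checked with the Python sketch
       below. It assumes that the standard error in Box 9.20 has the form
       sqrt((s^2 / N) (1 + phi) / (1 - phi)), and it uses the variance value .0706 that
       appears in the arithmetic of the example. The function name is hypothetical.

```python
import math


def adjusted_t_and_delta(x_bar_ms, s2, n_obs, phi_obs, cs, mu1):
    """t and delta with the standard error inflated for lag 1 serial correlation."""
    # Assumed form of the standard error used in Box 9.20.
    se = math.sqrt((s2 / n_obs) * (1.0 + phi_obs) / (1.0 - phi_obs))
    t = (x_bar_ms - (cs + mu1) / 2.0) / se
    delta = (mu1 - cs) / se
    return t, delta


# Values from the example: mean 5.914, variance .0706, N = 16, phi = .37.
t, delta = adjusted_t_and_delta(5.914, 0.0706, 16, 0.37, 6.0, 5.72)
print(round(t, 3), round(delta, 3))
```

       A positive serial correlation enlarges the standard error, so both t and $\delta$ are
       smaller in magnitude than they would be if the observations were treated as
       independent.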
9.5          Sequential Tests for Proportions


              In general, sequential procedures for testing proportions require that more samples be
collected before starting the first test of hypothesis than when testing the mean. If the
parameter to be tested is the proportion of contaminated samples from either one well or an
array of wells, the sample collection and analysis procedures are the same as those outlined
above for testing the mean, with the following changes:

             •      To apply this test, each ground water sample measurement is
                    coded either "1" (the actual measurement was equal to or above the
                    cleanup standard Cs) or "0" (below Cs). The statistical analysis is
                    based on the resulting coded variable of 0's and 1's.
             •      Only the analysis procedure which uses yearly averages is
                    appropriate for the calculations (Box 9.4). Do not use either of the
                    calculation procedures in Boxes 9.12 or 9.16.

             •      A total of at least p- samples should be collected before using the
                    statistical procedures to determine, on a yearly basis, whether
                    sampling can be stopped and a decision can be made.
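
              The coding step for testing proportions can be sketched in Python as
follows. The function names and measurements are hypothetical, and None is assumed to
mark a missing observation. The yearly average of the coded values is the proportion of
samples at or above the cleanup standard in that year, which is the quantity carried
into the procedure of Box 9.4.

```python
def code_samples(data, cs):
    """Code each measurement as 1 (at or above cs) or 0 (below cs)."""
    return [
        [None if x is None else int(x >= cs) for x in year]
        for year in data
    ]


def yearly_proportions(coded):
    """Yearly average of the coded values, ignoring missing observations."""
    proportions = []
    for year in coded:
        observed = [c for c in year if c is not None]
        proportions.append(sum(observed) / len(observed))
    return proportions


# Hypothetical measurements for two years, with a cleanup standard of 6.0.
data = [[5.0, 6.0, 7.0, 4.0], [3.0, None, 6.5, 5.9]]
coded = code_samples(data, cs=6.0)
proportions = yearly_proportions(coded)
print(coded, proportions)
```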
9.6           A Further Note on Sequential Testing

              Sequential testing, as discussed in this chapter, has a
small chance of continuing for a very long time if the data gathered provide insufficient
evidence for making a clear-cut determination. A stopping rule such as the following can
be implemented to handle such cases: determine the sample size necessary for a fixed
sample test for the specified values of Cs, $\mu_1$, $\alpha$, and $\beta$ (data collected during the sampling
for assessing attainment can be used to estimate the variance so the sample size can be
computed). Call this sample size $m_{fixed}$. If the number of years of sample collection
exceeds twice $m_{fixed}$, determine the likelihood ratio. If the likelihood ratio is less than 1.0,
conclude that the ground water does not attain the cleanup standard. If the likelihood ratio
is greater than 1.0, conclude that the mean concentration is less than the cleanup standard
and test if there is a significant positive slope in the data.
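
              The stopping rule can be sketched in Python as shown below. The function
name and returned labels are hypothetical. The text does not say what to do when the
likelihood ratio equals exactly 1.0, so the sketch reports that case as undetermined
rather than choosing a side.

```python
def stopping_rule(years_collected, m_fixed, lr):
    """Force a decision once sampling exceeds twice the fixed sample size.

    Returns None while the ordinary sequential test should continue.
    """
    if years_collected <= 2 * m_fixed:
        return None
    if lr < 1.0:
        return "does not attain"
    if lr > 1.0:
        return "mean below standard, test for positive slope"
    return "undetermined"   # lr == 1.0 is not addressed by the rule


still_running = stopping_rule(years_collected=10, m_fixed=8, lr=0.9)
forced_fail = stopping_rule(years_collected=17, m_fixed=8, lr=0.9)
forced_pass = stopping_rule(years_collected=17, m_fixed=8, lr=1.4)
print(still_running, forced_fail, forced_pass)
```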
1 A likelihood ratio of one occurs when the sample mean is at the midpoint between the cleanup
  standard and the mean for the alternate hypothesis.

9.7           Checking for Trends in Contaminant Levels After Attaining the
              Cleanup Standard

              Once a fixed sample size statistical test indicates that the cleanup standard
for the site has been met, there remains one final concern. The model we have used
assumes that ground water at the site has reached a steady state and that there is no reason
to believe that contaminant levels will rise above the cleanup standard in the future. We
need to check this assumption. Regression models, as discussed in Chapter 6, can be used
 to do so. By establishing a simple regression model with the contaminant measure as the
 dependent variable and time as the independent variable,  a test  of significance can be made
 as to  whether or not the estimated slope of the resulting linear model is positive (see Section
 6.1.3). Scatter plots of the data will prove useful in assessing the model.  When using the
 yearly averages,  the regression can be performed without adjusting for serial correlation.

               To minimize the chance of incorrectly concluding  that the concentrations are
 increasing over time, we recommend that the alpha level for  testing the  slope (and selecting
the t statistic in Box 6.11) be set at a small value, such as 0.01  (one percent). If, on the
 basis of the test,  there  is not significant evidence that the slope is  positive, then the evidence
 is  consistent with the preliminary conclusion that the ground water in  the well(s) attains the
 cleanup  standard. If the slope is  significantly greater than zero, then the concern that
 contaminant levels may later exceed the cleanup standard  still  exists and the assumption of a
 steady state is called into question. In this case, further consideration must be given to the
 reasons for this apparent  increase  and, perhaps, to  additional remediation efforts.
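
               The slope test can be sketched in Python with an ordinary least squares
 fit of the yearly averages against time. This is an illustration under stated
 assumptions, not the procedure of Chapter 6 itself: the function names and data are
 hypothetical, and the one-sided critical value must be taken from a t table for n - 2
 degrees of freedom at the chosen alpha level (4.541 is the standard table value for
 alpha = 0.01 with 3 degrees of freedom).

```python
import math


def slope_t_statistic(times, values):
    """Least squares slope and its t statistic (n - 2 degrees of freedom)."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    sxx = sum((ti - t_bar) ** 2 for ti in times)
    sxy = sum((ti - t_bar) * (vi - v_bar) for ti, vi in zip(times, values))
    slope = sxy / sxx
    intercept = v_bar - slope * t_bar
    # Residual sum of squares around the fitted line.
    sse = sum((vi - (intercept + slope * ti)) ** 2 for ti, vi in zip(times, values))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / se_slope


def has_increasing_trend(t_stat, critical_t):
    """One-sided test: the slope is significantly positive if t exceeds the critical value."""
    return t_stat > critical_t


# Hypothetical yearly averages for five years.
slope, t_stat = slope_t_statistic([1, 2, 3, 4, 5], [5.9, 6.1, 5.8, 6.0, 5.7])
increasing = has_increasing_trend(t_stat, critical_t=4.541)
print(slope, t_stat, increasing)
```

 Setting the alpha level small, as recommended above, raises the critical value and
 so reduces the chance of declaring an increasing trend when none exists.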
 9.8           Summary

               This chapter presented the procedures for assessing attainment of the
 cleanup standard for ground water measurements using a sequential statistical test. For
 most statistical tests or procedures, the analysis is performed after the entire sample has
 been collected and the laboratory results are complete. However, in sequential testing, the
 samples  are analyzed as they are collected. A  statistical  analysis of the data collected so far
 is used to determine whether another year's worth of samples should be collected or
 whether  the  analysis should terminate.

               We presented three alternate procedures for assessing attainment using
 sequential tests. Two procedures use the yearly average concentrations: one assumes the
 yearly average has a normal distribution, and the other assumes a log normal distribution. The
 third procedure uses the individual observations and makes a correction for seasonal
 patterns and serial correlations. In general, the method which assumes the yearly averages
 have a log normal distribution is recommended.

               These testing procedures can be applied to samples from either individual
 wells or wells tested as a group. These procedures are used after the ground water has
achieved steady state. If the ground water at the site is judged to attain the cleanup
standards because the concentrations are not increasing and the long-term average is
significantly less than the cleanup standard, follow-up monitoring is recommended to check
that the steady state assumption holds.
-------
                             BIBLIOGRAPHY
Abraham, B. and Ledolter, J., 1983, Statistical Methods for Forecasting. New
          York: John Wiley and Sons, Inc.

Albers, W., 1978, "One Sample Rank Tests Under Autoregressive Dependence,"
          Annals of Statistics, Vol. 6, No. 4: 836-845.

Albers, W., 1978, "Testing the Mean of a Normal Population Under Dependence,"
          Annals of Statistics, Vol. 6, No. 6: 1337-1344.

Armitage, P., 1947, "Some Sequential Tests of Student's Hypothesis," Journal of the
          Royal Statistical Society, Series B, Vol. 9: 250-263.

Armitage, P., 1957, "Restricted Sequential Procedures," Biometrika, Vol. 44: 9-26.

Barcelona, M.,  Gibb, J.,  and Miller, R., 1983, A Guide to the  Selection of
          Materials For Monitoring Well Construction and Ground-Water Sampling.
          Illinois State Water Survey, Champaign, Illinois, USEPA-RSKERL, EPA


Barcelona, M., Gibb, J., Helfrich, J., and Garske, E., 1985, Practical Guide
          for Ground-Water Sampling, Illinois State Water Survey, Champaign, Illinois,
          USEPA-RSKERL, EPA 600/2-85/104.

Barnett, V., and Lewis, T., 1984, Outliers in Statistical Data. New York: John
          Wiley and Sons, Inc.

Bartels, R., 1982, "The Rank Version of von Neumann's Ratio Test for Randomness,"
          Journal of the American Statistical Association, Vol. 77, No. 377: 40-46.

Bauer,  P., and Hackl, P., 1978, "The Use of MOSUMS for Quality Control,"
          Technometrics, Vol.  20, No. 4: 431-436.

Bell,  C.,  and  Smith, E.,  1986,  "Inference for Non-Negative Autoregressive
          Schemes," Communications in Statistics: Theory and Methods, Vol. 18, No.
          8: 2267-2293.

Berthouex, P.,  Hunter, W.,  and  Pallisen, L.,  1978,  "Monitoring  Sewage
          Treatment Plants: Some Quality Control Aspects," Journal of Quality
          Technology, Vol. 10, No.  4:  139-149.

Bisgaard, S., and Hunter, W. G., 1986, Report No. 7, Studies in Quality
          Improvement: Designing Environmental Regulations. Center for Quality and
          Productivity Improvement, University of Wisconsin-Madison, (February

-------
                     APPENDIX F: DERIVATIONS AND EQUATIONS


    Simulations

 Preliminary simulations using lognormally distributed data and a factorial design with 100
 simulations for each set of parameters were used to determine which factors affected the power of
 the sequential tests. The factors in the simulations were: scale factor; proportion of the random
 variance which is correlated versus independent; lag 1 correlation; presence of a seasonal pattern;
 proportion of the observations which were censored; number of samples per year; and $\mu$. Analysis
 of the factorial design clearly indicated that the skewness and scale factor were most important in
 determining the power of the test. The serial correlation and censoring were also important. The
 presence of a cyclical component (which resulted in significant changes in the variance throughout
 the year) did not significantly affect the power of the test.

 As a result of these preliminary simulations, further simulations were run using scale factors
 ranging from 1.6 to 4.8, $\alpha = \beta = 0.05$, $\mu = \mu_0$ or $\mu_1$, and the following distributions and sampling
 designs:

       (1)    Normal distribution with independent errors and 4 samples per year;
       (2)    Lognormal distribution with coefficient of variation of 0.5, independent errors and
              4 samples per year. This is the basic distribution. The following simulations all are
              based on changes to the basic distribution;
       (3)    The basic distribution with 12 observations per year;
       (4)    The basic distribution but more skewed, with a coefficient of variation of 1.5;
       (5)    The basic distribution with censoring of 30% of the data (censored values were set
              equal to the detection limit);
       (6)    The basic distribution with correlated errors, where the serial correlation between log
              transformed monthly observations is 0.8; and
       (7)    Data which are both skewed and correlated, with coefficient of variation of 1.5 and
              serial correlation between log transformed monthly observations of 0.8. For this set
              of simulations, the random error was the sum of two components, one random,
              representing random measurement error, and the second correlated, reflecting
              correlations in the groundwater concentrations. The correlated error made up
              75% of the total error variance.
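
The sampling designs listed above can be illustrated with a small generator. This is a hedged sketch, not the simulation code used for the study: the function names, the AR(1) form of the correlated error, and the way the correlated and independent components are mixed on the log scale are all assumptions made for illustration. Under this mixing, the lag 1 correlation of the combined log-scale error is the AR(1) coefficient multiplied by the correlated share.

```python
import math
import random


def lognormal_log_params(mean, cv):
    """Log-scale mean and standard deviation of a lognormal distribution
    with the given arithmetic mean and coefficient of variation."""
    sigma2 = math.log(1.0 + cv ** 2)
    mu_log = math.log(mean) - sigma2 / 2.0
    return mu_log, math.sqrt(sigma2)


def simulate_series(n_obs, mean, cv, lag1=0.0, correlated_share=0.0,
                    detection_limit=None, rng=None):
    """Simulate lognormal observations with optional serial correlation and censoring.

    lag1             -- AR(1) coefficient of the correlated error component (assumed form)
    correlated_share -- fraction of the log-scale error variance that is correlated
    detection_limit  -- values below this limit are set equal to it
    """
    rng = rng or random.Random()
    mu_log, sigma = lognormal_log_params(mean, cv)
    ar = rng.gauss(0.0, 1.0)  # start the AR(1) component in its stationary distribution
    values = []
    for _ in range(n_obs):
        ar = lag1 * ar + math.sqrt(1.0 - lag1 ** 2) * rng.gauss(0.0, 1.0)
        independent = rng.gauss(0.0, 1.0)
        # Unit-variance mixture of the correlated and independent components.
        z = (math.sqrt(correlated_share) * ar
             + math.sqrt(1.0 - correlated_share) * independent)
        x = math.exp(mu_log + sigma * z)
        if detection_limit is not None and x < detection_limit:
            x = detection_limit
        values.append(x)
    return values


# Design (7) as an example: skewed and correlated monthly data for six years.
monthly = simulate_series(72, mean=1.0, cv=1.5, lag1=0.8, correlated_share=0.75,
                          rng=random.Random(0))
yearly_means = [sum(monthly[i:i + 12]) / 12 for i in range(0, 72, 12)]
print(yearly_means)
```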

For each test and each set of simulations with the same distributional assumptions, Figure 8 shows
the range in the false positive rate across simulations. Figure 9  shows similar information for the
false negative rate.

As can be seen from Figure 8, the false positive rates for the tests are close to the nominal level of
0.05 when the data have a normal distribution, as desired.  For skewed and correlated data, the
false positive rate generally exceeds the nominal  level.

For skewed and correlated data, the false positive rate for the standard  sequential t-test exceeds the
nominal value for all simulations. The modified test and the modified test with
adjustments for seasonal patterns and serial correlations had similar false positive rates. Both of
these tests are sensitive to correlated and skewed data. The false positive rate for the modified test
adjusted for  skewness is lower than for the other three tests.  Only  for correlated data  does this test
have a false positive rate consistently greater than the nominal level. Censoring resulted in a
relative decrease in the false positive rate.  Of the tests based on the modified sequential t-test, the
test with adjustments for skewness had the lowest average sample sizes and lowest false positive
rates.

Based on both the average sample sizes and false positive rates from the simulations, the modified
test adjusted for  skewness is  preferred over the  other sequential tests. To the extent that the false
positive rate exceeds the nominal level for skewed and correlated data, the power can be improved
by using two-year averages instead of one-year averages. Results for the skewed and correlated
data using two year averages are also shown in Figure 8.

As shown in Figure 9, the false negative rate for all tests was generally similar to or less than the
nominal level. The false negative rate for the standard sequential t-test exceeded that for the
procedures based on the modified test. For all tests, the false negative rate increased greatly in the
presence of censoring. Of the procedures based on the modified test, the modified test adjusted for
skewness had a false negative rate closest to the nominal level under the simulated conditions.
Although the average sample sizes for the tests were similar, the test adjusted for skewness had
the highest average sample sizes. At the alternate hypothesis, no one calculation procedure is clearly
preferred; however, the modified test has false negative rates lower than the nominal value for all
but censored observations and is the simplest to calculate.

The  sample sizes for the skewed data were similar to those for the normally distributed data for
which the sequential test required fewer samples, on the average, than the equivalent fixed sample
size  test. Therefore, it is likely that the sequential tests would also have lower average sample size
than  for a fixed  sample size  test where the sample size calculations accounted  for the skewed and/or
correlated nature of the data.


6.     Conclusions and  Discussion

For assessing attainment of Superfund cleanup standards based on the mean  contaminant levels
using sequential  tests,  the conclusions from this  simulation study  are:

•      Given the situations found at Superfund sites, a sequential test can reduce the number of
       samples compared to that for an equivalent fixed sample size test;

•      The standard sequential t-test can have false negative rates greater than the nominal value;

•      An adjustment factor can be used to improve the power performance of the sequential t-test
       without greatly increasing the sample sizes. Different criteria will result in the selection of
       different adjustment factors; however, all of the adjustment factors considered improved
       the performance of the test. In this paper, the adjustment factor (n-2)/n was evaluated;

•      Use of a simple approximation to the likelihood ratio performs well compared to that based
       on the non-central t distribution;

•      Sampling rules which terminate the sequential test if the number of samples exceeds twice
       the sample size for the equivalent fixed sample size test are likely to have little effect on the
       power of the sequential t-test;

•      A modified sequential t-test with an adjustment for skewness has the lowest false positive
       rate among the tests considered and has acceptable false negative rates and sample sizes
       relative to the other tests; and

•      All test procedures were sensitive to censored data.
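
To make the ingredients named in these conclusions concrete (a likelihood ratio, the adjustment factor (n-2)/n, and a cap on the number of samples), the sketch below shows one way such a decision rule could be assembled. It is an illustration under stated assumptions, not the formula evaluated in this paper: the log likelihood ratio uses the normal, known-variance form with the sample variance plugged in, the (n-2)/n factor is applied to that log ratio, and the stopping bounds are Wald's approximate bounds. The function name and the returned labels are hypothetical.

```python
import math
from statistics import mean, variance


def sequential_decision(yearly_means, cleanup_standard, alt_mean,
                        alpha=0.05, beta=0.05, max_n=None):
    """Illustrative sequential decision from the yearly means collected so far.

    Null hypothesis: true mean equals the cleanup standard (not attained).
    Alternate hypothesis: true mean equals alt_mean, below the standard.
    """
    n = len(yearly_means)
    if n < 3:
        return "continue"  # (n - 2) / n is not positive below n = 3
    s2 = variance(yearly_means)
    if s2 == 0.0:
        return "continue"  # no variance estimate yet; keep sampling

    # The log likelihood ratio is zero when the sample mean is at the midpoint
    # between the cleanup standard and the alternate-hypothesis mean.
    midpoint = (cleanup_standard + alt_mean) / 2.0
    log_lr = n * (alt_mean - cleanup_standard) * (mean(yearly_means) - midpoint) / s2
    log_lr *= (n - 2) / n  # assumed placement of the adjustment factor

    upper = math.log((1.0 - beta) / alpha)   # Wald bound for accepting the alternate
    lower = math.log(beta / (1.0 - alpha))   # Wald bound for accepting the null
    if log_lr >= upper:
        return "attains"
    if log_lr <= lower:
        return "does not attain"
    if max_n is not None and n >= max_n:
        return "stopped at cap"
    return "continue"


# Hypothetical yearly averages with a cleanup standard of 1.0 and alternate mean of 0.75.
print(sequential_decision([0.50, 0.52, 0.48, 0.51, 0.49], 1.0, 0.75))
```

The cap `max_n` corresponds to a sampling rule that ends the test once the number of samples reaches a chosen limit, such as twice the sample size of the equivalent fixed sample size test.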


The  procedures used here set censored values equal to the detection limit. Other possible
approaches place censored values at  half the detection limit or at zero. Further  work is required to
determine how the sequential tests perform using different rules for handling values below the
detection limit. The decision rule which places censored values at the detection level was chosen to
protect human  health and the environment when  assessing  attainment  at Superfund sites.
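
The three substitution rules mentioned above can be compared on a small data set. The sketch below is illustrative only: censored observations are represented by `None`, and the function name, rule labels, and example values are hypothetical.

```python
def substitute_censored(values, detection_limit, rule="dl"):
    """Replace censored observations (given as None) using a substitution rule.

    rule -- "dl" (detection limit), "half" (half the detection limit), or "zero"
    """
    factors = {"dl": 1.0, "half": 0.5, "zero": 0.0}
    if rule not in factors:
        raise ValueError("rule must be 'dl', 'half', or 'zero'")
    fill = factors[rule] * detection_limit
    return [fill if v is None else v for v in values]


# Hypothetical quarterly measurements with two values below a detection limit of 0.2.
observed = [None, 0.4, None, 0.9]
for rule in ("dl", "half", "zero"):
    filled = substitute_censored(observed, 0.2, rule)
    print(rule, sum(filled) / len(filled))
```

Setting censored values equal to the detection limit gives the largest sample mean of the three rules, which is why it is the more protective choice when the mean is compared with a cleanup standard.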

The problem of testing multiple wells and contaminants is particularly troublesome when the
decision rule requires that  all wells and all contaminants must attain the relevant cleanup standards.
Even if all concentrations are below the cleanup standard, the probability of a false negative on any
one of several statistical tests increases  the probability of falsely concluding that additional  cleanup
is required. The false negative rates for the modified sequential tests considered in this paper are
generally lower than the nominal value for all but censored data. Therefore, use of these tests will
generally not contribute, beyond that planned for in the sample and analysis plan, to incorrectly
concluding that the ground water attains the cleanup standard unless the data are censored.
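
The effect of requiring every well and contaminant to pass can be illustrated with a short calculation. The sketch assumes the individual tests are statistically independent and share the same false negative rate, which is a simplification; the function name is hypothetical.

```python
def prob_any_false_negative(beta, n_tests):
    """Probability that at least one of n_tests independent tests gives a
    false negative when each has false negative rate beta."""
    return 1.0 - (1.0 - beta) ** n_tests


# With a nominal rate of 0.05 per test, the chance of at least one false
# negative grows quickly with the number of well and contaminant combinations.
for k in (1, 5, 10, 20):
    print(k, round(prob_any_false_negative(0.05, k), 3))
```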

All of the power curves  are based on the assumption that the standard deviation will  remain
constant as the mean changes. Another possible  assumption is that the coefficient of  variation will
remain constant as the mean changes. While the assumption about how the standard deviation
changes as the mean changes does not affect the conclusions presented, the actual shape of the
power curves will depend on the assumptions made.

Finally, these modified sequential t-tests can also be used when the alternate hypothesis is greater
than the null hypothesis. The results above can be applied if the false negative and false positive
labels are reversed. For compliance monitoring,  i.e., to answer the question: do the concentrations
exceed an action level?, all of the modified sequential tests perform well if the data are not
censored.  With censored data, alternate rules for handling the observations below the detection
level should  be considered.


Bibliography

Ghosh, B. K., 1970, Sequential Tests of Statistical Hypotheses, Reading, MA:
          Addison-Wesley.

Hall, W. J., 1962, "Some Sequential Analogs of Stein's Two Stage Test," Biometrika, Vol. 49:
          367-378.

Hayre, L. S., 1983, "An Alternative to the Sequential T-Test," Sankhya: The Indian Journal of
          Statistics, 45, Series A, Pt. 3, 288-300.

Liebetrau, A. M., 1979, "Water Quality Sampling: Some Statistical Considerations," Water
          Resources Research, Vol. 16, No. 6: 1717-1725.

Loftis, J., Montgomery, R., Harris, J., Nettles, D., Porter, P., Ward, R., and
          Sanders, T., 1986, "Monitoring Strategies for Ground Water Quality Management,"
          Prepared for the United States Geological Service by Colorado State University, Fort
          Collins, Colorado.

Rushton, S., 1950, "On a Sequential t-Test," Biometrika, Vol. 37: 326-333.

Wald, A.,  1947, Sequential Analysis. New York: Dover Publications.

-------
Figure 1  Example of Simulated Monthly Ground Water Data
[Figure omitted. Series: Monthly Measurements; Cleanup Standard. Horizontal axis: Years (0 to 6).]

-------
           Figure 2  Power Curve and Average Sample Size for a
                               Sequential t-Test
[Figure omitted. Series: Power, sequential test; Nominal power; Average sample size, sequential test; Sample size, fixed test. Horizontal axis: Mean of Simulated Measurements (0.3 to 1.2).]

-------
          Figure 3  False Decision Rate and Sample Size versus Scale
                               Factor (Centered test)
[Figure omitted. Series: Power, sequential test; Nominal false decision rate; Average sample size, sequential test; Sample size, fixed test. Horizontal axis: Scale Factor (0 to 3.5).]

-------
          Figure 4   Distribution of Sample Sizes for the Centered and
                    Modified Sequential t-test, by Test Result
                             True mean = alternate hypothesis
[Figure omitted.]
-------
            Figure 6   Power Curve and Average Sample Size for
                       the Modified Sequential t-Test
[Figure omitted. Series: Power, sequential test; Nominal power; Average sample size, sequential test; Sample size, fixed test. Horizontal axis: Mean of Simulated Measurements (0.30 to 1.20).]

-------
     Figure 7  Sample Size Distribution for Modified Sequential t
                         Test versus Mean
[Figure omitted. Series: 5%, 10%, 25%, 50%, 75%, 90%, 95%; Average sample size, sequential test; Sample size, fixed test. Horizontal axis: Mean of Simulated Measurements (0.25 to 1.25).]

-------
                              BIBLIOGRAPHY
Bishop, T., 1985, Statistical View of Detection and Determination Limits in Chemical
         Analyses prepared for the Committee on Applications of Statistical Techniques
         in Chemical Problems. Columbus, Ohio: Battelle Columbus Laboratories,
         [January 5, 1982].

Box, G. E. P., and Jenkins, G. M., 1970, Time Series Analysis: Forecasting and
         Control. San Francisco: Holden-Day.

Box, G. E. P., Hunter, W. G., and Hunter, J. S., 1987, Statistics For
         Experimenters. New York: John Wiley and Sons, Inc.

Bross, Irwin D., 1985, "Why Proof of Safety is Much More Difficult Than Proof of
         Hazard," Biometrics, Vol.  41: 785-793.

Brown, G. H., and Fisher, N. I., 1972, "Subsampling a Mixture of Sampled
         Material," Technometrics, Vol. 14, No. 3: 663-668.

Brown, M. B., and Wolfe, R. A., 1983b, "Estimation of the Variance of Percentile
         Estimates," Computational Statistics and Data Analysis, Vol. 1: 167-174.

Brownlee, K. A., 1965, Statistical Theory and Methodology in Science and
         Engineering, 2nd Edition. New York: John Wiley and Sons, Inc.

Cantor, L. W., and Knox, R. C.,  1986,  Groundwater Pollution Control. Chelsea,
         Michigan:  Lewis Publishers.

Cantor, L. W., Knox, R. C., and Fairchild, D. M.,  1987, Groundwater Quality
         Protection. Chelsea, Michigan: Lewis Publishers.

Casey, D., Nemetz, P. N., and Uyeno, D., 1985, "Efficient Search Procedures for
         Extreme Pollutant Values," Environmental Monitoring and Assessment, Vol. 5:
         165-176.

Clayton, C. A., Hines, J. W., Hartwell, T. D.,  and Burrows, P. M.,  1986,
         Demonstration of a  Technique for Estimating Detection Limits with Specified
         Assurance Probabilities. Washington, D.C.: EPA, [March 1986].

Cochran, W., 1977,  Sampling Techniques.  New York: John Wiley and Sons, Inc.

Cohen, A. C., 1961, "Tables for Maximum Likelihood Estimates: Singly Truncated and
         Singly Censored Samples," Technometrics, Vol. 3, No. 4: 535-541.

Conover, W. J., 1980, Practical Nonparametric Statistics. New York: John Wiley and
         Sons, Inc.

D'Agostino, R. B.,  1970,   "A Simple Portable Test of Normality: Geary's Test
         Revisited," Psychological  Bulletin, Vol. 74, No. 2: 138-140.

Draper, N., and Smith, H.,  1966, Applied Regression Analysis. New York:  John
         Wiley and Sons, Inc.

DuMouchel, W.H., Govindarajulu,  Z., and Rothman,  E.,  1973,  "Note on
         Estimating the Variance  of the Sample Mean in Stratified  Sampling," Canadian
         Journal of Statistics, Vol. 1, No. 2: 267-274.

Duncan, A., 1974, Quality Control and Industrial Statistics, Fourth Edition. Homewood,
         IL: Richard Irwin, Inc.

Elder, R. S., Thompson, W. O., and Myers, R. H., 1980, "Properties of
         Composite Sampling Procedures," Technometrics, Vol. 22, No. 2: 179-186.

Environ Corporation, 1985a, Principles of Risk Assessment: A Nontechnical Review.
         EPA Workshop on Risk Assessment. Easton, Md., March 17-18, 1985.

Environ Corporation; Jellinek,  Schwartz,  Connolly, and Freshman;  and
         Temple, Barker, and Sloan,   Inc., 1985c, Case Study on  Risk
         Assessment: Part  I. EPA Workshop on Risk Assessment. Easton, Md., March
         17-18, 1985.

Environ Corporation; Jellinek,  Schwartz,  Connolly, and Freshman;  and
         Temple, Barker, and Sloan, Inc., 1985b, Additional Data on the  Risk
         Assessment Case: Part II. EPA Workshop on Risk  Management. Easton,
         Md., March 17-18, 1985.

Fairbanks,  K., and Madsen, R.,  1982, "P Values for  Tests Using  a Repeated
         Significance Test Design," Biometrika, Vol. 69, No. 1: 69-74.

Farrell, R.,  1980,  Methods for  Classifying Changes  in Environmental Conditions,
         Technical Report VRF-EPA7.  4-FR80-1, Vector  Research Inc., Ann Arbor,
         Michigan.

Filliben, J. J., 1975, "Probability Plot Correlation Coefficient Test for Normality,"
         Technometrics, Vol. 17, No. 1: 111-117.

Ford, P., Turina, P., and GCA Corporation, 1985, Characterization of Hazardous
         Waste Sites-A Methods Manual, Volume I-Site Investigations. Las Vegas,
         Nevada: EPA Environmental Monitoring Systems Laboratory, [April 1985].

Fuller, F., and Tsokos, C., 1971, "Time Series Analysis of Water Pollution Data,"
         Biometrics, Vol. 27: 1017-1034.

Garner, F.C., 1985, Comprehensive Scheme for Auditing  Contract Laboratory Data
          [interim report]. Las Vegas, Nevada: Lockheed-EMSCO.

Gastwirth, J.L., and Rubin, H., 1971, "Effect of Dependence on the Level of Some
         One-Sample Tests," Journal of the American Statistical Association, Vol. 66:
         816-820.

Geraghty  and Miller,  Inc.,  1984,  "Annual Report, August  1984, Rollins
         Environmental Services, Baton Rouge, Louisiana," Baton Rouge, Louisiana.

Ghosh, B. K., 1970, Sequential Tests of Statistical Hypotheses, Reading, MA: Addison-Wesley.

Gilbert, R.  O., 1987, Statistical  Methods for Environmental Pollution Monitoring.
          New York: Van Nostrand  Reinhold.

Gilbert, R. O., and Kinnison, R. R., 1981, "Statistical Methods for Estimating the
          Mean and Variance from Radionuclide Data Sets Containing Negative,
          Unreported or Less-Than Values," Health Physics, Vol. 40: 377-390.

Gilliom, R. J., and Helsel, D. R., 1986, "Estimation of Distributional Parameters
          for Censored Trace Level Water Quality   Data, 1. Estimation Techniques,"
          Water Resources Research, Vol. 22, No. 2: 135-146.

Gleit, A., 1985, "Estimation for Small Normal Data Sets with Detection Limits,"
          Environmental Science and Technology, Vol. 19, No. 12: 1201-1206.

Goldstein,  B.,  1985, Elements  of Risk Assessment.  EPA  Risk Assessment
          Conference. Easton, Md., March 18, 1985.

Goodman, I., 1987, "Graphical and Statistical Methods to Assess the Effect of Landfills
          on Groundwater Quality," Land Resources Program, University of Wisconsin-
          Madison.

Grant, E. L., and Leavenworth, R. S., 1980, Statistical Quality Control, Fifth
          Edition. New York: McGraw-Hill.

Groeneveld, L., and Duval, R., 1985, "Statistical Procedures and Considerations for
          Environmental Management (SPACEMAN)," Prepared for the Florida
          Department of Environmental Regulation, Tallahassee, Florida.

Grubbs, F. E., 1969, "Procedures for Detecting Outlying Observations in Samples,"
          Technometrics, Vol. 11, No. 1: 1-21.

Guttman, I., 1970, Statistical Tolerance Regions: Classical and Bayesian.  (Being
          Number Twenty-Six of Griffin's Statistical Monographs and Courses edited by
          Alan Stuart.) Darien, Conn.: Hafner Publishing.

Hall, W. J., 1962, "Some Sequential Analogs of Stein's Two Stage Test," Biometrika,
          Vol. 49: 367-378.

Hansen, M., Hurwitz, W., and Madow, W., 1953, Sample Survey Methods and
          Theory, Volume 1. New York: John Wiley and Sons, Inc.

Hayre, L. S., 1983, "An Alternative to the Sequential T-Test," Sankhya: The Indian
          Journal of Statistics, 45, Series A, Pt. 3, 288-300.

Hazardous Materials Control Research Institute, 1985, 6th National Conference
          on Management of Uncontrolled Hazardous Waste Sites. Washington, D.C.:
          HMCRI, [November 4-6, 1985].

Hazardous Materials Control Research Institute, 1986, 7th National Conference
          on Management of Uncontrolled Hazardous Waste Sites. Washington, D.C.:
          HMCRI, [December 1-3, 1986].

Hazardous Materials Control Research Institute, 1988, 9th National Conference
          on Management of Uncontrolled Hazardous Waste Sites. Washington, D.C.:
          HMCRI, [November 28-30, 1988].

Helsel, D.R., and Cohn, T.A., 1988, "Estimation of Descriptive Statistics for
          Multiply Censored Water Quality Data," Water Resources Research, Vol. 24,
          No. 12: 1997-2004.

Helsel, D. R., and Gilliom, R., 1986, "Estimation of Distributional Parameters for
          Censored Trace Level Water Quality Data, 2. Verification and Applications,"
          Water Resources Research, Vol. 22, No. 2: 147-155.

Hem, J.D., 1989, Study and Interpretation of the Chemical Characteristics of Natural
          Water, Third Edition, U.S. Geological Survey Water-Supply Paper 2254.

Hipel,  K.,  Lennox,  W.,  Unny, T., and McLeod, A., 1975,  "Intervention
          Analysis in Water Resources,"  Water Resources Research, Vol. 11,  No. 3:
          567-575.

Hipel, K., McLeod, A., and Lennox, W., 1977, "Advance in Box-Jenkins
          Modeling, 1. Model Construction," Water Resources Research, Vol. 12, No.
          6: 855-861.

Hirsch, R. M., and Slack, J. R., 1984, "A Nonparametric Trend Test for Seasonal
          Data with Serial Dependence," Water Resources Research, Vol. 20, No. 6:
          727-732.

Hirsch, R. M., Slack, J. R., and Smith, R. A., 1982, "Techniques for Trend
          Analysis for Monthly Water Quality Data," Water Resources Research, Vol. 18,
          No. 1: 107-121.

Hoaglin,  D.C., Mosteller, F., and Tukey, J.W., 1983, Understanding Robust
          and Exploratory Data Analysis.  New York: John Wiley and Sons, Inc.

Johnson,  N.  L., and  Kotz, S.,   1970,  Distributions in Statistics: Continuous
          Univariate Distributions - 2, Houghton Mifflin Co.

Johnson, Norman L. and Leone, F. C., 1977, Statistics and Experimental Design
          in Engineering and the Physical Sciences. Vol. I, Second Edition. John
          Wiley and Sons, Inc.

Joiner, B.L., and Rosenblatt, J. R., 1975, "Some Properties of the Range in
          Samples from Tukey's Symmetric Lambda Distributions," Journal of the
          American Statistical Association, Vol. 66: 394.

Kedem, B.,  1980, "Estimation of Parameters in Stationary Autoregressive Processes
          After Hard Limiting," Journal of the American Statistical Association, Vol. 75,
          No. 369: 146-153.

Land, C. E.,  1971, "Confidence Intervals for Linear Functions of the Normal Mean and
          Variance," Annals of Mathematical Statistics, Vol. 42, No. 4: 1187-1205.

 Land, C. E., 1975, Tables of Confidence for Linear Functions of the Normal Mean and
          Variance. Selected Tables in Mathematical Statistics, Vol. III, pp. 385-419.
          Providence, R.I.: American Mathematical Society.

 Lehmann, E. L., 1975, Nonparametrics: Statistical Methods Based on Ranks. San
          Francisco:  Holden-Day.

 Lettenmaier, D., 1976, "Detection of Trends in Water Quality Data From Records With
          Dependent Observations," Water Resources Research, Vol. 12, No. 5: 1037-
          1046.

 Liebetrau, A.  M., 1979, "Water Quality Sampling: Some Statistical Considerations,"
          Water Resources Research, Vol. 16, No.  6: 1717-1725.

 Liggett, W., 1985, "Statistical Designs for Studying Sources of Contamination," Quality
          Assurance for Environmental Measurements, ASTM STP 867, J. K. Taylor
          and T. W. Stanley Eds. American Society for Testing Materials, Philadelphia,
          22-40.

 Locks, M. O., Alexander,  M.  J., and Byars,  B. J., 1963, "New Tables of the
          Noncentral t-Distribution," Report ARL63-19, Wright-Patterson Air Force
          Base.

 Loftis, J. and  Ward, R., 1980, "Sampling Frequency for Regulatory Water Quality
          Monitoring," Water Resources Bulletin, Vol. 16, No. 3: 501-507.

 Loftis, J. and  Ward, R., 1980, "Water Quality Monitoring-Some Practical  Sampling
          Frequency Considerations,"  Environmental Management, Vol.  4, No. 6: 521-
          526.

Loftis, J., Montgomery, R., Harris, J., Nettles, D., Porter, P., Ward, R.,
          and  Sanders, T.,  1986, "Monitoring Strategies for Ground Water Quality
          Management," Prepared for the United States Geological  Service by Colorado
          State University, Fort Collins, Colorado.

 Madow, W.  C., and Madow,  L.  H.,  1944,  "On  the Theory of Systematic
          Sampling," Annals of Mathematical Statistics, Vol. 15: 1-24.

 Mage, D. T., 1982, "Objective Graphical Method for Testing Normal  Distributional
          Assumptions Using Probability Plots," American Statistician, Vol. 36, No. 2:
          116-120.

 McLeod, A., Hipel, K., and Camacho, F., 1983, "Trend Assessment of Water
          Quality Time Series,"  Water Resources Bulletin, Vol.  19, No. 4: 537-547.

 McLeod, A., Hipel, K., and Lennox, W., 1977, "Advances in Box-Jenkins Modeling,
          1.  Applications,"  Water Resources  Research, Vol.  13, No. 3: 577-586.

 Mee, Robert W., 1984, "Tolerance Limits and Bounds for Proportions Based on Data
          Subject to Measurement Error," Journal of Quality Technology, Vol. 16, No. 2:
          74-80.

Mee, Robert W., Owen, D. B., and Shyu, Jyh-Cherng, 1986, "Confidence
          Bounds for Misclassification Probabilities Based on Data Subject to
          Measurement Error," Journal of Quality Technology, Vol. 18, No. 1: 29-40.

Mendenhall, W., and Ott, L., 1980, Understanding Statistics. N. Scituate, Mass.:
          Duxbury Press.

Millard, S., Yearsley, J., and Lettenmaier, D., 1985, "Space-Time Correlation
          and Its Effects on Methods for Detecting Aquatic Ecological Change," Canadian
          Journal of Fisheries and Aquatic Sciences, Vol. 42: 1391-1400.

Montgomery, R., and Loftis, J., 1987, "Applicability of the t-test for Detecting
          Trends in Water Quality Variables," Water Resources Bulletin, Vol. 23, No. 4:
          653-662.

Montgomery, R., and Reckhow, H., 1984, "Techniques for Detecting Trends in
          Lake Water Quality," Water Resources Bulletin, Vol. 20, No. 1: 43-52.

Natrella, M., 1963, Experimental Statistics. Washington, D.C.: U.S. Department of
          Commerce, National Bureau of Standards.

Nelson, J., and Ward,  R.,  1981,  "Statistical  Considerations and  Sampling
          Techniques for Ground-Water Quality Monitoring," Ground Water,  Vol. 19,
          No. 6: 617-625.

Neter, J., Wasserman, W., and  Kutner, M., 1985,  Applied Linear Statistical
          Models. Homewood Illinois: Irwin.

Neter, J., Wasserman, W., and Whitmore, G., 1982, Applied Statistics.  Boston:
          Allyn-Bacon.

Noether, C., 1956, "Two Sequential Tests Against Trend," Journal of the American
          Statistical Association, September  1956: 440-450.

Ness, R., 1985, "Groundwater Quality," Journal-Water Pollution Control Federation,
          Vol. 57, No. 6: 642-649.

Nyer, Evan K., 1985, Groundwater Treatment Technology. New York:  Van Nostrand
          Reinhold Co.

Oak Ridge National Laboratory, 1984, "Results of the Groundwater Monitoring
          Performed at the Former St. Louis  Airport Storage Site for the Period January
          1981 Through January 1983," ORNL/TM-8879, Oak Ridge, Tennessee.

Owen, D.B., 1963, Factors for One-Sided Tolerance Limits and for Variables Sampling
          Plans. Albuquerque, N.M.: Sandia  Corporation, [March 1963].

Patel, J.K., 1986, "Tolerance Limits-A Review," Communications in Statistics: Theory
          and Methods, Vol. 15, No. 9: 2719-2762.

Pederson, G.L., and Smith, M.M., 1989, U.S. Geological Survey Second National
          Symposium of Water Quality; Abstracts of the Technical Sessions. Orlando,
          Florida, November 1989.

Pettyjohn, W., 1976, "Monitoring Cyclic Fluctuations in Ground-Water Quality,"
          Ground Water, Vol. 14, No. 6: 472-480.

Pucci, A., and Murashige, J.,  1987,  "Applications of Universal Kriging to an
          Aquifer Study in New Jersey,"  Ground Water, Vol. 25, No. 6: 672-678.

Rendu, J. M., 1979, "Normal and Lognormal Estimation," Mathematical Geology,
          Vol. 11, No. 4: 407-422.

Resnikoff, G. J.,  and  Lieberman,  G.  J.,  1957, Tables  of the Non-central
          t-distribution. Stanford:  Stanford University Press.

Rockwell International, 1979, "Hanford Groundwater Modeling-Statistical Methods
          for Evaluating Uncertainty and Assessing Sampling Effectiveness," Rockwell
          Hanford Operations, Energy Systems Group, Richland, Washington, RHO-C-18.

Rohde, C. A., 1976, "Composite Sampling," Biometrics,  Vol. 32: 278-282.

Rohde, C. A., 1979, "Batch, Bulk, and  Composite Sampling," Sampling Biological
          Populations, pp. 365-367. Edited by R.M. Cormack. Fairland, Md.:
          International Cooperative Pub. House.

Rushton, S., 1950, "On a Sequential t-Test," Biometrika, Vol. 37: 326-333.

Rushton, S., 1952, "On a Two-Sided Sequential t-Test," Biometrika, Vol. 39: 302-308.

Sanders,  T. and Adrian, D.,  1978,  "Sampling Frequency for River Quality
          Monitoring," Water Resources Research, Vol. 14, No. 4: 569-576.

SAS Institute, 1985, SAS User's Guide: Statistics. Cary, North Carolina.

Schaeffer, D. and  Kerster, H.,  1988, "Quality Control Approach to NPDES
          Compliance Determination," Journal-Water Pollution Control Federation, Vol.
          60: 1436-1438.

Scheaffer, R.  L.,  Mendenhall, W.,  and Ott, L.,  1979,  Elementary  Survey
          Sampling, Second Edition. Boston:  Duxbury Press.

Schmid, C. F., 1983, Statistical Graphics: Design Principles and Practices. New York:
          John Wiley and Sons, Inc.

Schmidt, K., 1977,  "Water Quality Variations for  Pumping Wells," Ground  Water,
          Vol. 15, No. 2: 130-137.

Schwartz, J. E., 1985, "Neglected Problem of Measurement Error in Categorical Data,"
          Sociological Methods and Research, Vol. 13, No. 4: 435-466.

Schweitzer, G. E., and Santolucito, J. A., editors, 1984, Environmental
          Sampling for Hazardous Wastes. ACS Symposium Series 267. Washington,
          D.C.: American Chemical  Society.

Shapiro, S.S., and Wilk, M.B., 1965, "Analysis of Variance Test for Normality
          (Complete Samples)," Biometrika, Vol.  52:  591-611.

Sharpe, K., 1970, "Robustness of Normal Tolerance Intervals," Biometrika, Vol. 57,
          No. 1: 71-78.

Siegmund, D., 1985, Sequential Analysis: Tests and Confidence Intervals. New York:
          Springer-Verlag.

Sirjaev, A.N., 1973, Statistical Sequential Analysis. Providence, R.I.: American
          Mathematical Society.

Size, W. B., editor,  1987, Use and Abuse of Statistical Methods in the Earth Sciences.
          New York: Oxford University Press.

Snedecor, G. W., and Cochran, W. G., 1980, Statistical Methods. Seventh Edition.
          Ames, Iowa: The Iowa State Press.

Sokal, R. R., and  Rohlf, F. J., 1981, Biometry: The Principles and Practice of
          Statistics in Biological Research. Second Edition. New York: W. H. Freeman.

Stoline, M., and Cook, R., 1986, "A Study of Statistical Aspects of the Love Canal
          Environmental Monitoring Study," American Statistician, Vol. 40, No. 2: 172-


Switzer, P., 1983, When Will a Pollutant Standard Be Exceeded: Model Prediction and
          Uncertainty. Technical Report No.  67. New Canaan, Ct: SIMS, [January


Temple, Barker, and Sloan, Inc. 1986, Case Study on Risk Management. EPA
          Workshop on Risk Management. Easton, Md., April 13-14, 1986.

Tornqvist, L., 1963, "Theory of Replicated Systematic Cluster Sampling With Random
          Start," Review of the International Statistical Institute, Vol. 31, No. 1: 11-23.

Tukey, J. W., 1977, Exploratory Data Analysis. Reading, Mass.:  Addison-Wesley.

U.S. Congress. Office  of  Technical Assessment,  1985,  Superfund Strategy.
          Washington, D.C.:  G.P.O.

U.S. Department of Energy, 1985a, How Clean is Clean: A Review of Superfund
          Cleanups.  Washington D. C. (GJ/TMC-08-ED.2).

U.S. Department of Energy,  1985b, Procedures for Collections and Preservation of
          Groundwater and Surface Water Samples  and  For the Installation of Monitoring
          Wells. Washington D. C. (CONF-87  1075-21).

U.S. Environmental Protection Agency, 1982, The Handbook for Sampling and
         Sample Preservation of Water and Wastewater. Washington, D. C., September
         1982  (EPA-600/4-82-029).

U.S. Environmental Protection Agency, 1984, Sampling Procedures for Ground
         Water Quality Investigations. Washington D.C, May 1984 (EPA-600/D-84-
         137).

U.S. Environmental Protection Agency, 1985a, Data Quality Objectives for the
         RI/FS  Process:  Accuracy Testing  Definitions, Appendix  F [draft].
         Washington, D.C., [November 5, 1985].

U.S. Environmental Protection Agency, 1985b, EPA Guide for Minimizing the
         Adverse Environmental Effects of Cleanup of Uncontrolled Hazardous Wastes
         Sites. Washington, D.C., June 1985 (EPA/600/8-85/008).

U.S. Environmental Protection Agency, 1986a, Guidance Document for Cleanup
         of Surface Impoundment Sites.  Washington D. C., June  1986.

U.S. Environmental  Protection  Agency,  1986b,  Resource Conservation and
         Recovery Act (RCRA) Ground-Water Monitoring Technical Enforcement
         Guidance Document. Washington D.C.,  September 1986 (OSWER-9950.1).

U.S. Environmental Protection Agency,  1986c,  Superfund Public Health
         Evaluation Manual. Washington D. C.: EPA [October 1986].

U.S. Environmental  Protection  Agency,  1987a, Data Quality Objectives  For
         Remedial Response Activities, Development Process. Washington D. C.,
         March 1987 (EPA 540/G-87/003).

U.S. Environmental  Protection  Agency,  1987b, Data Quality Objectives  For
         Remedial Response Activities, Example Scenario: RI/FS Activities at a Site with
         Contaminated Soils and Ground Water.  Washington D. C., March 1987 (EPA
         540/G-87/004).

U.S.  Environmental  Protection Agency,  1987c,  EPA Journal,  The New
         Superfund: Protecting People and the Environment. Washington D. C., Vol.
          13, No. 1.

U.S. Environmental  Protection  Agency,  1987d,  Surface Impoundment Clean
         Closure Guidance Manual [draft]. Washington D. C., March 1987.

U.S. Environmental Protection Agency, 1987e, Using Models in Ground-Water
         Protection Programs. Washington D.C.,  January 1987 (EPA/600/8-87/003).

U.S. Environmental Protection Agency, 1988, Guidance on Remedial Actions for
         Contaminated Ground Water at Superfund Sites [Interim Final]. Washington
         D.C.

U.S. Environmental Protection Agency, 1989a, Methods for Evaluating the
          Attainment of Cleanup  Standards,  Volume 1:  Soils and Solid Media. Office of
          Policy, Planning, and Evaluation, Washington, D.C., February 1989  (EPA
          230/02-89-042).

U.S. Environmental Protection Agency, 1989b, Statistical Analysis of Ground-
          Water Monitoring Data at RCRA  Facilities.  Office of Solid Waste,
          Washington, D.C., April 1989

van Belle, G., and Hughes, J. P.,  1984, "Nonparametric Tests for Trend in Water
          Quality," Water Resources Research, Vol. 20, No.  1: 127-136.

Wald, A., 1947, Sequential Analysis. New York: Dover Publications.

Ward,  C., Loftis, J.,  Nielsen,  K., and Anderson,  R.,  1979, "Statistical
          Evaluation of Sampling Frequencies in Monitoring Networks," Journal-Water
          Pollution  Control Federation, Vol. 51,  No. 9:  2292-2300.

Wesolowsky, G.O.,  1976, Multiple Regression and Analysis of Variance. New York:
      John  Wiley.

Wetherill, G. B., 1975, Sequential Methods in Statistics. New York: Halsted Press.

Wilson,  J., 1982, Ground Water:  A Non-Technical Guide. Academy of Natural
          Sciences, Philadelphia, Pa.

Wolter, K. M., 1984, "Investigation of Some Estimators of Variance for Systematic
          Sampling," Journal of the American Statistical Association, Vol. 79, No. 388:
          781-790.

Wolter, Kirk M., 1985, Introduction to Variance Estimation. New York: Springer-
          Verlag.

Wood,  E. F., Ferrara, R. A.,   Gray, W.  G.,  and  Pinder, G. F.,  1984,
          Groundwater Contamination From Hazardous Wastes. Englewood Cliffs, New
          Jersey: Prentice-Hall.

-------
                   APPENDIX A: STATISTICAL TABLES
Table A.1     Tables of t for selected alpha and degrees of freedom
Use alpha to determine which column to use based on the desired parameter, t_{1-α,Df} or t_{1-α/2,Df}.
Use the degrees of freedom to determine which row to use. The t value will be found at the
intersection of the row and column. For values of degrees of freedom not in the table, interpolate
between those values provided.

When determining t_{1-α,Df} for a specified α:
              .25      .10      .05     .025      .01     .005    .0025     .001
When determining t_{1-α/2,Df} for a specified α:
              .50      .20      .10      .05      .02      .01     .005     .002

Degrees of
Freedom
      Df
       1    1.000    3.078    6.314   12.706   31.821   63.657  127.321  318.309
       2    0.816    1.886    2.920    4.303    6.965    9.925   14.089   22.327
       3    0.765    1.638    2.353    3.182    4.541    5.841    7.453   10.215
       4    0.741    1.533    2.132    2.776    3.747    4.604    5.598    7.173
       5    0.727    1.476    2.015    2.571    3.365    4.032    4.773    5.893
       6    0.718    1.440    1.943    2.447    3.143    3.707    4.317    5.208
       7    0.711    1.415    1.895    2.365    2.998    3.499    4.029    4.785
       8    0.706    1.397    1.860    2.306    2.896    3.355    3.833    4.501
       9    0.703    1.383    1.833    2.262    2.821    3.250    3.690    4.297
      10    0.700    1.372    1.812    2.228    2.764    3.169    3.581    4.144
      11    0.697    1.363    1.796    2.201    2.718    3.106    3.497    4.025
      12    0.695    1.356    1.782    2.179    2.681    3.055    3.428    3.930
      13    0.694    1.350    1.771    2.160    2.650    3.012    3.372    3.852
      14    0.692    1.345    1.761    2.145    2.624    2.977    3.326    3.787
      15    0.691    1.341    1.753    2.131    2.602    2.947    3.286    3.733
      16    0.690    1.337    1.746    2.120    2.583    2.921    3.252    3.686
      17    0.689    1.333    1.740    2.110    2.567    2.898    3.222    3.646
      18    0.688    1.330    1.734    2.101    2.552    2.878    3.197    3.610
      19    0.688    1.328    1.729    2.093    2.539    2.861    3.174    3.579
      20    0.687    1.325    1.725    2.086    2.528    2.845    3.153    3.552
      21    0.686    1.323    1.721    2.080    2.518    2.831    3.135    3.527
      22    0.686    1.321    1.717    2.074    2.508    2.819    3.119    3.505
      23    0.685    1.319    1.714    2.069    2.500    2.807    3.104    3.485
      24    0.685    1.318    1.711    2.064    2.492    2.797    3.091    3.467
      25    0.684    1.316    1.708    2.060    2.485    2.787    3.078    3.450
      26    0.684    1.315    1.706    2.056    2.479    2.779    3.067    3.435
      27    0.684    1.314    1.703    2.052    2.473    2.771    3.057    3.421
      28    0.683    1.313    1.701    2.048    2.467    2.763    3.047    3.408
      29    0.683    1.311    1.699    2.045    2.462    2.756    3.038    3.396
      30    0.683    1.310    1.697    2.042    2.457    2.750    3.030    3.385
      40    0.681    1.303    1.684    2.021    2.423    2.704    2.971    3.307
      60    0.679    1.296    1.671    2.000    2.390    2.660    2.915    3.232
     120    0.677    1.289    1.658    1.980    2.358    2.617    2.860    3.160
     400    0.675    1.284    1.649    1.966    2.336    2.588    2.823    3.111
infinite    0.674    1.282    1.645    1.960    2.326    2.576    2.807    3.090
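The instruction for Table A.1 to interpolate between tabulated degrees of freedom can be sketched as follows. This is only an illustration, not a procedure prescribed by the guidance: the dictionary holds four entries copied from the one-sided α = .05 column of Table A.1, the function name `interpolate_t` is hypothetical, and simple linear interpolation in Df is assumed.

```python
# Four entries from the one-sided alpha = .05 column of Table A.1 (Df -> t).
T_ALPHA_05 = {30: 1.697, 40: 1.684, 60: 1.671, 120: 1.658}


def interpolate_t(df, table):
    """Linearly interpolate t between the two tabulated Df values bracketing df.

    Assumption: straight-line interpolation in Df is adequate for the purpose.
    """
    if df in table:
        return table[df]
    keys = sorted(table)
    if not keys[0] < df < keys[-1]:
        raise ValueError("df lies outside the tabulated range")
    lower = max(k for k in keys if k < df)
    upper = min(k for k in keys if k > df)
    fraction = (df - lower) / (upper - lower)
    return table[lower] + fraction * (table[upper] - table[lower])


# Df = 35 is not tabulated; it lies halfway between Df = 30 and Df = 40.
t_35 = interpolate_t(35, T_ALPHA_05)
print(round(t_35, 4))
```

Because t changes slowly for large Df, the interpolation error in this part of the table is small; for small Df the tabulated rows are already consecutive, so no interpolation is needed.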

Table A.2     Tables of z for selected alpha
Use alpha to determine which column to read. Use the desired parameter, z_{1-α} or z_{1-α/2}, to
determine which row to use. Read the z value at the intersection of the row and column.

z_{1-α}      0.674   0.842   1.282   1.645   1.960   2.326   2.576   2.807   3.090
z_{1-α/2}    1.150   1.282   1.645   1.960   2.326   2.576   2.807   3.090   3.29
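Standard normal quantiles of the kind listed in Table A.2 can also be computed directly. The sketch below uses Python's standard library; the helper names `z_one_sided` and `z_two_sided` are hypothetical and are introduced only for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_one_sided(alpha):
    """Return z_{1-alpha}, the (1 - alpha) quantile of the standard normal."""
    return STANDARD_NORMAL.inv_cdf(1 - alpha)


def z_two_sided(alpha):
    """Return z_{1-alpha/2}, the (1 - alpha/2) quantile of the standard normal."""
    return STANDARD_NORMAL.inv_cdf(1 - alpha / 2)


for alpha in (0.20, 0.10, 0.05):
    print(alpha, round(z_one_sided(alpha), 3), round(z_two_sided(alpha), 3))
```

Rounded to three decimals, the one-sided value for alpha = .20 is the .842 that appears in Worksheet 3 of Appendix B.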

Table A.3     Tables of k for selected alpha, P0, and sample size for use in a tolerance interval test
Use alpha to determine which table to read.  The value k is found at the intersection of the column
with the specified P0 and the row with the sample size n. When testing tolerance intervals, let
T = x̄ + ks. If T is less than the cleanup standard, the sample area attains the cleanup standard
based on the statistical test.

                    Alpha = 0.10 (i.e., 10%)
                               P0
       n      0.25       0.1      0.05      0.01
       2     5.842    10.253    13.090    18.500
       3     2.603     4.258     5.311     7.340
       4     1.972     3.188     3.957     5.438
       5     1.698     2.742     3.400     4.666
       6     1.540     2.494     3.092     4.243
       7     1.435     2.333     2.894     3.972
       8     1.360     2.219     2.754     3.783
       9     1.302     2.133     2.650     3.641
      10     1.257     2.066     2.568     3.532
      11     1.219     2.011     2.503     3.443
      12     1.188     1.966     2.448     3.371
      13     1.162     1.928     2.402     3.309
      14     1.139     1.895     2.363     3.257
      15     1.119     1.867     2.329     3.212
      16     1.101     1.842     2.299     3.172
      17     1.085     1.819     2.272     3.137
      18     1.071     1.800     2.249     3.105
      19     1.058     1.782     2.227     3.077
      20     1.046     1.765     2.208     3.052
      21     1.035     1.750     2.190     3.028
      22     1.025     1.737     2.174     3.007
      23     1.016     1.724     2.159     2.987
      24     1.007     1.712     2.145     2.969
      25     1.000     1.702     2.132     2.952
      26     0.992     1.691     2.120     2.937
      27     0.985     1.682     2.109     2.922
      28     0.979     1.673     2.099     2.909
      29     0.973     1.665     2.089     2.896
      30     0.967     1.657     2.080     2.884
      35     0.942     1.624     2.041     2.833
      40     0.923     1.598     2.010     2.793
      50     0.894     1.559     1.965     2.735
      70     0.857     1.511     1.909     2.662
     100     0.825     1.470     1.861     2.601
     200     0.779     1.411     1.793     2.514
     500     0.740     1.362     1.736     2.442
infinity     0.674     1.282     1.645     2.326

Table A.3     Tables of k for selected alpha, P0, and sample size for use in a tolerance interval test
             (Continued)

                    Alpha = 0.05 (i.e., 5%)
                               P0
       n      0.25       0.1      0.05      0.01
       2    11.763    20.581    26.260    37.094
       3     3.806     6.155     7.656    10.553
       4     2.618     4.162     5.144     7.042
       5     2.150     3.407     4.203     5.741
       6     1.895     3.006     3.708     5.062
       7     1.732     2.755     3.399     4.642
       8     1.618     2.582     3.187     4.354
       9     1.532     2.454     3.031     4.143
      10     1.465     2.355     2.911     3.981
      11     1.411     2.275     2.815     3.852
      12     1.366     2.210     2.736     3.747
      13     1.328     2.155     2.671     3.659
      14     1.296     2.109     2.614     3.585
      15     1.268     2.068     2.566     3.520
      16     1.243     2.033     2.524     3.464
      17     1.220     2.002     2.486     3.414
      18     1.201     1.974     2.453     3.370
      19     1.183     1.949     2.423     3.331
      20     1.166     1.926     2.396     3.295
      21     1.152     1.905     2.371     3.263
      22     1.138     1.886     2.349     3.233
      23     1.125     1.869     2.328     3.206
      24     1.114     1.853     2.309     3.181
      25     1.103     1.838     2.292     3.158
      26     1.093     1.824     2.275     3.136
      27     1.083     1.811     2.260     3.116
      28     1.075     1.799     2.246     3.098
      29     1.066     1.788     2.232     3.080
      30     1.058     1.777     2.220     3.064
      35     1.025     1.732     2.167     2.995
      40     0.999     1.697     2.125     2.941
      50     0.960     1.646     2.065     2.862
      70     0.911     1.581     1.990     2.765
     100     0.870     1.527     1.927     2.684
     200     0.809     1.450     1.837     2.570
     500     0.758     1.385     1.763     2.475
infinity     0.674     1.282     1.645     2.326

Table A.3     Tables of k for selected alpha, P0, and sample size for use in a tolerance interval test
             (Continued)

                    Alpha = 0.01 (i.e., 1%)
                               P0
       n      0.25       0.1      0.05      0.01
       2    58.939   103.029   131.426    185.61
       3     8.728    13.995    17.370    23.896
       4     4.715     7.380     9.083    12.387
       5     3.454     5.362     6.578     8.939
       6     2.848     4.411     5.406     7.335
       7     2.491     3.859     4.728     6.412
       8     2.253     3.497     4.258     5.812
       9     2.083     3.240     3.972     5.389
      10     1.954     3.048     3.738     5.074
      11     1.853     2.898     3.556     4.829
      12     1.771     2.777     3.410     4.633
      13     1.703     2.677     3.290     4.472
      14     1.645     2.593     3.189     4.337
      15     1.595     2.521     3.102     4.222
      16     1.552     2.459     3.028     4.123
      17     1.514     2.405     2.963     4.037
      18     1.481     2.357     2.905     3.960
      19     1.450     2.314     2.854     3.892
      20     1.423     2.276     2.808     3.832
      21     1.399     2.241     2.766     3.777
      22     1.376     2.209     2.729     3.727
      23     1.355     2.180     2.694     3.681
      24     1.336     2.154     2.662     3.640
      25     1.319     2.129     2.633     3.601
      26     1.303     2.105     2.606     3.566
      27     1.287     2.085     2.581     3.533
      28     1.273     2.065     2.558     3.502
      29     1.260     2.047     2.536     3.473
      30     1.247     2.030     2.515     3.447
      35     1.195     1.957     2.430     3.334
      40     1.154     1.902     2.364     3.249
      50     1.094     1.821     2.269     3.125
      70     1.020     1.722     2.153     2.974
     100     0.957     1.639     2.056     2.850
     200     0.868     1.524     1.923     2.679
     500     0.794     1.430     1.814     2.540
infinity     0.674     1.282     1.645     2.326
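The tolerance interval test described for Table A.3 (compute T from the sample mean and standard deviation, then compare T with the cleanup standard) might be sketched as follows. The measurements and the cleanup standard of 100 are hypothetical values chosen for illustration, the function name `tolerance_test` is an assumption, and k = 3.187 is the Table A.3 entry for alpha = 0.05, P0 = 0.05, and n = 8.

```python
from statistics import mean, stdev


def tolerance_test(measurements, k, cleanup_standard):
    """Compute T = mean + k * s and report whether T is below the cleanup standard.

    s is the sample standard deviation (n - 1 divisor).
    """
    x_bar = mean(measurements)
    s = stdev(measurements)
    upper_limit = x_bar + k * s
    return upper_limit, upper_limit < cleanup_standard


# Hypothetical concentrations from n = 8 samples (units arbitrary).
data = [62.0, 71.0, 58.0, 66.0, 69.0, 64.0, 60.0, 70.0]
K_FROM_TABLE_A3 = 3.187  # alpha = 0.05, P0 = 0.05, n = 8
T, attains = tolerance_test(data, K_FROM_TABLE_A3, cleanup_standard=100.0)
print(round(T, 2), attains)
```

With these illustrative numbers T falls below the standard, so the sample area would be judged to attain the cleanup standard under the stated test.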

Table A.4     Recommended number of samples per seasonal period (np) to minimize total cost
              for assessing attainment

Cost ratio $R
= Yearly cost /      Estimated Lag 1 serial correlation between monthly observations
  Sample cost
          0.05   0.1  0.15   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
       1     8     7     6     5     4     4     4     4     4     4     4
       2    10     8     7     6     5     4     4     4     4     4     4
       5    12    10     9     8     6     5     4     4     4     4     4
      10    15    12    10     9     8     6     5     4     4     4     4
      20    18    15    13    11     9     8     6     5     4     4     4
      50    23    20    17    15    13    10     9     7     6     4     4
     100    30    24    21    19    16    13    11     9     7     5     4
     200    36    30    26    24    20    16    14    11     9     6     4
    1000    61    52    46    40    34    28    23    19    15    11     7
    2000    73    61    61    52    40    36    30    24    19    14     8
    5000    91    91    73    73    61    46    40    32    25    19    11
   10000   183    91    91    91    73    61    52    40    32    23    14

Table A.5    Variance factors F for determining sample size

Samples
per year or          Estimated Lag 1 serial correlation between monthly observations
seasonal
period     0.05    0.1   0.15    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
       4   4.00   3.99   3.97   3.94   3.80   3.53   3.13   2.61   1.99   1.31   0.64
       5   4.99   4.96   4.90   4.80   4.49   4.03   3.44   2.77   2.05   1.33   0.64
       6   5.97   5.89   5.75   5.55   5.04   4.38   3.64   2.87   2.09   1.34   0.64
       7   6.92   6.74   6.50   6.19   5.46   4.64   3.78   2.93   2.11   1.35   0.64
       8   7.83   7.53   7.15   6.73   5.80   4.83   3.88   2.97   2.13   1.35   0.64
       9   8.69   8.23   7.71   7.17   6.05   4.97   3.95   3.00   2.14   1.35   0.64
      10   9.48   8.85   8.19   7.53   6.26   5.08   4.00   3.03   2.15   1.36   0.64
      11  10.22   9.40   8.60   7.83   6.42   5.16   4.04   3.04   2.15   1.36   0.64
      12  10.89   9.88   8.95   8.09   6.55   5.23   4.07   3.06   2.16   1.36   0.64
      13  11.51  10.30   9.24   8.30   6.66   5.28   4.09   3.07   2.16   1.36   0.64
      14  12.07  10.67   9.50   8.47   6.75   5.32   4.11   3.07   2.16   1.36   0.64
      15  12.57  11.00   9.72   8.62   6.82   5.35   4.13   3.08   2.17   1.36   0.64
      16  13.03  11.28   9.90   8.75   6.88   5.38   4.14   3.09   2.17   1.36   0.64
      17  13.44  11.53  10.07   8.86   6.93   5.41   4.15   3.09   2.17   1.36   0.64
      18  13.81  11.75  10.21   8.96   6.97   5.43   4.16   3.09   2.17   1.36   0.64
      19  14.15  11.95  10.33   9.04   7.01   5.45   4.17   3.10   2.17   1.36   0.64
      20  14.45  12.12  10.44   9.11   7.05   5.46   4.18   3.10   2.17   1.36   0.64
      21  14.72  12.27  10.54   9.17   7.07   5.47   4.18   3.10   2.17   1.36   0.64
      22  14.97  12.41  10.62   9.23   7.10   5.49   4.19   3.10   2.18   1.36   0.64
      23  15.20  12.53  10.70   9.28   7.12   5.50   4.19   3.11   2.18   1.36   0.64
      24  15.41  12.65  10.77   9.32   7.14   5.50   4.20   3.11   2.18   1.36   0.64
      25  15.59  12.75  10.83   9.36   7.16   5.51   4.20   3.11   2.18   1.36   0.64
      26  15.76  12.84  10.88   9.39   7.17   5.52   4.20   3.11   2.18   1.36   0.64
      28  16.06  12.99  10.98   9.45   7.20   5.53   4.21   3.11   2.18   1.36   0.64
      30  16.32  13.12  11.05   9.50   7.22   5.54   4.21   3.11   2.18   1.36   0.64
      32  16.53  13.23  11.12   9.54   7.24   5.55   4.22   3.12   2.18   1.36   0.64
      34  16.71  13.32  11.17   9.58   7.25   5.56   4.22   3.12   2.18   1.36   0.64
      36  16.87  13.40  11.21   9.60   7.26   5.56   4.22   3.12   2.18   1.36   0.64
      40  17.13  13.52  11.29   9.65   7.28   5.57   4.22   3.12   2.18   1.36   0.64
      46  17.40  13.66  11.36   9.70   7.30   5.58   4.23   3.12   2.18   1.36   0.64
      52  17.59  13.75  11.42   9.73   7.32   5.58   4.23   3.12   2.18   1.37   0.64
      61  17.79  13.84  11.47   9.76   7.33   5.59   4.23   3.12   2.18   1.37   0.64
      73  17.95  13.91  11.51   9.79   7.34   5.60   4.24   3.12   2.18   1.37   0.64
      91  18.08  13.98  11.54   9.81   7.35   5.60   4.24   3.12   2.18   1.37   0.64
     183  18.27  14.06  11.59   9.84   7.36   5.60   4.24   3.13   2.18   1.37   0.64
     365  18.31  14.08  11.60   9.85   7.37   5.61   4.24   3.13   2.18   1.37   0.64

-------
                     APPENDIX B:  EXAMPLE WORKSHEETS

              The worksheets in this appendix have been completed to serve as an example in
understanding the forms and making the necessary calculations.

              Please note that, to maintain adequate precision in the computations appearing
in the worksheets (particularly in the calculations of estimated variances, standard deviations, or
standard errors), the number of decimal places retained should be as high as possible, with a
minimum of four.


                                      A Scenario

              To help understand how to use the worksheets provided, a scenario has been
constructed with associated data concerning a site for which a cleanup effort has been undertaken.
So that undue time is not spent on data manipulation and data entry, parameters were set in
such a way that the number of years for which data needed to be collected in the example was kept
artificially low. For example, in Worksheet 3, α and β were set higher than will generally be the
case in practice, while μ1 and σ were set relatively low. As a consequence, the number of years
required for a fixed sample size test was limited to three years, which is highly unlikely to be the
case in practice.

              The scenario involves a Superfund site with a treatment well and five monitoring
wells. Two of the monitoring wells are close to the source of contamination and have been
monitored individually (involving Worksheets 2 through 7b). The remaining three wells are
relatively far from the source of contamination and have been analyzed as a group (Worksheets 8
through 14b). Two chemicals were of interest in monitoring for cleanup. The example
worksheets have been provided for one of the two chemicals for one of the two wells being
monitored individually and for the group of three wells. For illustrative purposes, for the single
well being examined, both a fixed sample size test and a sequential test have been carried out.
In practice, however, a decision would be made beforehand about which of the two approaches
would be used, and only that test would be employed. For the example data set, the fixed
sample size test indicates that the site is clean, while the sequential test indicates that more
data are needed before a decision can be reached. On average, the sequential test will yield a
result more quickly. However, the parameters were specified so as to require only three years
for the fixed sample size test, and three years is also the minimum amount of time required for
a sequential test, so it is not altogether surprising that a decision could not be made via the
sequential test.

              Worksheets 15 and 16 have been filled out with data independent of the five-well
example. They are included simply to indicate how a serial correlation could be estimated via the
worksheets. The number of observations on which the estimated serial correlation is based,
twelve, is fewer than should normally be used in practice.

              The number of samples per year used in the example was six. Note that in
Worksheet 3 the estimated serial correlation between monthly data was .2, so that the correlation
between observations taken two months apart would be estimated to be (.2)² = .04.
Since .04 represents a rather low correlation between observations, data could reasonably be
gathered on a bimonthly schedule without great concern about a lack of independence between
observations.
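The arithmetic in the preceding paragraph can be written out as a short sketch. It assumes, as the text's calculation implies, that the correlation between observations k months apart is the monthly lag 1 correlation raised to the power k; the function name is hypothetical.

```python
def correlation_at_spacing(monthly_lag1_correlation, months_between_samples):
    """Correlation between samples taken a given number of months apart.

    Assumption: correlation decays geometrically with the number of months,
    so the lag-k value is the lag 1 value raised to the power k.
    """
    return monthly_lag1_correlation ** months_between_samples


# Six samples per year means samples are two months apart.
bimonthly = correlation_at_spacing(0.2, 2)
print(round(bimonthly, 4))
```

Under the same assumption, quarterly sampling (three months apart) would give an even smaller correlation, (.2)³ = .008.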

             Worksheets 1R and 2R present the computation of regression coefficients and
related tests of significance using the three  sample means obtained during the three years of data
collection for the test of the single well to serve as the three data observations from which a linear
model was to be constructed. Since the fixed sample test indicated that the  cleanup effort was
successful, it is  desirable to examine the trend  of the data over time to make sure that there is no
evidence that the cleanup standard could be exceeded in the future. This could be indicated by
evidence of a statistically significant positive  slope for the sample data (in this case, the three yearly
averages).  Three observations is a rather small  sample on which to base such decisions, but again
the chief purpose of these example worksheets is illustrative.  The reader can more quickly
determine how the regression estimates were computed using a small data set. In  practice, it is
quite likely that the number of years' worth  of data resulting in a decision that the site is clean will
exceed  three  by  several years.
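The trend check described above (fit a straight line to the yearly averages and test whether the slope is significantly positive) might be sketched as follows. This is an ordinary least squares illustration only, not a reproduction of Worksheets 1R and 2R: the three yearly means are hypothetical, the function name is an assumption, and the critical value 6.314 is the Table A.1 entry for a one-sided alpha of .05 with 1 degree of freedom.

```python
from math import sqrt


def slope_test(years, yearly_means):
    """Least squares slope, intercept, t statistic for the slope, and its Df."""
    m = len(years)
    x_bar = sum(years) / m
    y_bar = sum(yearly_means) / m
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, yearly_means))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares about the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(years, yearly_means))
    df = m - 2
    se_slope = sqrt(sse / df / sxx)
    return slope, intercept, slope / se_slope, df


# Hypothetical yearly average concentrations for years 1 through 3.
slope, intercept, t_stat, df = slope_test([1, 2, 3], [52.0, 48.0, 50.0])
T_CRITICAL = 6.314  # Table A.1: one-sided alpha = .05, Df = 1
significant_increase = t_stat > T_CRITICAL
print(slope, round(t_stat, 3), df, significant_increase)
```

With only three yearly means there is a single degree of freedom, so the critical value is very large and only an extremely steep, consistent increase would be declared significant; this is the practical meaning of the caution about small samples.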

Table B.1  Summary of Notation Used in Appendix B

 Symbol      Definition
 m           The number of years for which data were collected (usually the
             analysis will be performed with full years' worth of data)
 n           The number of sample measurements per year (for monthly data,
             n = 12; for quarterly data, n = 4). This is also referred to as the
             number of "seasons" per year
 N           The total number of sample measurements (if there are no missing
             observations, N = mn)
 index i     Indicates the order in which the ground-water samples are collected
 index k     Indicates the year in which the ground-water samples are collected
 index j     Indicates the season or time within the year at which the
             ground-water samples are collected
 index c     Indicates the chemical analyzed
 index w     Indicates the well sampled
 x_i         Contaminant measurement for the ith ground-water sample
 x_jk        An alternative way of denoting a contaminant measurement, where
             k = 1, 2, ..., m denotes the year, and j = 1, 2, ..., n denotes the
             sampling period (season) within the year. The subscript for x_jk is
             related to the subscript for x_i in the following manner:
             i = (k-1)n + j
 x̄_k         The mean (or average) of the contaminant measurements for year k
             (see Boxes 8.5 and 9.4)
 x̄_m         The mean of the yearly averages for years k = 1 to m
 s_x̄         The standard deviation of the yearly average contaminant
             concentrations from m years of sample collection (see Boxes 8.7
             and 9.6)
 s_x̄m        The standard error of the mean of the yearly means (see Boxes 8.9
             and 9.8)
 Cs          The designated cleanup standard
 Df          The degrees of freedom associated with the standard error of an
             estimate (see Boxes 8.7 and 9.6)
 d_i         The distance of the monitoring well from the treatment well
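The relation i = (k-1)n + j in Table B.1, which converts a (year, season) pair into the sequential sample number, can be illustrated with a short sketch. The function names are hypothetical; all indices start at 1, as in the table.

```python
def sample_index(k, j, n):
    """Sequential sample number i for year k and season j, with n samples per year."""
    return (k - 1) * n + j


def year_and_season(i, n):
    """Inverse mapping: recover (k, j) from the sequential sample number i."""
    k_minus_1, j_minus_1 = divmod(i - 1, n)
    return k_minus_1 + 1, j_minus_1 + 1


# With n = 6 samples per year, the third sample of year 2 is sample number 9.
i = sample_index(k=2, j=3, n=6)
print(i, year_and_season(i, 6))
```

The inverse function is included only to show that the mapping is one-to-one when no observations are missing, so that N = mn.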

                                WORKSHEET 1  Sampling Wells

        See Section 3.2 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

           SITE:   Site ABC

 Sample Well
 Number w
     1         monitoring well d1 feet northeast of treatment well
     2         monitoring well d2 feet west of treatment well
     3         monitoring well d3 feet north of treatment well
     4         monitoring well d4 feet southwest of treatment well
     5         monitoring well d5 feet southeast of treatment well

 Wells 1 and 2 will be assessed individually.
 Wells 3, 4, and 5 will be assessed as a group.

Decision Criteria: Wells assessed (Checked one)  Individually ED  As a Group  d

Use the Sampling Well Number (w) to refer on subsequent sheets to the sampling wells described
above.

Attach a map showing the sampling wells within the waste site.

Date Completed:  EXAMPLE                    Completed by     EXAMPLE

Use additional sheets if necessary.                                      Page ___ of ___


Continue to WORKSHEET 2 if wells are assessed individually.
Continue to WORKSHEET 8 if wells are assessed as a group.

-------
                        APPENDIX  B: EXAMPLE WORKSHEETS


                     WORKSHEET 2 Attainment Objectives for Assessing Individual Wells
See Chapter 3 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE:  Site ABC

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [x]    Sequential Sampling [x]
(for purposes of illustration, both methods will be used)

Probability of mistakenly declaring the well(s) clean = α = .1
Probability of mistakenly declaring the well(s) contaminated = β = .2

Chemical   Chemical        Cleanup        Parameter             If Mean, enter    If %tile, enter critical
Number     Name            Standard       to test:              alternate         proportion for
c                          (with units)   Check one             hypothesis mean   null        alternate
                           Cs                                   μ1                P0          P1
1          Hazardous #1    100            Mean [x]  %tile [ ]   75
2          Hazardous #2    60             Mean [x]  %tile [ ]   30

Sample Collection Procedures to be used (attach separate sheet if necessary):
              Not specified for this example
Secondary Objectives/Other purposes for which the data is to be  collected
Use the Chemical Number (c) to refer on other sheets to the chemical described above.
Attach documentation describing the lab analysis procedure for each chemical.

Date Completed:   EXAMPLE                    Completed by     EXAMPLE

Use additional sheets if necessary.                                        Page ___ of ___

Continue to WORKSHEET 3 if a fixed sample size test is used; or
Continue to WORKSHEET 4 if a sequential sample test is used.
                                       B-5

-------
                       APPENDIX  B:  EXAMPLE  WORKSHEETS
     WORKSHEET 3  Sample Size When Using a Fixed Sample Size Test for Assessing Individual Wells
See Section 8.2 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE:  Site ABC

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                                                            From Table A.2, Appendix A
Probability of mistakenly declaring the site clean [2] = α = .1             z(1-α) = 1.282
Probability of mistakenly declaring the site contaminated [2] = β = .2      z(1-β) = .842

Number of samples per year = n = 6   (based on calculations described in Section 8.2)
Variance factor from Table A.5, Appendix A = F = 5.55   (see footnote 1)

For testing the mean concentration

Chemical     Cleanup        Alternate    Standard Deviation   Calculate:                             Calculate:
Number [2]   Standard [2]   mean [2]     of yearly mean       B = ((Cs - μ1)/(z(1-β) + z(1-α)))²     md = σ̂²/(F·B) + 2
c            Cs             μ1           σ̂
1            100            75           23                   138.53                                 2.69
2            60             30           6                    199.50                                 2.03

For  testing the  proportion of contaminated wells or samples
 Chemical    Cleanup                                     Calculate:
Number [2]  Standard[2]        [2]           [2]        B
     c          Cs ,          Pn           Pi
                                                                                  B
                                                            •      (l'?l\. "d'FCPn-Pi)2
                                                        Zi^VPo(l-Po))2       (  °  1}
Column Maximum (maximum of md values) = C = 2.69
Round C to next largest integer = Number of years of sample collection = m = 3
Total number of samples = nm = N = 18

Date Completed:  EXAMPLE                    Completed by   EXAMPLE
Use additional sheets if necessary.                                        Page ___ of ___
Continue to WORKSHEET 4

 1 An estimate of φ, the serial correlation, is necessary to determine the appropriate value of F. Worksheets 15 and
   16 can be used to estimate φ. φ = .2 was assumed for this example.
                                       B-6
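
The arithmetic on Worksheet 3 can be checked with a short script. This is a minimal sketch, not the guidance document's own procedure: the formulas B = ((Cs - μ1)/(z(1-α) + z(1-β)))² and md = σ̂²/(F·B) + 2 are inferred from the example values on the worksheet, and the function and variable names are hypothetical. Section 8.2 remains the authoritative description.

```python
import math
from statistics import NormalDist


def years_needed(cs, mu1, sigma, alpha, beta, f):
    """Return (B, md) for one chemical.

    Assumed formulas (inferred from the worksheet example values):
      B  = ((cs - mu1) / (z_{1-alpha} + z_{1-beta}))**2
      md = sigma**2 / (f * B) + 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # about 1.282 for alpha = .1
    z_beta = NormalDist().inv_cdf(1 - beta)    # about 0.842 for beta = .2
    b = ((cs - mu1) / (z_alpha + z_beta)) ** 2
    return b, sigma ** 2 / (f * b) + 2


ALPHA, BETA = 0.1, 0.2      # from Worksheet 2
N_PER_YEAR = 6              # samples per year
F = 5.55                    # variance factor from Table A.5

# chemical number -> (cleanup standard, alternate mean, standard deviation)
chemicals = {1: (100, 75, 23), 2: (60, 30, 6)}

md = {c: years_needed(cs, mu1, sd, ALPHA, BETA, F)[1]
      for c, (cs, mu1, sd) in chemicals.items()}

column_max = max(md.values())          # C on the worksheet
years = math.ceil(column_max)          # m, number of years of sampling
total_samples = N_PER_YEAR * years     # N = n * m

for c in sorted(md):
    print(f"chemical {c}: md = {md[c]:.2f}")
print(f"years = {years}, total samples = {total_samples}")
```

Because the script uses unrounded normal quantiles, its B values differ from the printed ones in the first decimal place; the md values agree with the worksheet to two decimals.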

-------
                       APPENDIX B: EXAMPLE WORKSHEETS


        WORKSHEET 4 Data Records and Calculations When Assessing Individual Wells; by Chemical. Well.
                                            andYear
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   1. d1 ft northeast of treatment well
YEAR:      NUMBER(k)                       1988, k = 1

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [x]   Sequential Sampling [x]
For purposes of illustration, both methods are used.

Parameter to be tested [2] (Check one) =  Mean [x]   %tile [ ]
Number of samples per year [3] = n =
Number of samples with nonmissing data in year = nk =
Cleanup standard [2] = Cs = 100
Concentration used for observations below the detection limit = 10

"Season"              Sample          Reported   Concentration     Is A greater   Data for analysis
Number j     Sample   Collection      Concen-    Corrected for     than Cs?       xjk = A if Mean
within this  ID       date/time       tration    Detection Limit   1 = Yes        xjk = B if %tile
kth year                                         A                 0 = No
                                                                   B
1            11       Feb. 18, '88    88         88                               88
2            21       April 12, '88   123        123                              123
3            31       June 16, '88    98         98                               98
4            41       Aug. 15, '88    78         78                               78
5            51       Oct. 12, '88    89         89                               89
6            61       Dec. 11, '88    65         65                               65

Total of xjk for this year = C = 541
Mean of xjk for this kth year = C/nk = x̄k = 90.17

Date Completed:  EXAMPLE                    Completed by   EXAMPLE

Use additional sheets if necessary.                                        Page ___ of 3
Complete WORKSHEET 4 for other chemicals, years, and wells; otherwise.
Continue to WORKSHEET 5 if a fixed sample size test is used: or
Continue to WORKSHEET 7 if a sequential sample test is used.
                                       B-7
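
The per-year calculation recorded on Worksheet 4 can be sketched in a few lines. This is an illustrative sketch under stated assumptions: a reading below the detection limit is represented here by `None` (a hypothetical convention, not one from the guidance), missing samples are simply left out of the list, and the function name is invented for the example.

```python
def summarize_year(reported, cs, below_dl_value, parameter="mean"):
    """Summarize one chemical, well, and year as on Worksheet 4.

    reported        -- nonmissing concentrations; None marks a value
                       below the detection limit (assumed convention)
    cs              -- cleanup standard
    below_dl_value  -- concentration substituted for None
    parameter       -- "mean" uses column A, "percentile" uses column B
    """
    # Column A: concentration corrected for the detection limit.
    col_a = [below_dl_value if r is None else r for r in reported]
    # Column B: 1 if the corrected concentration exceeds Cs, else 0.
    col_b = [1 if a > cs else 0 for a in col_a]
    # Data for analysis: A when testing the mean, B when testing a percentile.
    x = col_a if parameter == "mean" else col_b
    total = sum(x)                 # C on the worksheet
    return col_a, col_b, total, total / len(x)


# Well 1, Hazardous #1, 1988 (k = 1), values from the worksheet.
year_1988 = [88, 123, 98, 78, 89, 65]
col_a, col_b, total, yearly_mean = summarize_year(year_1988, cs=100, below_dl_value=10)
print(total, round(yearly_mean, 2))
```

For the 1988 data the total and mean agree with the worksheet entries (541 and 90.17). Column B is left blank on the example sheet because the mean is being tested; the sketch fills it in only to show how a percentile test would use it.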

-------
                        APPENDIX B: EXAMPLE WORKSHEETS


        WORKSHEET 4 Data Records and Calculations When Assessing Individual Wells; by Chemical. Well.
                                            and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   1. d1 ft northeast of treatment well
YEAR:      NUMBER(k)                       1989, k = 2

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [x]   Sequential Sampling [x]
For purposes of illustration, both methods are used.

Parameter to be tested [2] (Check one) =  Mean [x]   %tile [ ]
Number of samples per year [3] = n =
Number of samples with nonmissing data in year = nk =
Cleanup standard [2] = Cs = 100
Concentration used for observations below the detection limit = 10

"Season"              Sample          Reported   Concentration     Is A greater   Data for analysis
Number j     Sample   Collection      Concen-    Corrected for     than Cs?       xjk = A if Mean
within this  ID       date/time       tration    Detection Limit   1 = Yes        xjk = B if %tile
kth year                                         A                 0 = No
                                                                   B
1            12       Feb. 15, '89    89         89                               89
2            22       April 17, '89   72         72                               72
3            32       June 14, '89    105        105                              105
4            42       Aug. 18, '89    77         77                               77
5            52       Oct. 15, '89    63         63                               63
6            62       Dec. 13, '89    92         92                               92

Total of xjk for this year = C = 498
Mean of xjk for this kth year = C/nk = x̄k = 83.00

Date Completed:  EXAMPLE                    Completed by

Use additional sheets if necessary.                                        Page ___ of ___

Complete WORKSHEET 4 for other chemicals, years, and wells; otherwise,
Continue to WORKSHEET 5 if a fixed sample size test is used; or
Continue to WORKSHEET 7 if a sequential sample test is used.
                                       B-8

-------
                        APPENDIX B: EXAMPLE WORKSHEETS


        WORKSHEET 4 Data Records and Calculations When Assessing Individual Wells; by Chemical. Well.
                                            andYear
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards". Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   1. d1 ft northeast of treatment well
YEAR:      NUMBER(k)                       1990, k = 3

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [x]   Sequential Sampling [x]
For purposes of illustration, both methods are used.

Parameter to be tested [2] (Check one) =  Mean [x]   %tile [ ]
Number of samples per year [3] = n =
Number of samples with nonmissing data in year = nk =
Cleanup standard [2] = Cs = 100
Concentration used for observations below the detection limit = 10

"Season"              Sample          Reported   Concentration     Is A greater   Data for analysis
Number j     Sample   Collection      Concen-    Corrected for     than Cs?       xjk = A if Mean
within this  ID       date/time       tration    Detection Limit   1 = Yes        xjk = B if %tile
kth year                                         A                 0 = No
                                                                   B
1            13       Feb. 16, '90    71         71                               71
2            23       April 14, '90   62         62                               62
3            33       June 14, '90    88         88                               88
4            43       Aug. 17, '90    43         43                               43
5            53       Oct. 15, '90    62         62                               62
6            63       Dec. 15, '90    73         73                               73

Total of xjk for this year = C = 399
Mean of xjk for this kth year = C/nk = x̄k = 66.50

Date Completed:  EXAMPLE                    Completed by   EXAMPLE

Use additional sheets if necessary.                                        Page ___ of 3

Complete WORKSHEET 4 for other chemicals, years, and wells; otherwise,
Continue to WORKSHEET 5 if a fixed sample size test is used; or
Continue to WORKSHEET 7 if a sequential sample test is used.
                                       B-9

-------
                        APPENDIX B:  EXAMPLE WORKSHEETS
        WORKSHEET 5 Data Calculations for a Fixed Sample Size Test When Assessing Individual Wells; by
                                         Chemical and Well
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   1. d1 ft. northeast of treatment well

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Year                                 Mean for the
Number                               year [4]
k                                    x̄k               (x̄k)²
Total from previous page
(if more than one Worksheet 5 used)
1                                    90.17            8,130.63
2                                    83.00            6,889.00
3                                    66.50            4,422.25

Column Totals:                       A = 239.67       B = 19,441.88

Date Completed:   EXAMPLE                   Completed by    EXAMPLE

Use additional sheets if necessary.                                        Page ___ of ___
Complete WORKSHEET 5 for other chemicals and wells or continue to WORKSHEET 6
                                       B-10

-------
                        APPENDIX B: EXAMPLE WORKSHEETS
        WORKSHEET 6 Inference for Fixed Sample Size Tests When Assessing Individual Wells; by Chemical
                                            and Well
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   1. d1 ft. northeast of treatment well

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                         [2] 
-------
                       APPENDIX B: EXAMPLE WORKSHEETS
        WORKSHEET 7a Data Calculations for a Sequential Sample When Assessing Wells Individually; by
                                        Chemical and Well
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   1. d1 ft. northeast of treatment well

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Cleanup standard [2] = Cs = 100
Alternate mean = μ1 = 75
Probability of mistakenly declaring the well(s) clean [2] = α = .1
Probability of mistakenly declaring the well(s) contaminated [2] = β = .2

Year      Yearly     Cumulative         Cumulative           Mean               Standard
Number    Average    Sum of x̄k          Sum of x̄k²           (average of        Error of Mean
[4]       [4]        (A0 = 0)           (B0 = 0)             yearly averages)
k or m    x̄k         Ak = Ak-1 + x̄k     Bk = Bk-1 + x̄k²      x̄m                 s_x̄m
1         90.17      90.17              8,130.63             90.1700            -
2         83.00      173.17             15,019.63            86.5950            3.4622
3         66.50      239.67             19,441.88            79.8900            7.0077

Carry as many significant figures as possible.

Date Completed:  EXAMPLE                    Completed by   EXAMPLE
Use additional sheets if necessary.                                        Page ___ of ___
Complete WORKSHEETS 7a and 7b for other chemicals and wells
                                      B-12
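
The running totals on Worksheet 7a lend themselves to a short check. The sketch below assumes that the mean is Am/m and that the standard error of the mean is sqrt((Bm - m·x̄m²)/(m(m-1))), the usual sample standard error of the yearly averages; these formulas are an assumption, since the worksheet gives only the column symbols, and the function name is hypothetical.

```python
import math


def sequential_summary(yearly_means):
    """Return one row per year: (m, x_k, A_m, B_m, mean_m, se_m).

    A_m is the cumulative sum of yearly averages and B_m the cumulative
    sum of their squares. The standard error is undefined for m = 1 and
    is returned as None.
    """
    rows = []
    a = 0.0
    b = 0.0
    for m, xk in enumerate(yearly_means, start=1):
        a += xk
        b += xk ** 2
        mean_m = a / m
        if m > 1:
            # Sample variance of the yearly averages, guarded against
            # tiny negative values caused by floating-point rounding.
            variance = max((b - m * mean_m ** 2) / (m - 1), 0.0)
            se_m = math.sqrt(variance / m)
        else:
            se_m = None
        rows.append((m, xk, a, b, mean_m, se_m))
    return rows


rows = sequential_summary([90.17, 83.00, 66.50])
for m, xk, a, b, mean_m, se_m in rows:
    se_text = "-" if se_m is None else f"{se_m:.4f}"
    print(f"{m}  {xk:6.2f}  {a:8.2f}  {b:10.2f}  {mean_m:8.4f}  {se_text}")
```

Under these assumptions the m = 3 row agrees with the worksheet (mean 79.89, standard error about 7.008). For m = 2 the same formulas give a mean of 86.585 and a standard error of about 3.585, slightly different from the printed 86.5950 and 3.4622, which illustrates why the worksheet asks for as many significant figures as possible.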

-------
                        APPENDIX B: EXAMPLE WORKSHEETS


        WORKSHEET 7b Data Calculations for a Sequential Sample When Assessing Wells Individually; by
                                        Chemical and Well
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   1. d1 ft northeast of treatment well

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Year                                                       Likelihood   Critical value:   Critical value:   Decision:
Number   δ = (μ1 - Cs)/s_x̄m   t = (x̄m - (Cs + μ1)/2)/s_x̄m   ratio        contaminated      clean             clean if LR > B,
[4]                                                        LR*          A = β/(1-α)       B = (1-β)/α       contaminated if LR ≤ A,
m                                                                                                           or no decision
1
2
3        -3.5675               -1.086

* LR = exp( δ · ((m - 2)/m) · t · sqrt( m/(m - 1 + t²) ) )
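
The Worksheet 7b quantities for year 3 can be reproduced as follows. This is a hedged sketch: the definitions δ = (μ1 - Cs)/s and t = (x̄m - (Cs + μ1)/2)/s reproduce the printed values -3.5675 and -1.086, but the likelihood-ratio expression and the critical values A = β/(1-α) and B = (1-β)/α are a reading of a badly garbled worksheet header and should be checked against Chapter 9 before use. The function name is hypothetical.

```python
import math


def sequential_decision(m, mean_m, se_m, cs, mu1, alpha, beta):
    """Return (delta, t, lr, a, b, decision) for year m.

    Assumed formulas (reconstructed, not confirmed by the source):
      delta = (mu1 - cs) / se_m
      t     = (mean_m - (cs + mu1) / 2) / se_m
      lr    = exp(delta * (m - 2) / m * t * sqrt(m / (m - 1 + t**2)))
    """
    delta = (mu1 - cs) / se_m
    t = (mean_m - (cs + mu1) / 2) / se_m
    lr = math.exp(delta * (m - 2) / m * t * math.sqrt(m / (m - 1 + t * t)))
    a = beta / (1 - alpha)      # lower critical value
    b = (1 - beta) / alpha      # upper critical value
    if lr > b:
        decision = "clean"
    elif lr <= a:
        decision = "contaminated"
    else:
        decision = "no decision"
    return delta, t, lr, a, b, decision


# Year 3 values taken from Worksheet 7a.
delta, t, lr, a, b, decision = sequential_decision(
    m=3, mean_m=79.89, se_m=7.0077, cs=100, mu1=75, alpha=0.1, beta=0.2
)
print(f"delta = {delta:.4f}, t = {t:.3f}, A = {a:.4f}, B = {b:.1f}, {decision}")
```

Under this reading the year-3 likelihood ratio falls between A and B, so sampling would continue; that conclusion depends on the assumed likelihood-ratio formula.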
-------
                        APPENDIX B:  EXAMPLE WORKSHEETS


       WORKSHEET 8 Attainment Objectives When Assessing Wells as a Group
See Chapter 3 in "Methods for Evaluating the Attainment of Cleanup Standards". Volume 2
SITE:  Site ABC

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size O  Sequential Sampling Efl

Probability of mistakenly declaring the well(s) clean = α = .1
Probability of mistakenly declaring the well(s) contaminated = β = .2

Chemical    Chemical        Cleanup        Parameter     If mean, enter the     If max, enter the
to be       name            standard       to test:      alternate hypothesis   alternate hypothesis
tested                      (with units)   Check one     μ1                     Max1
number                      Cs
1           Hazardous #1    100            Mean [x]      75
2           Hazardous #2    60             Mean [x]      30

Sample Collection Procedures to be used (attach separate sheet if necessary):
                     Not specified for this example
Secondary Objectives/ Other purposes for which the data is to be collected:
Use the Chemical Number (c) to refer on other sheets to the chemical described above.
Attach documentation describing the lab analysis procedure for each chemical.

Date Completed:  EXAMPLE                    Completed by     EXAMPLE

Use additional sheets if necessary.                                        Page ___ of ___
Continue to WORKSHEET 9 if a fixed sample size test is used; or
Continue to WORKSHEET  10 if a sequential sample test is used.
                                       B-14

-------
                        APPENDIX  B: EXAMPLE WORKSHEETS
     WORKSHEET 9 Sample Size When Using a Fixed Sample Size Test for Assessing Wells as a Group
See Section 8.2 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE:  Site ABC

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                                                            From Table A.2, Appendix A
Probability of mistakenly declaring the site clean [8] = α = .1             z(1-α) = 1.282
Probability of mistakenly declaring the site contaminated [8] = β = .2      z(1-β) = .842

Number of samples per year = n = 6   (based on calculations described in Section 8.2)
Variance factor from Table A.5, Appendix A = F = 5.55   (see footnote 1)

For testing the mean concentration

Chemical     Cleanup        Alternate    Standard Deviation   Calculate:                             Calculate:
Number [8]   Standard [8]   mean [8]     of mean              B = ((Cs - μ1)/(z(1-α) + z(1-β)))²     md = σ̂²/(F·B) + 2
c            Cs             μ1           σ̂
1            100            75           23                   138.53                                 2.69
2            60             30           6                    199.50                                 2.03

For testing the maximum concentration across all wells

Chemical     Cleanup        Alternate      Standard Deviation   Calculate:                               Calculate:
Number [8]   Standard [8]   maximum [8]    of yearly mean       B = ((Cs - Max1)/(z(1-α) + z(1-β)))²     md = σ̂²/(F·B) + 2
c            Cs             Max1           σ̂


Column Maximum (maximum of md values) = C = 2.69
Round C to next largest integer = Number of years of sample collection = m = 3
Total number of samples = nm = N = 18

Date Completed:  EXAMPLE                    Completed by   EXAMPLE
Use additional sheets if necessary.                                        Page ___ of ___
Continue to WORKSHEET 10

1 An estimate of φ, the serial correlation, is necessary to determine the appropriate value of F. Worksheets 15 and
  16 can be used to estimate φ. φ was assumed to be .20 for this example.
                                        B-15

-------
                       APPENDIX B: EXAMPLE WORKSHEETS
         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                 Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards". Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   3. d3 ft north of treatment well
YEAR:      NUMBER(k)                       1988, k = 1

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Parameter to be tested (Check one) =  Mean [x]   Max [ ]
Number of samples per year = n =
Concentration used for observations below the detection limit = 10

"Season"    Sample    Sample Collection    Reported         Concentration Corrected
Number j    ID        time                 Concentration    for Detection Limit
1           31        Feb. 18, '88         88.71            88.71
2           32        Apr. 12, '88         89.38            89.38
3           33        June 16, '88         74.92            74.92
4           34        Aug. 15, '88         80.03            80.03
5           35        Oct. 12, '88         89.98            89.98
6           36        Dec. 11, '88         91.34            91.34

Date Completed:  EXAMPLE                    Completed by     EXAMPLE

Use additional sheets if necessary.                                        Page 1 of 9

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
                                     B-16

-------
                       APPENDIX B: EXAMPLE WORKSHEETS


         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                 Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   4. d4 ft. southwest of treatment well
YEAR:      NUMBER(k)                       1988, k = 1

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Parameter to be tested (Check one) =  Mean [x]   Max [ ]
Number of samples per year = n =
Concentration used for observations below the detection limit = 10

"Season"    Sample    Sample Collection    Reported         Concentration Corrected
Number j    ID        time                 Concentration    for Detection Limit
1           41        Feb. 18, '88         76.50            76.50
2           42        Apr. 12, '88         71.28            71.28
3           43        June 16, '88         93.77            93.77
4           44        Aug. 15, '88         73.60            73.60
5           45        Oct. 12, '88         120.94           120.94
6           46        Dec. 11, '88         82.56            82.56

Date Completed:  EXAMPLE                    Completed by     EXAMPLE

Use additional sheets if necessary.                                        Page 2 of 9

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
                                      B-17

-------
                       APPENDIX B: EXAMPLE WORKSHEETS
         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                 Group; by Chemical. Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   5. d5 ft. southeast of treatment well
YEAR:      NUMBER(k)                       1988, k = 1

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Parameter to be tested (Check one) =  Mean [x]   Max [ ]
Number of samples per year = n =
Concentration used for observations below the detection limit = 10

"Season"    Sample    Sample Collection    Reported         Concentration Corrected
Number j    ID        time                 Concentration    for Detection Limit
1           51        Feb. 18, '88         62.68            62.68
2           52        Apr. 12, '88         92.49            92.49
3           53        June 16, '88         80.94            80.94
4           54        Aug. 15, '88         103.38           103.38
5           55        Oct. 12, '88         95.39            95.39
6           56        Dec. 11, '88         99.04            99.04

Date Completed:  EXAMPLE                    Completed by     EXAMPLE

Use additional sheets if necessary.                                        Page 3 of 9

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
                                      B-18

-------
                        APPENDIX  B:  EXAMPLE  WORKSHEETS
         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                  Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   3. d3 ft. north of treatment well
YEAR:      NUMBER(k)                       1989, k = 2

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Parameter to be tested (Check one) =  Mean [x]   Max [ ]
Number of samples per year = n =
Concentration used for observations below the detection limit = 10

"Season"    Sample    Sample Collection    Reported         Concentration Corrected
Number j    ID        time                 Concentration    for Detection Limit
1           31        Feb. 15, '89         87.11            87.11
2           32        Apr. 17, '89         78.38            78.38
3           33        June 14, '89         80.61            80.61
4           34        Aug. 18, '89         73.51            73.51
5           35        Oct. 15, '89         89.16            89.16
6           36        Dec. 13, '89         100.26           100.26

Date Completed:  EXAMPLE                    Completed by     EXAMPLE

Use additional sheets if necessary.                                        Page 4 of 9

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
                                      B-19

-------
                        APPENDIX B:  EXAMPLE  WORKSHEETS


         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                  Group; by Chemical. Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   4. d4 ft. southwest of treatment well
YEAR:      NUMBER(k)                       1989, k = 2

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Parameter to be tested (Check one) =  Mean [x]   Max [ ]
Number of samples per year = n =
Concentration used for observations below the detection limit = 10

"Season"    Sample    Sample Collection    Reported         Concentration Corrected
Number j    ID        time                 Concentration    for Detection Limit
1           41        Feb. 15, '89         82.34            82.34
2           42        Apr. 17, '89         85.69            85.69
3           43        June 14, '89         96.72            96.72
4           44        Aug. 18, '89         108.61           108.61
5           45        Oct. 15, '89         95.75            95.75
6           46        Dec. 13, '89         66.77            66.77

Date Completed:                             Completed by     EXAMPLE

Use additional sheets if necessary.                                        Page 5 of 9

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
                                       B-20

-------
                        APPENDIX B: EXAMPLE WORKSHEETS


         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                  Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2

SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   5. d5 ft. southeast of treatment well
YEAR:      NUMBER(k)                       1989, k = 2

Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Parameter to be tested (Check one) =  Mean [x]   Max [ ]
Number of samples per year = n =
Concentration used for observations below the detection limit = 10

"Season"    Sample    Sample Collection    Reported         Concentration Corrected
Number j    ID        time                 Concentration    for Detection Limit
1           51        Feb. 15, '89         80.05            80.05
2           52        Apr. 17, '89         81.44            81.44
3           53        June 14, '89         92.89            92.89
4           54        Aug. 18, '89         93.87            93.87
5           55        Oct. 15, '89         95.82            95.82
6           56        Dec. 13, '89         78.39            78.39

Date Completed:                             Completed by     EXAMPLE

Use additional sheets if necessary.                                        Page 6 of 9

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
                                      B-21

-------
                       APPENDIX B:  EXAMPLE WORKSHEETS
         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                  Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   3. d3 ft. north of treatment well
YEAR:      NUMBER(k)                       1990, k = 3

         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                        Parameter to be tested (Check one) =  Mean [x]   Max [ ]
                                          Number of samples per year = n =
                Concentration used for observations below the detection limit =  10

"Season"                Sample           Reported     Concentration
 Number     Sample      Collection       Concen-      Corrected for
   j          ID        time             tration      Detection Limit
   1          31        Feb. 16, '90       76.86          76.86
   2          32        Apr. 14, '90       76.38          76.38
   3          33        June 14, '90       87.46          87.46
   4          34        Aug. 17, '90       80.84          80.84
   5          35        Oct. 15, '90       71.65          71.65
   6          36        Dec. 15, '90       57.28          57.28

Date Completed:  EXAMPLE
                              Completed by     EXAMPLE
Use additional sheets if necessary.                                        Page _7_ of _9_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
                                       B-22

-------
                        APPENDIX B: EXAMPLE WORKSHEETS
         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                  Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   4. d4 ft. southwest of treatment well
YEAR:      NUMBER(k)                       1990, k = 3

         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                        Parameter to be tested (Check one) =  Mean [x]   Max [ ]
                                          Number of samples per year = n =
                Concentration used for observations below the detection limit =  10

"Season"                Sample           Reported     Concentration
 Number     Sample      Collection       Concen-      Corrected for
   j          ID        time             tration      Detection Limit
   1          41        Feb. 16, '90       87.85          87.85
   2          42        Apr. 14, '90       87.08          87.08
   3          43        June 14, '90       97.84          97.84
   4          44        Aug. 17, '90      105.95         105.95
   5          45        Oct. 15, '90       81.58          81.58
   6          46        Dec. 15, '90       87.76          87.76

Date Completed:  EXAMPLE
                              Completed by     EXAMPLE
Use additional sheets if necessary.                                        Page _8_ of _9_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
                                      B-23

-------
                        APPENDIX B: EXAMPLE WORKSHEETS
         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                  Group; by Chemical, Well and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]   5. d5 ft. southeast of treatment well
YEAR:      NUMBER(k)                       1990, k = 3

         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                        Parameter to be tested (Check one) =  Mean [x]   Max [ ]
                                          Number of samples per year = n =
                Concentration used for observations below the detection limit =  10

"Season"                Sample           Reported     Concentration
 Number     Sample      Collection       Concen-      Corrected for
   j          ID        time             tration      Detection Limit
   1          51        Feb. 16, '90       79.70          79.70
   2          52        Apr. 14, '90       59.32          59.32
   3          53        June 14, '90       66.64          66.64
   4          54        Aug. 17, '90       52.48          52.48
   5          55        Oct. 15, '90       91.63          91.63
   6          56        Dec. 15, '90       35.08          35.08

Date Completed:  EXAMPLE
                              Completed by     EXAMPLE
Use additional sheets if necessary.                                        Page _9_ of _9_

Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11
                                      B-24

-------
                       APPENDIX  B:  EXAMPLE  WORKSHEETS
       WORKSHEET 11 Data Records and Calculations When Assessing Wells as a Group; by Chemical and
                                             Year
See Chapter 8 or 9 in "Statistical Methods for Evaluating the Attainment of Superfund Cleanup Standards", Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
YEAR:      NUMBER(k)                       1988, k = 1
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [ ]   Sequential Sampling [x]
                                       Parameter to be tested (Check one) =  Mean [x]   Max [ ]
                                      Number of samples per year [9] = n =

                                                                       Measure for
                                                                       analysis
"Season"      Well #3    Well #4    Well #5    Well #_    Well #_     (row maximum
Number [10]    [10]       [10]       [10]       [10]       [10]       or row mean)
    j          x_jk       x_jk       x_jk       x_jk       x_jk         x_jk
    1          88.71      76.50      62.68                              75.96
    2          89.38      71.28      92.49                              84.38
    3          74.92      93.77      80.94                              83.21
    4          80.03      73.60     103.38                              85.67
    5          89.98     120.94      95.39                             102.10
    6          91.34      82.56      99.04                              90.98

                                        Total of x_jk for this year = A =  522.30
                                   Mean of x_jk for this year = x̄_k = A/n =  87.05
Date Completed:   EXAMPLE
                       Completed by  EXAMPLE
Use additional sheets if necessary.
                                              Page _1_ of ___
Complete WORKSHEET 11 for other chemicals; otherwise,
Continue to WORKSHEET 12 if a fixed sample size test is used; or
Continue to WORKSHEET 14 if a sequential sample test is used.
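
The arithmetic on this Worksheet 11 page can be retraced with a short script. The sketch below is illustrative only: the function name `season_measures` and the variable names are hypothetical, and the two-decimal rounding simply mirrors the values displayed on the example page (Appendix C recommends carrying at least four decimal places in real calculations).

```python
# Worksheet 11 arithmetic for Site ABC, 1988 (k = 1).
# Concentrations are copied from the three well columns of the example page.
well_columns = [
    [88.71, 89.38, 74.92, 80.03, 89.98, 91.34],   # first well column
    [76.50, 71.28, 93.77, 73.60, 120.94, 82.56],  # second well column
    [62.68, 92.49, 80.94, 103.38, 95.39, 99.04],  # third well column
]


def season_measures(columns, use_max=False):
    """Return one value per season: the row maximum or the row mean across wells."""
    measures = []
    for row in zip(*columns):
        value = max(row) if use_max else sum(row) / len(row)
        measures.append(round(value, 2))  # the example page shows two decimals
    return measures


x_jk = season_measures(well_columns)          # "Mean" is checked in the example
total_A = round(sum(x_jk), 2)                 # Total of x_jk for this year = A
yearly_mean = round(total_A / len(x_jk), 2)   # Mean for this year = A / n

print(x_jk)
print(total_A, yearly_mean)
```

Passing `use_max=True` would give the row maximum instead, which corresponds to checking "Max" as the parameter to be tested.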
                                      B-25

-------
                       APPENDIX B: EXAMPLE WORKSHEETS


        WORKSHEET 11 Data Records and Calculations When Assessing Wells as a Group; by Chemical and
                                             Year
See Chapter 8 or 9 in "Statistical Methods for Evaluating the Attainment of Superfund Cleanup Standards", Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
YEAR:      NUMBER(k)                       1989, k = 2
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [ ]   Sequential Sampling [x]
                                        Parameter to be tested (Check one) =  Mean [x]   Max [ ]
                                       Number of samples per year [9] = n =

                                                                       Measure for
                                                                       analysis
"Season"      Well #3    Well #4    Well #5    Well #_    Well #_     (row maximum
Number [10]    [10]       [10]       [10]       [10]       [10]       or row mean)
    j          x_jk       x_jk       x_jk       x_jk       x_jk         x_jk
    1          87.11      82.34      80.05                              83.17
    2          78.38      85.69      81.44                              81.84
    3          80.61      96.72      92.89                              90.07
    4          73.51     108.61      93.87                              92.00
    5          89.16      95.75      95.82                              93.58
    6         100.26      66.77      78.39                              81.81

                                         Total of x_jk for this year = A =  522.47
                                   Mean of x_jk for this year = x̄_k = A/n =  87.08
Date Completed:   EXAMPLE
                                               Completed by   EXAMPLE
Use additional sheets if necessary.
                                                                       Page _2_ of ___
Complete WORKSHEET 11 for other chemicals; otherwise,
Continue to WORKSHEET 12 if a fixed sample size test is used; or
Continue to WORKSHEET 14 if a sequential sample test is used.
                                      B-26

-------
                       APPENDIX  B:  EXAMPLE  WORKSHEETS


        WORKSHEET 11 Data Records and Calculations When Assessing Wells as a Group; by Chemical and
                                             Year
See Chapter 8 or 9 in "Statistical Methods for Evaluating the Attainment of Superfund Cleanup Standards", Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
YEAR:      NUMBER(k)                       1990, k = 3
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [ ]   Sequential Sampling [x]

                                       Parameter to be tested (Check one) =  Mean [x]   Max [ ]

                                      Number of samples per year [9] = n =

                                                                       Measure for
                                                                       analysis
"Season"      Well #3    Well #4    Well #5    Well #_    Well #_     (row maximum
Number [10]    [10]       [10]       [10]       [10]       [10]       or row mean)
    j          x_jk       x_jk       x_jk       x_jk       x_jk         x_jk
    1          76.86      87.85      79.70                              81.47
    2          76.38      87.08      59.32                              74.26
    3          87.46      97.84      66.64                              83.98
    4          80.84     105.95      52.48                              79.76
    5          71.65      81.58      91.63                              81.62
    6          57.28      87.76      35.08                              60.04

                                        Total of x_jk for this year = A =  461.13
                                   Mean of x_jk for this year = x̄_k = A/n =  76.86
Date Completed:   EXAMPLE
                       Completed by   EXAMPLE
Use additional sheets if necessary.
                                              Page ___ of ___
Complete WORKSHEET 11 for other chemicals; otherwise,
Continue to WORKSHEET 12 if a fixed sample size test is used; or
Continue to WORKSHEET 14 if a sequential sample test is used.
                                      B-27

-------
                        APPENDIX B: EXAMPLE WORKSHEETS
        WORKSHEET 14a Data Calculations for a Sequential Sample When Assessing Wells as a Group; by
                                            Chemical
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                                Cleanup standard [8] = Cs =  100
                                                     Alternate mean = μ1 =  75

                 Probability of mistakenly declaring the well(s) clean [8] = α =
         Probability of mistakenly declaring the well(s) contaminated [8] = β =

 Year      Yearly     Cumulative     Cumulative      Mean                 Standard
Number     Average    Sum of x̄_k     Sum of x̄_k²     (average of          Deviation of Mean
 [11]       [11]      (A_0 = 0)      (B_0 = 0)       yearly averages)     s_x̄m = sqrt[(B_k - A_k²/k) / ((k-1)k)]
   k        x̄_k         A_k            B_k           x̄_m = A_k/k
   1        87.05       87.05        7,577.70          87.0500                 -
   2        87.08      174.13       15,160.63          87.0650                 -
   3        76.86      250.99       21,068.09          83.6633               3.402

                                  Carry as many significant figures as possible

Date Completed:
                                   Completed by      EXAMPLE
Use additional sheets if necessary.

Complete WORKSHEET 14a and 14b for other chemicals and groups of wells
                                                          Page ___ of ___
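
The running totals on Worksheet 14a can be reproduced as follows. This is a minimal sketch, not the guidance document's own procedure: the function name `sequential_summary` and the `min_years` parameter are hypothetical, and `min_years=3` is an assumption chosen only to mirror the example page, which leaves the standard deviation blank for years 1 and 2. The standard deviation formula is the one implied by the worksheet values, sqrt[(B_k - A_k²/k) / ((k-1)k)].

```python
import math


def sequential_summary(yearly_means, min_years=3):
    """Accumulate A_k, B_k, the running mean, and the standard deviation of the mean."""
    rows = []
    a_k = 0.0  # cumulative sum of yearly averages (A_0 = 0)
    b_k = 0.0  # cumulative sum of squared yearly averages (B_0 = 0)
    for k, xbar_k in enumerate(yearly_means, start=1):
        a_k += xbar_k
        b_k += xbar_k ** 2
        mean = a_k / k
        if k >= min_years:
            # max(..., 0.0) guards against a tiny negative value from rounding
            spread = max(b_k - a_k ** 2 / k, 0.0)
            sd_of_mean = math.sqrt(spread / ((k - 1) * k))
        else:
            sd_of_mean = None  # left blank on the example page
        rows.append((k, xbar_k, a_k, b_k, mean, sd_of_mean))
    return rows


# Yearly averages for Site ABC taken from the Worksheet 11 pages.
rows = sequential_summary([87.05, 87.08, 76.86])
for row in rows:
    print(row)
```

Because the squared terms are nearly equal to A_k²/k, the subtraction loses precision quickly, which is why the worksheet asks for as many significant figures as possible.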
                                      B-28

-------
                       APPENDIX B: EXAMPLE WORKSHEETS


        WORKSHEET 14b Data Calculations for a Sequential Sample When Assessing Wells as a Group; by
                                           Chemical
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [8]   1. Hazardous #1
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                                                 Critical   Critical       Decision:
 Year                                                            value:     value:         clean: LR > B,
Number     δ =               t =                    Likelihood   clean      contaminated   contaminated: LR ≤ A,
 [14a]     (μ1 - Cs)/s_x̄m    (x̄_m - (Cs+μ1)/2)/s_x̄m   ratio LR                                 or no decision
   1
   2
   3         -7.349            -1.128
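
The two values entered on Worksheet 14b for year 3 can be checked against Worksheet 14a. The sketch below assumes the formulas δ = (μ1 - Cs)/s_x̄m and t = (x̄_m - (Cs + μ1)/2)/s_x̄m; these are inferred because they reproduce the worksheet entries, and Chapter 9 of the guidance remains the authoritative definition. The function name `sequential_test_inputs` is hypothetical.

```python
def sequential_test_inputs(mean_of_means, sd_of_mean, cleanup_standard, alternate_mean):
    """Return (delta, t) as entered on Worksheet 14b (formulas assumed, see lead-in)."""
    delta = (alternate_mean - cleanup_standard) / sd_of_mean
    midpoint = (cleanup_standard + alternate_mean) / 2
    t = (mean_of_means - midpoint) / sd_of_mean
    return delta, t


# Year 3 entries from Worksheet 14a: mean 83.6633, standard deviation of mean 3.402,
# cleanup standard Cs = 100, alternate mean mu1 = 75.
delta, t = sequential_test_inputs(83.6633, 3.402, 100, 75)
print(round(delta, 3), round(t, 3))
```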
-------
                       APPENDIX B: EXAMPLE WORKSHEETS


        WORKSHEET 15 Removing Seasonal Patterns in the Data (Use as First Step in Computing Serial
                                         Correlations)
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:      Site DEF (data independent of five-well example)
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2 OR 8]   1. Chemical #1
WELL:      NUMBER(w) AND DESCRIPTION [1]        1. d1 ft. south of treatment well
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                                                Number of
"Season"     Measurements for each "season" for year k          years with    Row        Row
 Number    Yr=_1_   Yr=_2_   Yr=___   Yr=___   Yr=___           Data          Total      Mean
    j       x_jk     x_jk     x_jk     x_jk     x_jk            m_j           Σ x_jk     x̄_j = (Σ x_jk)/m_j
    1       120      133                                         2             253        126.5
    2       163      117                                         2             280        140
    3       128      113                                         2             241        120.5
    4       150      126                                         2             276        138
    5       125      114                                         2             239        119.5
    6       110      145                                         2             255        127.5

Corrected measurements with seasonal patterns removed
"Season"     Corrected Measurements for each "season" for year k
 Number    Yr=_1_       Yr=_2_       Yr=___       Yr=___       Yr=___
    j      x_jk - x̄_j   x_jk - x̄_j   x_jk - x̄_j   x_jk - x̄_j   x_jk - x̄_j
    1        -6.5          6.5
    2        23          -23
    3         7.5         -7.5
    4        12          -12
    5         5.5         -5.5
    6       -17.5         17.5

Date Completed:   EXAMPLE
                                              Completed by   EXAMPLE
Use additional sheets if necessary.

Complete WORKSHEET 15 for other chemicals
Continue to WORKSHEET 16 if serial correlations are being computed.
                                                                     Page ___ of ___
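
The seasonal correction on Worksheet 15 amounts to subtracting each season's mean across years from every measurement in that season. A minimal sketch follows; the function name `remove_seasonal_means` is hypothetical, missing measurements are represented by `None` (an assumption of this sketch), and every season is assumed to have at least one measurement.

```python
def remove_seasonal_means(by_year):
    """by_year maps year number k to a list of n seasonal measurements (None if missing)."""
    years = sorted(by_year)
    n_seasons = len(by_year[years[0]])
    season_means = []
    for j in range(n_seasons):
        observed = [by_year[k][j] for k in years if by_year[k][j] is not None]
        # m_j = len(observed) is the number of years with data for season j
        season_means.append(sum(observed) / len(observed))
    residuals = {
        k: [None if x is None else x - season_means[j]
            for j, x in enumerate(by_year[k])]
        for k in years
    }
    return season_means, residuals


# Site DEF measurements from the Worksheet 15 example.
site_def = {
    1: [120, 163, 128, 150, 125, 110],
    2: [133, 117, 113, 126, 114, 145],
}
season_means, residuals = remove_seasonal_means(site_def)
print(season_means)
print(residuals[1])
print(residuals[2])
```

With only two years of data, the year 2 residuals are the negatives of the year 1 residuals, as the example page shows.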
                                     B-30

-------
                       APPENDIX B: EXAMPLE WORKSHEETS
       WORKSHEET 16 Calculating Serial Correlations
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:      Site DEF (data independent of five-well example)
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2 OR 8]   1. Chemical #1
WELL:      NUMBER(w) AND DESCRIPTION [1]        1. d1 ft. south of treatment well
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                                              Year = k =

                                 Period between well samples in months = t =

 Data Numbers                        Product of
 (season j within     Residual       residual and        Residual
  year k)              [15]          next residual       squared
   j   k
   1   1               -6.5           -149.50              42.25
   2   1               23.0            172.50             529.00
   3   1                7.5             90.00              56.25
   4   1               12.0             66.00             144.00
   5   1                5.5            -96.25              30.25
   6   1              -17.5               -               306.25

         Totals from previous page =
              (if more than one
              Worksheet 16 is used)
                   Column Totals =     A  82.75           B  1,108

Estimated Serial Correlation based on the data = φ̂_obs = A/B =
Serial Correlation between monthly observations = φ̂ = (φ̂_obs)^(1/t) =
Date Completed:  EXAMPLE

Use additional sheets if necessary.

Complete WORKSHEET 16 for other chemicals
                        Completed by    EXAMPLE
                                               Page ___ of ___
                                      B-31

-------
                        APPENDIX B: EXAMPLE WORKSHEETS
       WORKSHEET 16 Calculating Serial Correlations
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:      Site DEF (data independent of five-well example)
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2 OR 8]   1. Chemical #1
WELL:      NUMBER(w) AND DESCRIPTION [1]        1. d1 ft. south of treatment well
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                                              Year = k =  2

                                 Period between well samples in months = t =  2

 Data Numbers                        Product of
 (season j within     Residual       residual and        Residual
  year k)              [15]          next residual       squared
   j   k
   1   2                6.5           -149.50              42.25
   2   2              -23.0            172.50             529.00
   3   2               -7.5             90.00              56.25
   4   2              -12.0             66.00             144.00
   5   2               -5.5            -96.25              30.25
   6   2               17.5               -               306.25

         Totals from previous page =       82.75              1,108
              (if more than one
              Worksheet 16 is used)
                   Column Totals =     A  165.5           B  2,216

Estimated Serial Correlation based on the data = φ̂_obs = A/B =  .0747
Serial Correlation between monthly observations = φ̂ = (φ̂_obs)^(1/t) =  .2733
Date Completed:  EXAMPLE                Completed by    EXAMPLE

Use additional sheets if necessary.

Complete WORKSHEET 16 for other chemicals
                                               Page _2_ of _2_
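
The two Worksheet 16 pages can be combined in a few lines of code. The sketch below follows the example pages as printed: products are formed between consecutive residuals within each year only (the pages show no product linking the last sample of year 1 to the first sample of year 2), and the monthly correlation is the observed correlation raised to the power 1/t. The function name `serial_correlation` is hypothetical, and returning `None` for a non-positive estimate is an assumption of this sketch, since the example does not cover that case.

```python
def serial_correlation(residuals_by_year, months_between_samples):
    """Return (A, B, phi_obs, phi_monthly) following the Worksheet 16 layout."""
    total_products = 0.0  # column total A
    total_squares = 0.0   # column total B
    for residuals in residuals_by_year:
        total_products += sum(x * y for x, y in zip(residuals, residuals[1:]))
        total_squares += sum(x * x for x in residuals)
    phi_obs = total_products / total_squares
    if phi_obs > 0:
        phi_monthly = phi_obs ** (1.0 / months_between_samples)
    else:
        phi_monthly = None  # not covered by the worked example
    return total_products, total_squares, phi_obs, phi_monthly


# Residuals from Worksheet 15 (Site DEF), sampled every t = 2 months.
year_1 = [-6.5, 23.0, 7.5, 12.0, 5.5, -17.5]
year_2 = [6.5, -23.0, -7.5, -12.0, -5.5, 17.5]
A, B, phi_obs, phi_monthly = serial_correlation([year_1, year_2], 2)
print(A, B, round(phi_obs, 4), round(phi_monthly, 4))
```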
                                       B-32

-------
                       APPENDIX B: EXAMPLE WORKSHEETS


       WORKSHEET 1R Basic Calculations for a Simple Linear Regression
See Section 6.1 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2 OR 8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]        1. d1 ft. northeast of treatment well
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

   Concentration used when no concentration is reported =

                   Number of collectable samples = N =

          Concentration                     Transformed
Sample    Corrected for                     Time
Number    Detection Limit                   Variable
   n          y_n            y_n²              x_n         x_n²        y_n·x_n
   1         90.17         8,130.63             1            1          90.17
   2         83.00         6,889.00             2            4         166.00
   3         66.50         4,422.25             3            9         199.50

Totals from previous page(s):

Column Totals:
          A  239.67     B  19,441.88        C  6         D  14       E  455.67
          A = Σy_n      B = Σy_n²           C = Σx_n     D = Σx_n²   E = Σy_n·x_n

Corrected Sum of Squares and Cross Products:
          ȳ = A/N = 79.89               x̄ = C/N = 2
          Syy = B - A²/N = 294.64
          Sxx = D - C²/N = 2
          Syx = E - A·C/N = -23.67

Date Completed:  EXAMPLE                Completed by  EXAMPLE
Use additional sheets if necessary.
                                                            Page ___ of ___
Complete WORKSHEET 1R for other chemicals or continue to WORKSHEET 2R.
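
The column totals and corrected sums on Worksheet 1R follow from the standard computing formulas for simple linear regression. The sketch below reproduces them; the function name `regression_sums` and the dictionary keys are hypothetical labels chosen to match the worksheet letters.

```python
def regression_sums(y, x):
    """Return the Worksheet 1R totals and corrected sums of squares and cross products."""
    n = len(y)
    A = sum(y)                              # A = sum of y_n
    B = sum(v * v for v in y)               # B = sum of y_n squared
    C = sum(x)                              # C = sum of x_n
    D = sum(v * v for v in x)               # D = sum of x_n squared
    E = sum(a * b for a, b in zip(y, x))    # E = sum of y_n * x_n
    return {
        "A": A, "B": B, "C": C, "D": D, "E": E,
        "y_mean": A / n,
        "x_mean": C / n,
        "Syy": B - A * A / n,
        "Sxx": D - C * C / n,
        "Syx": E - A * C / n,
    }


# Concentrations and time values from the Worksheet 1R example.
sums = regression_sums([90.17, 83.00, 66.50], [1, 2, 3])
for name, value in sums.items():
    print(name, round(value, 2))
```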
                                       B-33

-------
                       APPENDIX B: EXAMPLE WORKSHEETS
                       WORKSHEET 2R Inference in a Simple Linear Regression
See Section 6.1 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
SITE:      Site ABC
CHEMICAL:  NUMBER(c) AND DESCRIPTION [2 OR 8]   1. Hazardous #1
WELL:      NUMBER(w) AND DESCRIPTION [1]        1. d1 ft. northeast of treatment well
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Estimating Regression Coefficients
   Syy [1R] =  294.64                  Number of collectable samples [1R] = N =  3
   Sxx [1R] =  2                                     Mean of y_n [1R] = ȳ =  79.89
   Syx [1R] = -23.67                                 Mean of x_n [1R] = x̄ =  2
   Type I error probability α =  .1

                                    Estimated slope [1R], b1 = Syx/Sxx =  -11.84
                             Estimated intercept [1R], b0 = ȳ - (b1·x̄) =  103.57
             Sum of squares due to error [1R], SSE = Syy - (Syx)²/Sxx =  14.51
                                          Degrees of freedom, Df = N - 2 =  1
   Critical value from table of t-distribution (Appendix A.1)
                     for specified values of (1 - α/2) and Df = t =  6.314
                                 Mean Square Error, MSE = SSE/(N - 2) =  14.51
                       Standard error of the slope, s(b1) = sqrt(MSE/Sxx) =  2.69
   Upper Two Sided Confidence Interval for Slope: b1 + t·s(b1) =  5.14
   Lower Two Sided Confidence Interval for Slope: b1 - t·s(b1) =  -28.82

Calculating Prediction Limits
                 Value of x at which concentration is to be predicted =  2.5
                                      Predicted value, ŷ = b0 + b1·x =  73.97
   Standard Error of Predicted Value,
                      s_ŷ = sqrt[MSE(1 + 1/N + (x - x̄)²/Sxx)] =  4.6000
   Upper Two Sided Confidence Interval for Prediction = ŷ + t·s_ŷ =  103.01
   Lower Two Sided Confidence Interval for Prediction = ŷ - t·s_ŷ =  44.93

Date Completed:  EXAMPLE                Completed by  EXAMPLE
Use additional sheets if necessary.                              Page ___ of ___
Complete WORKSHEET 2R for other chemicals
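
The inference steps on Worksheet 2R can be retraced from the Worksheet 1R results. This is a sketch under stated assumptions: the function name `regression_inference` and the result keys are hypothetical, and the critical value t = 6.314 is passed in as read from the t table rather than computed. Because the worksheet rounds intermediate values (for example b1 = -11.84 and s(b1) = 2.69), the unrounded results below differ from the printed entries in the second decimal place for a few lines.

```python
import math


def regression_inference(n, y_mean, x_mean, syy, sxx, syx, t_crit, x_new):
    """Slope, intercept, error terms, and interval limits as laid out on Worksheet 2R."""
    b1 = syx / sxx                           # estimated slope
    b0 = y_mean - b1 * x_mean                # estimated intercept
    sse = syy - syx ** 2 / sxx               # sum of squares due to error
    mse = sse / (n - 2)                      # mean square error, Df = N - 2
    s_b1 = math.sqrt(mse / sxx)              # standard error of the slope
    y_hat = b0 + b1 * x_new                  # predicted value at x_new
    s_pred = math.sqrt(mse * (1 + 1 / n + (x_new - x_mean) ** 2 / sxx))
    return {
        "b1": b1, "b0": b0, "SSE": sse, "MSE": mse, "s_b1": s_b1,
        "slope_upper": b1 + t_crit * s_b1,
        "slope_lower": b1 - t_crit * s_b1,
        "y_hat": y_hat, "s_pred": s_pred,
        "pred_upper": y_hat + t_crit * s_pred,
        "pred_lower": y_hat - t_crit * s_pred,
    }


# Inputs from Worksheets 1R and 2R; t_crit is the table value for 1 - alpha/2 = .95, Df = 1.
result = regression_inference(n=3, y_mean=79.89, x_mean=2, syy=294.64, sxx=2,
                              syx=-23.67, t_crit=6.314, x_new=2.5)
for name, value in result.items():
    print(name, round(value, 4))
```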
                                     B-34

-------
                       APPENDIX C:  BLANK WORKSHEETS
              The worksheets in this appendix can be photocopied when needed. The copies
may then be used in their current form or modified, as appropriate. They may be employed to
document the objectives and decisions, record data, and make calculations to determine if the
ground water at the site attains the cleanup standard. These worksheets are referred to in the
main text of this document. Appendix B provides examples of how to fill out the worksheets.


              The initial appearance of a "Bold" letter in a worksheet represents an intermediate
computation, the result of which will be used in a later computation and will also be signified by
the letter in "Bold" script.
              To maintain adequate precision in doing the computations appearing in the
worksheets (particularly in the calculation of estimated variances, standard deviations, or standard
errors), the number of decimal places retained should be as high as possible, with a minimum of
four.
                                       C-l

-------
           APPENDIX C: BLANK WORKSHEETS
Table C.1  Summary of Notation Used in Appendix C

 Symbol      Definition

 m           The number of years for which data were collected (usually the
             analysis will be performed with full years' worth of data)
 n           The number of sample measurements per year (for monthly data, n
             = 12; for quarterly data, n = 4). This is also referred to as the
             number of "seasons" per year
 N           The total number of sample measurements (if there are no missing
             observations, N = mn)
 index i     Indicates the order in which the ground-water samples are collected
 index k     Indicates the year in which the ground-water samples are collected
 index j     Indicates the season or time within the year at which the
             ground-water samples are collected
 index c     Indicates the chemical analyzed
 index w     Indicates the well sampled
 x_i         Contaminant measurement for the ith ground-water sample
 x_jk        An alternative way of denoting a contaminant measurement, where k
             = 1, 2, ..., m denotes the year, and j = 1, 2, ..., n denotes the
             sampling period (season) within the year. The subscript for x_jk is
             related to the subscript for x_i in the following manner: i = (k-1)n +
             j.
 x̄_k         The mean (or average) of the contaminant measurements for year k
             (see Boxes 8.5 and 9.4)
 x̄_m         The mean of the yearly averages for years k = 1 to m.
 s_x̄         The standard deviation of the yearly average contaminant
             concentrations from m years of sample collection (see Boxes 8.7
             and 9.6)
 s_x̄m        The standard error of the mean of the yearly means (see Boxes 8.9
             and 9.8)
 Cs          The designated cleanup standard
 Df          The degrees of freedom associated with the standard error of an
             estimate (see Boxes 8.7 and 9.6)
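
The relation i = (k-1)n + j between the two subscripting schemes in Table C.1 can be illustrated with a pair of small helper functions. Both function names are hypothetical; indices are 1-based, as in the table.

```python
def sample_index(k, j, n):
    """Collection order i for the sample taken in season j of year k (n seasons per year)."""
    return (k - 1) * n + j


def year_and_season(i, n):
    """Inverse mapping: recover (k, j) from the collection order i."""
    k_minus_1, j_minus_1 = divmod(i - 1, n)
    return k_minus_1 + 1, j_minus_1 + 1


# With n = 6 samples per year, the third sample of year 2 is the ninth sample overall.
print(sample_index(k=2, j=3, n=6))
print(year_and_season(9, n=6))
```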
                         C-2

-------
                         APPENDIX C: BLANK WORKSHEETS


                                  WORKSHEET 1  Sampling Wells

See Section 3.2 in "Statistical Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:

Sample
  Well
Number      Describe each sampling well to be used to assess attainment
   w
Decision Criteria: Wells assessed (Check one)  Individually [ ]  As a Group [ ]

Use the Sampling Well Number (w) to refer on subsequent sheets to the sampling wells described
above.

Attach a map showing the sampling wells within the waste site.

Date Completed:                          Completed by

Use additional sheets if necessary.                                       Page ___ of ___


Continue to WORKSHEET 2 if wells are assessed individually.
Continue to WORKSHEET 8 if wells are assessed as a group.
                                      C-3

-------
                          APPENDIX C:  BLANK WORKSHEETS


                     WORKSHEET 2 Attainment Objectives for Assessing Individual Wells
          See Chapter 3 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
          Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [ ]  Sequential Sampling [ ]
                     Probability of mistakenly declaring the well(s) clean = α =

            Probability of mistakenly declaring the well(s) contaminated = β =

                                                        If Mean,       If %tile, enter:
                        Cleanup                         enter          Critical   Null         Alternate
Chemical   Chemical     Standard       Parameter        alternate      %tile      hypothesis   hypothesis
Number     Name         (with units)   to test          hypothesis
   c                    Cs             (Check one)      mean μ1                   P0           P1

                                       Mean [ ]  %tile [ ]
                                       Mean [ ]  %tile [ ]
                                       Mean [ ]  %tile [ ]
                                       Mean [ ]  %tile [ ]

Sample Collection Procedures to be used (attach separate sheet if necessary):

Secondary Objectives / Other purposes for which the data is to be collected:

Use the Chemical Number (c) to refer on other sheets to the chemical described above.
Attach documentation describing the lab analysis procedure for each chemical.
Date Completed:                          Completed by
Use additional sheets if necessary.                                       Page ___ of ___
Continue to WORKSHEET 3 if a fixed sample size test is used; or
Continue to WORKSHEET 4 if a sequential sample test is used.
                                        C-4

-------
                         APPENDIX C: BLANK WORKSHEETS
      WORKSHEET 3 Sample Size When Using a Fixed Sample Size Test for Assessing Individual Wells
See Section 8.2 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                                                                  From Table A.2
         Probability of mistakenly declaring the site clean [2] = α =
  Probability of mistakenly declaring the site contaminated [2] = β =              z_(1-β) =

                              Number of samples per year = n =

            Variance factor from Table A.5, Appendix A = F¹ =

For testing the mean concentration
 Chemical     Cleanup                    Standard Deviation      Calculate:
Number [2]    Standard [2]     [2]       of yearly mean
     c            Cs           μ1              σ̂                 B            m_d
                                         (based on calculations
                                          described in Section 8.2)

For testing the proportion of contaminated wells or samples
 Chemical     Cleanup                                             Calculate:
Number [2]    Standard [2]     [2]        [2]
     c            Cs           P0         P1                      B            m_d


Column Maximum (Maximum of m_d values) = C =

Round C to next largest integer = Number of years of sample collection = m =

                                   Total number of samples = nm = N =

Date Completed:                          Completed by
Use additional sheets if necessary.                                       Page ___ of ___
Continue to WORKSHEET 4
¹ An estimate of φ, the serial correlation, is necessary to determine the appropriate value of F. Worksheets 15 and
  16 can be used to estimate φ.
                                      C-5

-------
                         APPENDIX C: BLANK WORKSHEETS
        WORKSHEET 4 Data Records and Calculations When Assessing Individual Wells; by Chemical. Well,
                                            and Year
 Se« fhanier it or Q in "Methods for Evaluating the Attainment of Cleanim Standards" Vol 2
SITE:
CHEMICAL:
WELL:
YEAR:
NUMBBt(c) AND DESCRIPTION [2]
raWBER(w) AND DESCRIPTION [I]
NUMBER(K)
         Numbers in square brackets Q refer to the Worksheet from which the information may be obtained.
Sample Design (Check one):  Fixed Sample Size [ ]  Sequential Sampling [ ]
                              Parameter to be tested [2] (Check one) =  Mean [ ]  %tile [ ]
                                   Number of samples per year [3] =
                    Number of samples with nonmissing data in year =
                                         Cleanup standard [2] = Cs =
         Concentration used for observations below the detection limit =

 "Season"                 Sample                    Concentration     Is A Greater      Data for
  Number      Sample    Collection     Reported     Corrected for       than Cs?        analysis
 j within       ID       date/time   Concentration Detection Limit      1 = Yes     $x_{jk}$ = A if Mean
 this year                                                              0 = No      $x_{jk}$ = B if %tile
                                                          A                B




























.











.











































Total of $x_{jk}$ for this year = C =
Mean of $x_{jk}$ for this kth year =
Date Completed:                      Completed by
Use additional sheets if necessary.                         Page ___ of ___
Complete WORKSHEET 4 for other chemicals, years, and wells; otherwise,
Continue to WORKSHEET 5 if a fixed sample size test is used; or
Continue to WORKSHEET 7 if a sequential sample test is used.
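
The column logic of Worksheet 4 can be sketched in a few lines of Python. The sketch is illustrative only: the function name `worksheet4_year`, the use of `None` to mark an observation reported below the detection limit, and the sample values are assumptions made for this example and are not part of the worksheet.

```python
from typing import Optional, Sequence


def worksheet4_year(reported: Sequence[Optional[float]],
                    below_dl_value: float,
                    cs: float,
                    parameter: str = "mean") -> dict:
    """Hypothetical helper mirroring the Worksheet 4 columns for one well and year.

    reported       -- reported concentrations; None marks a value below the
                      detection limit (an assumption of this sketch)
    below_dl_value -- concentration used for observations below the detection limit
    cs             -- cleanup standard Cs
    parameter      -- "mean" or "percentile" (the parameter being tested)
    """
    # Column A: concentration corrected for the detection limit.
    a = [below_dl_value if r is None else r for r in reported]
    # Column B: 1 if A is greater than Cs, otherwise 0.
    b = [1 if v > cs else 0 for v in a]
    # Data for analysis: A when testing the mean, B when testing a percentile.
    x = a if parameter == "mean" else b
    total = sum(x)
    return {"A": a, "B": b, "x": x, "total": total, "mean": total / len(x)}


result = worksheet4_year([12.0, None, 8.5, 15.0], below_dl_value=1.0, cs=10.0)
print(result["total"], result["mean"])
```

Passing `parameter="percentile"` switches the data for analysis to the 0/1 indicator column, so the yearly mean becomes the proportion of samples exceeding Cs.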

        WORKSHEET 5 Data Calculations for a Fixed Sample Size Test When Assessing Individual Wells; by
                                       Chemical and Well
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2

SITE:
CHEMICAL:    NUMBER(c) AND DESCRIPTION [2]
WELL:        NUMBER(w) AND DESCRIPTION [1]
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

               Year              Mean
              Number            for the
                               year [4]

Total from previous page
(if more than one Worksheet 5 is used)

Column Totals:                                 B

Date Completed:                    Completed by
Use additional sheets if necessary.                    Page ___ of ___

Complete WORKSHEET 5 for other chemicals and wells or continue to WORKSHEET 6
        WORKSHEET 6 Inference for Fixed Sample Size Tests When Assessing Individual Wells, by Chemical
                                            and Well
 See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
CHEMICAL:    NUMBER(c) AND DESCRIPTION [2]
WELL:        NUMBER(w) AND DESCRIPTION [1]
          Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                                     [2]  $\alpha$ =
                                                     [2]  Cs =
                                     Number of Years [3]  $m$ =
                             Sum of the yearly means [5]  $A = \sum \bar{x}_k$ =
                     Sum of the squared yearly means [5]  $B = \sum (\bar{x}_k)^2$ =
                              Overall mean concentration  $\bar{x} = A/m$ =
                  Standard Deviation of the yearly means  $s_{\bar{x}} = \sqrt{(B - m\bar{x}^2)/(m-1)}$ =
                    Degrees of Freedom for $s_{\bar{x}}$  Df = $m - 1$ =
          Critical value from table of the t-distribution
   (Table A.1) for specified values of ($1 - \alpha$) and Df  $t$ =
                     Standard Error for the overall mean  $s_{\bar{x}}/\sqrt{m}$ =
                     Upper One Sided Confidence Interval  $\mu_{U\alpha} = \bar{x} + t\,s_{\bar{x}}/\sqrt{m}$ =
      If $\mu_{U\alpha}$ < Cs then circle Clean, otherwise circle Contaminated:
              Based on the mean concentration, the sampling well is:   Clean   Contaminated
Date Completed:                    Completed by
Complete WORKSHEET 6 for other chemicals and wells                    Page ___ of ___
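
The arithmetic of Worksheet 6 can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the function name `upper_confidence_limit` is hypothetical, the critical value `t_crit` is supplied by the user from Table A.1 rather than computed, and the yearly means shown are made-up numbers.

```python
import math
from typing import Sequence


def upper_confidence_limit(yearly_means: Sequence[float], t_crit: float) -> float:
    """Upper one-sided confidence limit on the mean of the yearly means.

    t_crit is the t-distribution critical value for (1 - alpha) and
    Df = m - 1, looked up by the user (this sketch does not compute it).
    """
    m = len(yearly_means)
    if m < 2:
        raise ValueError("at least two yearly means are needed")
    a = sum(yearly_means)                   # sum of the yearly means
    b = sum(x * x for x in yearly_means)    # sum of the squared yearly means
    mean = a / m                            # overall mean concentration
    # Standard deviation of the yearly means; max() guards against a tiny
    # negative value caused by floating-point rounding.
    s = math.sqrt(max(0.0, b - m * mean * mean) / (m - 1))
    se = s / math.sqrt(m)                   # standard error of the overall mean
    return mean + t_crit * se


# Three hypothetical yearly means; t = 2.920 for 1 - alpha = 0.95 and Df = 2.
ucl = upper_confidence_limit([4.0, 5.0, 6.0], t_crit=2.920)
cs = 7.0  # hypothetical cleanup standard
print("Clean" if ucl < cs else "Contaminated")
```

The final comparison follows the decision rule on the worksheet: the well is judged clean only when the upper confidence limit falls below Cs.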
        WORKSHEET 7a Data Calculations for a Sequential Sample When Assessing Wells Individually; by
                                       Chemical and Well
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
SITE:
CHEMICAL:    NUMBER(c) AND DESCRIPTION [2]
WELL:        NUMBER(w) AND DESCRIPTION [1]
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                               Cleanup standard [2] = Cs =
                                                   Alternate mean = $\mu_1$ =

                Probability of mistakenly declaring the well(s) clean [2] = $\alpha$ =
        Probability of mistakenly declaring the well(s) contaminated [2] = $\beta$ =

  Year       Yearly       Cumulative           Cumulative              Mean             Standard
 Number     Average    Sum of $\bar{x}_k$     Sum of $\bar{x}_k^2$      (average of      Error of Mean
   [4]        [4]        ($A_0 = 0$)          ($B_0 = 0$)      yearly averages)
 k or m

Date Completed:                    Completed by
Use additional sheets if necessary.                       Page ___ of ___

Complete WORKSHEETS 7a and 7b for other chemicals and wells
        WORKSHEET 7b Data Calculations for a Sequential Sample When Assessing Wells Individually; by
                                       Chemical and Well
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
           SITE:
    CHEMICAL:
                  NUMBER(c) AND DESCRIPTION [2]
          WELL:
                  NUMBER(w) AND DESCRIPTION [1]
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
 Year
Number
  [4]
   m
                                               Critical
,
   Hi-Cs   xm.
Likelihood
   ratio
   LR*
                                                clean
                                                   R
                                                «-=-
                                                  1-a
   Critical          Decision:
   value:         clean LR > B,
contaminated  contaminated LR £ A,
      i.ft        or no decision
*m      (9 ro-2^ I   m   "1
*LR = ex^8—tA/^l-5j
If "no decision", collect another year's allotment of samples and test the hypothesis again.

Date Completed:                    Completed by

Use additional sheets if necessary.                       Page ___ of ___

Complete WORKSHEETS 7a and 7b for other chemicals and wells
       WORKSHEET 8 Attainment Objectives When Assessing Wells as a Group
See Chapter 3 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
           SITE:
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
Sample Design (Check one):  Fixed Sample Size [ ]  Sequential Sampling [ ]

Probability of mistakenly declaring the well(s) clean = $\alpha$ =

Probability of mistakenly declaring the well(s) contaminated = $\beta$ =

  Chemical                       Cleanup          Parameter         If mean,
to be tested     Chemical        standard          to test:         enter the
   number          name        (with units)      (Check one)        alternate
                                                                    hypothesis
      c                             Cs                               $\mu_1$
                                               Mean [ ]  Max [ ]
                                               Mean [ ]  Max [ ]
                                               Mean [ ]  Max [ ]
                                               Mean [ ]  Max [ ]

Sample Collection Procedures to be used (attach separate sheet if necessary):


Secondary Objectives/Other purposes for which the data is to be collected:



Use the Chemical Number (c) to refer on other sheets to the chemical described above.
Attach documentation describing the lab analysis procedure for each chemical.
Date Completed:                    Completed by
Use additional sheets if necessary.                                      Page ___ of ___
Continue to WORKSHEET 9 if a fixed sample size test is used; or
Continue to WORKSHEET 10 if a sequential sample test is used.
     WORKSHEET 9 Sample Size When Using a Fixed Sample Size Test for Assessing Wells as a Group
See Section 8.2 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
            SITE:
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
         Probability of mistakenly declaring the site clean [8] = $\alpha$ =

 Probability of mistakenly declaring the site contaminated [8] = $\beta$ =

                              Number of samples per year = n =

             Variance factor from Table A.5, Appendix A = F (note 1) =

For testing the  mean concentration
 Chemical    Cleanup               Standard Deviation      Calcul
Number [8]  Standard[8]        [8]         of mean
     c          Cs           Ui            &           B
                                                                            Prom Table Ai
                                                                             Appendix A
                                                                       •l-a=[
                                                                      tttmmfA rift
                                                                          ibed in Section &2)
                                                                                a2
For  testing the maximum  concentration across all wells
 ~~     '                          Standard Deviation      Calculate:
Number [8] Standard[8] . [8] of yearly mean












/Cs-Maxif
V2l-a+zi^j



«V,-^ + 2



                       Column Maximum, (Maximum of $m_d$ values) = C =

Round C to next largest integer = Number of years of sample collections = m =

                                    Total number of samples = nm = N =

Date Completed:                    Completed by
Use additional sheets if necessary.                                   Page ___ of ___
Continue to WORKSHEET 10
1 An estimate of $\phi$, the serial correlation, is necessary to determine the appropriate value of F. Worksheets 15 and
  16 can be used to estimate $\phi$.
         WORKSHEET 10 Data Records for an Individual Well and Calculations When Assessing Wells as a
                                 Group; by Chemical, Well, and Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
            SITE:
     CHEMICAL:
          WELL:
          YEAR:
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                       Parameter to be tested (Check one) =  Mean [ ]  Max [ ]
                                         Number of samples per year = n =
                Concentration used for observations below the detection limit =

                             Sample                         Concentration
"Season"      Sample       Collection       Reported        Corrected for
 Number         ID            time        Concentration    Detection Limit










,

















































Date Completed:                    Completed by
Use additional sheets if necessary.                                      Page ___ of ___
Complete WORKSHEET 10 for other chemicals, years, and wells or continue to WORKSHEET 11


        WORKSHEET 11 Data Records and Calculations When Assessing Wells as a Group; by Chemical and
                                             Year
See Chapter 8 or 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
            SITE:
     CHEMICAL:    NUMBER(c) AND DESCRIPTION [8]
          YEAR:    NUMBER(k)
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

Sample Design (Check one): Fixed Sample Size [ ]  Sequential Sampling [ ]
                                       Parameter to be tested (Check one) =  Mean [ ]  Max [ ]
                                      Number of samples per year [9] = n =

                                                                             Measure for
 "Season"                                                                     analysis
Number [10]   Well #__    Well #__    Well #__    Well #__    Well #__     (row maximum
     j          [10]        [10]        [10]        [10]        [10]       or row mean)

                                        Total of $x_j$ for this year = A =
                                  Mean of $x_j$ for this year = $\bar{x}_k = A/n$ =
Date Completed:                    Completed by
Use additional sheets if necessary.                                      Page ___ of ___
Complete WORKSHEET 11 for other chemicals; otherwise,
Continue to WORKSHEET 12 if a fixed sample size test is used; or
Continue to WORKSHEET 14 if a sequential sample test is used.
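
The row-by-row reduction on Worksheet 11 can be sketched as follows. This is an illustration only: the function name `worksheet11_year`, the nested-list layout (one row per "season", one value per well), and the sample readings are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def worksheet11_year(readings_by_season: Sequence[Sequence[float]],
                     parameter: str = "mean") -> Tuple[List[float], float, float]:
    """Combine the wells in a group into one measure per "season".

    readings_by_season -- one row per "season"; each row holds the
                          detection-limit-corrected value for every well
    parameter          -- "mean" for the row mean, "max" for the row maximum
    Returns (measures for analysis, total A, yearly mean of the measures).
    """
    measures = [
        max(row) if parameter == "max" else sum(row) / len(row)
        for row in readings_by_season
    ]
    total = sum(measures)               # total for this year
    return measures, total, total / len(measures)


# Two "seasons" and two wells (hypothetical values).
readings = [[1.0, 3.0], [2.0, 6.0]]
print(worksheet11_year(readings, "mean"))
print(worksheet11_year(readings, "max"))
```

The sketch divides the total by the number of rows supplied, which equals the number of samples per year only when no "season" is missing.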
        WORKSHEET 12 Data Calculations for a Fixed Sample Size Test When Assessing Wells as a Group;
                                         by Chemical
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
            SITE:
     CHEMICAL:
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

               Year              Mean
              Number            for the           $(\bar{x}_k)^2$
                               year [11]

Total from previous page
(if more than one copy of
Worksheet 12 is necessary)

Column Totals:
Date Completed:                    Completed by
Use additional sheets if necessary.                                       Page ___ of ___

Complete WORKSHEET 12 for other chemicals or continue to WORKSHEET 13
          WORKSHEET 13 Inference for Fixed Sample Size Tests When Assessing Wells as a Group; by
                                           Chemical
See Chapter 8 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
            SITE:
     CHEMICAL:
                   NUMBER(c) AND DESCRIPTION [8]
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

                                                     [8]  Cs =
                                     Number of Years [9]  $m$ =
                            Sum of the yearly means [12]  A =
                    Sum of the squared yearly means [12]  B =
                              Overall mean concentration  $\bar{x}$ =

                  Standard Deviation of the yearly means  $s_{\bar{x}}$ =
                    Degrees of Freedom for $s_{\bar{x}}$  Df =
        Value from table of t-distribution (Appendix A.1)
               for specified values of ($1 - \alpha$) and Df  $t$ =
                     Standard Error for the overall mean  $s_{\bar{x}}/\sqrt{m}$ =
                     Upper One Sided Confidence Interval  $\mu_{U\alpha}$ =
      If $\mu_{U\alpha}$ < Cs then circle Clean, otherwise circle Contaminated:
             Based on the mean concentration, the sampling well is:   Clean   Contaminated
Date Completed:                    Completed by
Complete WORKSHEET 13 for other chemicals                             Page ___ of ___
        WORKSHEET 14a Data Calculations for a Sequential Sample When Assessing Wells as a Group; by
                                            Chemical
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards", Volume 2
            SITE:
     CHEMICAL:
                   NUMBER(c) AND DESCRIPTION [8]
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                                 Cleanup standard [8] = Cs =
                                                     Alternate mean = $\mu_1$ =

                 Probability of mistakenly declaring the well(s) clean [8] = $\alpha$ =
         Probability of mistakenly declaring the well(s) contaminated [8] = $\beta$ =

  Year       Yearly       Cumulative           Cumulative              Mean             Standard
 Number     Average    Sum of $\bar{x}_k$     Sum of $\bar{x}_k^2$      (average of      Error of Mean
  [11]       [11]        ($A_0 = 0$)          ($B_0 = 0$)      yearly averages)
 k or m    $\bar{x}_k$         $A_k$                 $B_k$              $\bar{x}_m$       $\sqrt{(B_k - A_k^2/k)/((k-1)k)}$

Date Completed:                    Completed by
Use additional sheets if necessary.                       Page ___ of ___

Complete WORKSHEET 14a and 14b for other chemicals and groups of wells
        WORKSHEET 14b Data Calculations for a Sequential Sample When Assessing Wells as a Group; by
                                            Chemical
See Chapter 9 in "Methods for Evaluating the Attainment of Cleanup Standards". Volume 2
            SITE:
     CHEMICAL:
                   NUMBER(c) AND DESCRIPTION [8]
          Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
  Year
 Number
   [4]
    m
                 'm
                          s*m
Likelihood
   ratio
   LR*'
Critical     Critical          Decision:
 value:      value:         clean LR > B,
 clean   contaminated contaminated LR £ A,
     P         l_jj        or no decision

    1-a         a
A
         WORKSHEET 15  Removing Seasonal Patterns in the Data (Use as First Step in Computing Serial
                                          Correlations)
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
            SITE:
     CHEMICAL:
                   NUMBER(c) AND DESCRIPTION [2 OR 8]
          WELL:
                   NUMBER(w) AND DESCRIPTION [1]
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                                         Number of
"Season"     Measurements for each "season" for year k       years with   Row      Row
Number Yr=__   Yr=__   Yr=__   Yr=__   Yr=__     Data     Total     Mean

Corrected measurements with seasonal patterns removed
"Season"          Corrected measurement for each "season" for year k
 Number       Yr=__            Yr=__            Yr=__            Yr=__            Yr=__
    j     $x_{jk}-\bar{x}_j$   $x_{jk}-\bar{x}_j$   $x_{jk}-\bar{x}_j$   $x_{jk}-\bar{x}_j$   $x_{jk}-\bar{x}_j$












Date Completed:                    Completed by
Use additional sheets if necessary.                                   Page ___ of ___
Complete WORKSHEET 15 for other chemicals
Continue to WORKSHEET 16 if serial correlations are being computed.
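
The seasonal adjustment recorded on Worksheet 15 amounts to subtracting each "season's" row mean from every measurement in that row. The Python sketch below illustrates this under stated assumptions: the function name `remove_seasonal_pattern`, the row-per-season list layout, and the use of `None` for a missing measurement are choices made for the example.

```python
from typing import List, Optional, Sequence


def remove_seasonal_pattern(
    data: Sequence[Sequence[Optional[float]]],
) -> List[List[Optional[float]]]:
    """Subtract each "season's" mean from the measurements for that season.

    data[j][k] is the measurement for "season" j in year k, or None when
    no data exist for that season and year (an assumption of this sketch).
    """
    corrected = []
    for row in data:
        present = [v for v in row if v is not None]
        if not present:
            # No years with data for this season: nothing to correct.
            corrected.append([None] * len(row))
            continue
        # Row mean = row total / number of years with data.
        row_mean = sum(present) / len(present)
        corrected.append([None if v is None else v - row_mean for v in row])
    return corrected


# Two "seasons" observed over two years (hypothetical values).
print(remove_seasonal_pattern([[1.0, 3.0], [10.0, None]]))
```

The corrected values are the residuals carried forward when serial correlations are computed.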
       WORKSHEET 16 Calculating Serial Correlations
See Sections 8.4 and 9.4 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
            SITE:
     CHEMICAL:
                   NUMBER(c) AND DESCRIPTION [2 OR 8]
          WELL:
                   NUMBER(w) AND DESCRIPTION [1]
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
                                                               Year = k

                                 Period between well samples in months = t =
     Data
    Numbers
      jk
 (season within        Residual
    year k)              [15]              Product

         Totals from previous page
              (if more than one
              Worksheet 16 is used)

                   Column Totals
Estimated Serial Correlation based on the data = $\hat{\phi}_{obs}$ = A/B =
Serial Correlation between monthly observations = $\hat{\phi} = (\hat{\phi}_{obs})^{1/t}$ =
Date Completed:                    Completed by
Use additional sheets if necessary.                                   Page ___ of ___
Complete WORKSHEET 16 for other chemicals
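
A short Python sketch of the Worksheet 16 calculation follows. It rests on assumptions that should be checked against Sections 8.4 and 9.4: the observed serial correlation is taken to be the sum of products of successive residuals divided by the sum of squared residuals, the residuals are assumed to be in time order with no gaps, and the function name `serial_correlation` is hypothetical. Rejecting a negative estimate is a choice of this sketch, not a rule taken from the guidance.

```python
from typing import Sequence, Tuple


def serial_correlation(residuals: Sequence[float],
                       months_between_samples: float) -> Tuple[float, float]:
    """Estimate the serial correlation from seasonally adjusted residuals.

    residuals              -- residuals in time order (season within year)
    months_between_samples -- period t between well samples, in months
    Returns (correlation between successive samples, monthly correlation).
    """
    # Sum of products of each residual with the one before it.
    a = sum(residuals[i] * residuals[i - 1] for i in range(1, len(residuals)))
    # Sum of squared residuals.
    b = sum(e * e for e in residuals)
    if b == 0:
        raise ValueError("all residuals are zero; correlation is undefined")
    phi_obs = a / b
    if phi_obs < 0:
        # A fractional power of a negative number is not real-valued.
        raise ValueError("negative estimate; monthly conversion not defined here")
    # Convert to the correlation between monthly observations.
    phi_monthly = phi_obs ** (1.0 / months_between_samples)
    return phi_obs, phi_monthly


# Four hypothetical residuals from samples collected every two months.
print(serial_correlation([1.0, 1.0, -1.0, -1.0], months_between_samples=2))
```

The conversion step inverts the relationship between the monthly correlation and the correlation at a spacing of t months, so samples taken farther apart imply a higher monthly value for the same observed correlation.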
       WORKSHEET 1R Basic Calculations for a Simple Linear Regression
See Section 6.1 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
            SITE:
     CHEMICAL:
                   NUMBER(c) AND DESCRIPTION [2 OR 8]
          WELL:
                   NUMBER(w) AND DESCRIPTION [1]
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.
             Concentration used when no concentration is reported =

                             Number of collectable samples = N =

         Concentration       Transformed
Sample   Corrected for           Time
Number  Detection Limit        Variable

Totals from previous page(s):

Column Totals:
   $A = \sum y_n$     $B = \sum y_n^2$     $C = \sum x_n$     $D = \sum x_n^2$     $E = \sum y_n x_n$
   $\bar{y} = A/N$     $\bar{x} = C/N$
Corrected Sum of Squares and Cross Products:
   $S_{yy} = B - A^2/N$     $S_{xx} = D - C^2/N$     $S_{yx} = E - AC/N$
Date Completed:                    Completed by
Use additional sheets if necessary.                                        Page ___ of ___

Complete WORKSHEET 1R for other chemicals or continue to WORKSHEET 2R.
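
The basic regression quantities can be computed from raw sums, as in the following Python sketch. The function name `corrected_sums` and the sample data are assumptions for illustration; y denotes the concentration corrected for the detection limit and x the transformed time variable.

```python
from typing import Dict, Sequence


def corrected_sums(y: Sequence[float], x: Sequence[float]) -> Dict[str, float]:
    """Means and corrected sums of squares and cross products for y on x."""
    if len(y) != len(x) or not y:
        raise ValueError("y and x must be non-empty and of equal length")
    n = len(y)
    a = sum(y)                                  # sum of y
    b = sum(v * v for v in y)                   # sum of y squared
    c = sum(x)                                  # sum of x
    d = sum(v * v for v in x)                   # sum of x squared
    e = sum(yi * xi for yi, xi in zip(y, x))    # sum of y times x
    return {
        "N": n,
        "y_mean": a / n,
        "x_mean": c / n,
        "Syy": b - a * a / n,
        "Sxx": d - c * c / n,
        "Syx": e - a * c / n,
    }


# Hypothetical data: three samples with concentration exactly twice the time value.
print(corrected_sums(y=[2.0, 4.0, 6.0], x=[1.0, 2.0, 3.0]))
```

Working from column totals mirrors a hand calculation; with long series or large values, subtracting the means before squaring is numerically safer.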
                       WORKSHEET 2R Inference in a Simple Linear Regression
 See Section 6.1 in "Methods for Evaluating the Attainment of Cleanup Standards", Vol. 2
            SITE:
     CHEMICAL:
                  NUMBER(c) AND DESCRIPTION [2 OR 8]
          WELL:
                  NUMBER(w) AND DESCRIPTION [1]
         Numbers in square brackets [] refer to the Worksheet from which the information may be obtained.

$S_{yy}$ [1R] =
$S_{xx}$ [1R] =
$S_{yx}$ [1R] =
Type I error probability $\alpha$ =

Estimating Regression Coefficients
                     Number of collectable samples [1R]  =  N  =
                                    Mean of $y_i$ [1R]  =  $\bar{y}$  =
                                    Mean of $x_i$ [1R]  =  $\bar{x}$  =
                          Estimated slope [1R], $b_1$  =  $S_{yx}/S_{xx}$  =
                      Estimated Intercept [1R], $b_0$  =  $\bar{y} - (b_1 \cdot \bar{x})$  =
                  Sum of squares due to error [1R], SSE  =
                               Degrees of freedom, Df  =  $N - 2$  =
 Critical value from table of t-distribution (Appendix A.1)
             for specified values of ($1 - \alpha/2$) and Df  =  $t$  =
                               Mean Square Error, MSE  =  SSE/($N - 2$)  =
                  Standard Error of the Slope, $s(b_1)$  =  $\sqrt{MSE/S_{xx}}$  =
      Upper Two Sided Confidence Interval for Slope: $b_1 + t \cdot s(b_1)$  =
      Lower Two Sided Confidence Interval for Slope: $b_1 - t \cdot s(b_1)$  =

Calculating Prediction Limits
                    Value of x at which concentration is to be predicted =
                                      Predicted value, $\hat{y} = b_0 + b_1 x$ =
                                Standard Error of Predicted Value = $s_{\hat{y}}$ =
         Upper Two Sided Confidence Interval for Prediction = $\hat{y} + t \cdot s_{\hat{y}}$ =
         Lower Two Sided Confidence Interval for Prediction = $\hat{y} - t \cdot s_{\hat{y}}$ =

Date Completed:                    Completed by

Use additional sheets if necessary.                                   Page ___ of ___
Complete WORKSHEET 2R for other chemicals
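
The slope, intercept, and slope confidence interval on Worksheet 2R can be illustrated in Python. This sketch is hedged in three ways: the function name `regression_inference` is hypothetical, the critical value `t_crit` must be looked up by the user, and the sum of squares due to error is computed with the standard identity SSE = Syy - b1 * Syx, which is assumed (not confirmed) to match the worksheet's own formula.

```python
import math
from typing import Dict


def regression_inference(n: int, y_mean: float, x_mean: float,
                         syy: float, sxx: float, syx: float,
                         t_crit: float) -> Dict[str, float]:
    """Slope, intercept, error variance, and confidence limits for the slope."""
    if n < 3:
        raise ValueError("at least three samples are needed (Df = N - 2)")
    b1 = syx / sxx                      # estimated slope
    b0 = y_mean - b1 * x_mean           # estimated intercept
    # Standard identity for the error sum of squares; max() guards against
    # a tiny negative value from rounding when the fit is nearly exact.
    sse = max(0.0, syy - b1 * syx)
    df = n - 2
    mse = sse / df                      # mean square error
    s_b1 = math.sqrt(mse / sxx)         # standard error of the slope
    return {
        "b1": b1, "b0": b0, "SSE": sse, "Df": df, "MSE": mse, "s_b1": s_b1,
        "slope_upper": b1 + t_crit * s_b1,
        "slope_lower": b1 - t_crit * s_b1,
    }


# Hypothetical inputs for x = 1, 2, 3, 4 and y = 1, 3, 2, 4;
# t = 4.303 is the two-sided 95 percent critical value for Df = 2.
fit = regression_inference(n=4, y_mean=2.5, x_mean=2.5,
                           syy=5.0, sxx=5.0, syx=4.0, t_crit=4.303)
print(fit["b1"], fit["b0"], fit["slope_lower"], fit["slope_upper"])
```

A predicted concentration at a chosen x follows directly as `fit["b0"] + fit["b1"] * x`.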
                   APPENDIX D:  MODELING THE  DATA
              A model is a mathematical description of the process or phenomenon from
which the data are collected. A model provides a framework for extrapolating from the
measurements obtained during the data collection period to other periods of time and for
describing the important characteristics of the data. Perhaps most importantly, a model
serves as a formal description of the assumptions that are being made about the data.
The choice of statistical method used to analyze the data depends on the nature of these
assumptions.

              The results of the statistical analysis may be sensitive to the degree to which
the data adhere to the assumptions of the analysis. If the statistical results are quite
insensitive to the validity of a particular assumption, the statistical methods are said to be
"robust" to departures from that assumption. On the other hand, if the results are sensitive
to an assumption, so that the results may be substantially incorrect if the assumption does
not hold, the validity of that assumption should be checked before the results of the
analysis are used or given credence.

              After steady state conditions have been reached, the model assumed to
describe the ground water data is the equation in Box D.1.

              The laboratory measurement, $x_{tcw}$, will be expressed in measurement units
selected by either the lab or the management of the cleanup effort. All terms in the model
equation must have the same units. The samples on which the measurements are made can
be identified by the time and location of collection. In the model, the location is
indicated by the well identifier w. For wells in which samples are collected at different
depths or by different sampling equipment, a more extensive set of identifiers and
subscripts will be required. If the parameter being tested represents a group of wells (e.g.,
an average concentration in several wells), $x_{tcw}$ represents the combined measure and w
refers to the group of wells.
                                     Box D.1
                                Modeling the Data

       The model assumed to describe ground water data after steady-state
       conditions have been reached is:

              $x_{tcw} = \mu_{cw} + S_{u(t)cw} + z_{tcw} + \varepsilon_{tcw}$          (D.1)
       where
       $x_{tcw}$      =  lab measurement of chemical c for the sample collected at
                    time t for well w.

       $\mu_{cw}$     =  long-term (or short-term) average concentration for chemical
                    c in well w.

       $S_{u(t)cw}$   =  a seasonal pattern in the data for concentration of chemical c
                    in well w, assumed to repeat on a regular cycle. The
                    subscript u(t) designates the point in time within the cycle
                    when the sample was collected. In most situations the term
                    $S_{u(t)cw}$ will correspond to a yearly cycle associated with
                    yearly patterns in temperature and precipitation.

       $z_{tcw}$      =  serially correlated normal error following an auto-regressive
                    model of order one (Box and Jenkins, 1970). (Note:
                    seasonal auto-correlations are assumed to be negligible after
                    the seasonal cycles ($S_{u(t)cw}$) have been removed.) The
                    correlation, $\rho$, between two measurements separated by time
                    t (in months) is assumed to be $\rho = R^t$, where R is the
                    correlation for measurements separated by one month.

       $\varepsilon_{tcw}$      =  independent normal errors.
             This model for the data assumes that the average level of contamination is
constant over the period of concern (either a short or very long period). However, the
actual measurements may fluctuate around that level due to seasonal differences, lab
measurement errors, or serially correlated fluctuations (described below). The purpose of
the statistical test is to decide if there is sufficient evidence to conclude that $\mu_{cw}$ is less than
the cleanup standard in the presence of this variability.
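
To make the roles of the four model terms concrete, the following Python sketch generates monthly measurements for one chemical in one well. It is an illustration under stated assumptions, not part of the guidance: the function name `simulate_well`, its parameters, and the numerical values are hypothetical, and the seasonal pattern is assumed to be supplied as deviations that average to zero over one cycle.

```python
import math
import random
from typing import List, Sequence


def simulate_well(n_months: int, mu: float, seasonal: Sequence[float],
                  r: float, sd_z: float, sd_e: float, seed: int = 0) -> List[float]:
    """Simulate x_t = mu + S_u(t) + z_t + e_t at monthly intervals.

    mu       -- average concentration
    seasonal -- seasonal deviations for one full cycle (e.g., 12 monthly values)
    r        -- correlation of z for measurements one month apart (|r| < 1)
    sd_z     -- standard deviation of the serially correlated error z
    sd_e     -- standard deviation of the independent error e
    """
    rng = random.Random(seed)
    # Innovation s.d. chosen so that z keeps a constant variance sd_z**2.
    innovation_sd = sd_z * math.sqrt(1.0 - r * r)
    z = rng.gauss(0.0, sd_z)                 # starting value of the AR(1) error
    series = []
    for t in range(n_months):
        if t > 0:
            z = r * z + rng.gauss(0.0, innovation_sd)   # AR(1) update
        s = seasonal[t % len(seasonal)]      # position u(t) within the cycle
        e = rng.gauss(0.0, sd_e)             # independent measurement error
        series.append(mu + s + z + e)
    return series


# Two years of monthly data with a simple high/low seasonal pattern.
pattern = [1.0] * 6 + [-1.0] * 6
data = simulate_well(24, mu=10.0, seasonal=pattern, r=0.8, sd_z=1.0, sd_e=0.5)
print(round(sum(data) / len(data), 2))
```

Setting `sd_z` and `sd_e` to zero leaves only the average and the seasonal pattern, which is a convenient check that the terms combine as the model equation states.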

              Because the primary cyclical force affecting the ground water system is
climatic, in most situations the seasonal term will have a period of one year. In some
climates there are two rainy seasons and two dry seasons, possibly resulting in a seasonal
pattern with a period of half a year. The connection between the seasonal pattern in the ground water
concentrations and the climatic changes may be complex: both patterns may have
the same period, yet the shapes of the patterns and the relative times of maximum rainfall
and of maximum or minimum concentration may differ.

              Ground water concentrations at points close together in time or space are
likely to be more similar than observations taken far apart in time or space.  There are
several  physical reasons why this may be the case.  In statistical terms, observations taken
close together are said to be more correlated than observations taken far apart.

              The serial correlation of observations separated by a time difference of t can
be denoted by $\rho(t)$, where $\rho$ is the Greek letter rho. A plot of the serial correlation
between two observations versus the time separating the two observations is called an
autocorrelation function. The model above assumes that the autocorrelation function has the
shape shown in Figure D.1, which is described by the equation in Box D.2.
                                      Box D.2
                              Autocorrelation Function
                                  $\rho(t) = R^t$                          (D.2)
       where R is the serial correlation for measurements separated by a month,
       and t is the time between observations in months.
              If the serial correlation of the measurements is zero, the data behave as if
they were collected randomly.  As the correlation increases, the  similarity of measurements
taken close together relative to all other measurements becomes more pronounced.  Figure
D.2 shows simulated data with serial correlations of 0.0,  0.4 and 0.8.  Serial correlations
are always between -1 and 1. However, for most environmental  data, serial correlations are
usually between 0 and 1, indicating that measurements taken close together in time will be
more  alike than measurements taken far apart.
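The behaviour described above can be explored numerically. The following is a minimal sketch, not part of the guidance itself: it assumes the autocorrelation function has the form R raised to the power t, and it uses a first-order autoregressive series with normally distributed innovations as one convenient process that has this autocorrelation function. The function names and the choice of unit variance are illustrative assumptions.

```python
import random


def autocorrelation(t_months, R):
    """Serial correlation for a lag of t months, assuming rho(t) = R**t."""
    return R ** t_months


def simulate_ar1(n, R, seed=0):
    """Simulate n monthly values of a first-order autoregressive series.

    Assumption for illustration: normal innovations, scaled so that the
    series has unit variance and lag 1 correlation R.
    """
    rng = random.Random(seed)
    scale = (1.0 - R * R) ** 0.5
    z = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        z.append(R * z[-1] + scale * rng.gauss(0.0, 1.0))
    return z


def lag1_correlation(z):
    """Sample lag 1 serial correlation of a series about its own mean."""
    mean = sum(z) / len(z)
    d = [v - mean for v in z]
    return sum(a * b for a, b in zip(d[1:], d[:-1])) / sum(a * a for a in d)


# Compare the theoretical correlation at a 3-month lag with the estimated
# lag 1 correlation of a simulated series, for the three values in Figure D.2.
for R in (0.0, 0.4, 0.8):
    series = simulate_ar1(5000, R)
    print(R, round(autocorrelation(3, R), 3), round(lag1_correlation(series), 2))
```

With a long simulated series the estimated lag 1 correlation should fall close to the value of R used to generate it; short series will show much more scatter.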
Figure D.1    Theoretical Autocorrelation Function Assumed in the Model of the Ground
              Water Data
              [Figure not reproduced. Horizontal axis: time between observations.]
              Many common statistical procedures will  provide incorrect conclusions if an
existing correlation  in the data is not properly accounted for. For example, the  variability in
the data may be inappropriately estimated. Proper selection of a simple random sample for
estimating the mean guarantees that the errors are uncorrelated. However, when using a
systematic sample (such as for ground water samples collected at regular intervals),  the
formulae based on a random sample provide a good estimate of the standard error of the
mean only if there is no serial correlation. With serial correlation, a correction term is
required. For the autocorrelation function assumed  above, the  correction term increases  the
standard error of the long-term mean and decreases it for the short-term mean.

              The autocorrelation function can have many different shapes;  however, in
general, correlations will decrease as the time between observations increases. If the
samples are taken farther apart in time, the correction becomes less  important.

              The error term, $e_{tcw}$, represents errors resulting from lab measurement
error and other factors associated with the environment being sampled and the sample
handling procedures.

Figure D.2    Examples of Data with Serial Correlations of 0, 0.4, and 0.8. The higher
              the serial correlation the more the distribution dampens out
              [Figure not reproduced. Three panels plotted against time: serial
              correlation = 0, serial correlation = 0.4, and serial correlation = 0.8.]
              Different models may be used to describe the data collected during the
 treatment phase and the post-treatment assessment phase because either (1) the
 characteristics of the data will be different, or (2) different information about the measured
 concentrations is of interest.  The statistical procedures discussed in Chapter 6 to be used
 during treatment are therefore different from those discussed in Chapters 8 and 9 for
 assessing attainment of the cleanup  standards.

              There are two terms  which have been excluded from the model above and
 could be used to model ground water concentrations in some situations. These are a slope
 (or trend) term and  a spatial correlation term.

              In many situations it is reasonable to assume that the general level of
 contamination is  either gradually  decreasing or gradually  increasing. It may be desirable to
 assume a functional form for  this change in concentration. For example, the concentration
 may be considered to be decreasing linearly or exponentially. A revised model with a linear
 trend term is presented in Box D.3.

              If the slope is  not zero, as in the model in Box D.3, then the ground water is
 not at  steady state.  If the slope is positive, the concentrations are increasing over time. If
 the slope is negative, the concentrations are decreasing over time. If concentrations are
 below the cleanup standard and are increasing over time, the ground water may be judged
 to attain the cleanup standard; however the cleanup standard may not be attained  in the
future as concentrations increase. Therefore, the ground water in the sampled wells will be
judged to attain the cleanup standard only if (1) the selected parameter is significantly less
 than the cleanup standard, and (2) the concentrations are not increasing. These decision
 criteria are presented in Table D.1.

              The model in Box D.3 does not include spatial correlation. In this
 guidance, it is assumed that the results from different wells (or different depths in the same
 well) are combined using criteria developed based on expert knowledge of the site  rather
 than by fitting statistical models. For this reason a spatial correlation has not been
 included.
                                   Box D.3
                      Revised Model for Ground Water Data

       A revised model with a linear trend term would be:

          $x_{tcw} = \alpha_{cw} + \beta_{cw} t$ + seasonal term + serially correlated term + $e_{tcw}$      (D.3)

       where

          $\beta_{cw}$ = the change in concentration over time for measurements of
                 chemical c in well w.

          $\alpha_{cw}$ = the concentration of chemical c in well w at time zero, usually at
                 the beginning of sampling. Note that $\alpha_{cw} = \mu_{cw}$ if $\beta_{cw} = 0$.
Table D.1     Decision criteria for determining whether the ground water concentrations
             attain the cleanup standard

                                       Test for significant slope $\beta_{cw}$ (Equation D.3)
Test for parameter (mean or           ---------------------------------------------------
percentile) less than the cleanup     $\beta_{cw}$ significantly       $\beta_{cw}$ not significantly
standard (Equation D.2)               greater than zero          greater than zero
------------------------------------------------------------------------------------------
Parameter is significantly less       Ground water is            Ground water from the
than the cleanup standard             contaminated               tested wells attains the
                                                                 cleanup standard

Parameter is not significantly less   Ground water is            Ground water is
than the cleanup standard             contaminated               contaminated
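
The decision criteria in Table D.1 reduce to a single logical rule, which the following minimal sketch makes explicit. The function and argument names are hypothetical; the two inputs are assumed to be the outcomes of statistical tests carried out elsewhere.

```python
def attains_cleanup_standard(parameter_significantly_below, slope_significantly_positive):
    """Return True only for the single 'attains' cell of Table D.1.

    parameter_significantly_below: the mean or percentile tested is
        significantly less than the cleanup standard.
    slope_significantly_positive: the trend (slope) is significantly
        greater than zero, i.e. concentrations are increasing.
    """
    return parameter_significantly_below and not slope_significantly_positive


# Enumerate the four cells of the table.
for below in (True, False):
    for rising in (True, False):
        verdict = "attains" if attains_cleanup_standard(below, rising) else "contaminated"
        print(below, rising, verdict)
```

Writing the rule this way shows that three of the four outcomes lead to the same conclusion, so both tests must come out favourably before the wells are judged to attain the standard.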
  APPENDIX  E:  CALCULATING  RESIDUALS  AND SERIAL CORRELATIONS
                                    USING SAS¹
             Several statistical programs can be used to make the calculations outlined in this
guidance document. Although these programs can be used to perform the required calculations,
they were not specifically designed for the application addressed in this document.   Therefore,
they can only be used as a partial aid for the procedures presented here. Only one of the many
available statistical packages, SAS, will be discussed below in the example. This example makes
no attempt to thoroughly introduce the SAS system, and no endorsement of SAS is implied. Help
from a statistician or  programmer familiar  with any software being used is strongly recommended.

             The basic quantities discussed in Sections 5.2.3 and 5.2.4 can be calculated
using one of several statistical procedures available in SAS. Among them are PROC GLM, PROC
ANOVA, and PROC REG (see SAS User's Guide: Statistics, SAS Institute, 1985). All of these
procedures require specifying a linear model and requesting certain options in the MODEL
statement. A SAS data set containing the data to be used in the analysis should first be created (see
SAS User's Guide: Basics, SAS Institute, 1985). In the data set, the observations should be listed
or sorted in time order; otherwise the calculated serial correlations will be meaningless.

             Given below is an example of a SAS program using PROC REG that will subtract
seasonal means  from the observed concentration  measurements and  calculate the required first
order serial  correlation of the residuals.

             PROC REG DATA = CHEM1;
                    MODEL CONC = SEAS1 SEAS2 SEAS3 SEAS4 / NOINT DW;
             RUN;

             In the  program, CHEM1 is the  SAS data set containing the following variables:
CONC, the concentration measurement of the ground water sample;  TIME, a sequence number
indicating the time at which the sample was drawn; YEAR, the year the sample was drawn, and
PER, the period within the year in which the sample was drawn. For this illustration, data were
collected quarterly so that PER = 1, 2, 3, or 4. The variables SEAS1 through SEAS4 are indicator
variables defined at a previous DATA step. For each observation, these indicator variables are
defined as follows: SEAS1 = 1 if PER = 1, and is 0 otherwise; SEAS2 = 1 if PER = 2, and is 0
otherwise; SEAS3 = 1 if PER = 3, and is 0 otherwise; and SEAS4 = 1 if PER = 4, and is 0
otherwise. Creation of these indicator or "dummy" variables is required if PROC REG is used.
On the other hand, dummy variables are not required for PROC ANOVA or PROC GLM.  Note
that in this example, the variable TIME is not included as an independent variable in the model.

¹Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

             The MODEL statement specifies the form of the linear model to be fitted. In the
example, CONC is the dependent variable and SEAS1 through SEAS4 are the independent
variables. The reason for specifying this particular model is to have the seasonal means subtracted
from the observed concentrations. NOINT is an option that specifies that a "no-intercept model" is
to be estimated.  Other models can also be used to produce the required residuals, but they will not
be discussed here. Finally, DW is the "Durbin-Watson" option, which requests that the
Durbin-Watson test (see Section 5.6.1) and the serial correlation of the residuals be calculated. The output
from the above computer run will look like:
             DEP VARIABLE:  CONC

                                SUM OF         MEAN
             SOURCE     DF     SQUARES       SQUARE      F VALUE     PROB>F
             MODEL       4     580.455      145.114     1051.355      0.000
             ERROR      12       1.656        0.138

                   ROOT MSE     0.3715      R-SQUARE     0.997
                   DEP MEAN     5.995       ADJ R-SQ     0.996
                   C.V.         6.197

                              PARAMETER     STANDARD     T FOR H0:
             VARIABLE   DF     ESTIMATE        ERROR    PARAMETER=0    PROB>|T|
             SEAS1       1        6.778        0.186       36.490        0.000
             SEAS2       1        6.025        0.186       36.490        0.000
             SEAS3       1        5.134        0.186       36.490        0.000
             SEAS4       1        6.042        0.186       36.490        0.000

             DURBIN-WATSON D                 2.280
             1ST ORDER AUTOCORRELATION       -.184
             The first part of the output (identified by the heading SOURCE, DF, SUM OF
SQUARES, etc.) is referred to as the "analysis of variance table." In the "MEAN SQUARE"
column of the table, in the row titled "ERROR," is the mean square error, $s_e^2$. In the
example output, $s_e^2 = 0.138$.

             The second part of the output gives the "PARAMETER ESTIMATES" for each of
the four indicator variables, SEAS1 to SEAS4. Because of the way these variables were defined,
the parameter estimates are actually the seasonal means, $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$, and $\bar{x}_4$, respectively. These
seasonal means are used to calculate the residuals, e, as defined in equation (5.8). The last line of
the output shows the serial correlation of the residuals as computed from equation (5.14), viz.,
$\hat{\phi}_{obs} = -0.184$.  From Neter, Wasserman, and Kutner (1985), $d_U = 1.73$ for N = 16 (16
observations) and p - 1 = 3 (where p is the number of variables in the model). Since D = 2.28 >
1.73, it can be assumed that there is no autocorrelation in the error terms of the model.
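
For readers without access to SAS, the same quantities can be computed directly. The sketch below is an illustration only and is not the SAS procedure: the sixteen quarterly concentrations are hypothetical values invented for the example, and the formulas used for the Durbin-Watson statistic and the lag 1 serial correlation are the standard textbook definitions, which are assumed (not confirmed here) to match equations (5.8) and (5.14).

```python
def seasonal_residuals(conc, period):
    """Subtract each season's mean from its observations.

    conc and period must be listed in time order.
    Returns (dict of seasonal means, list of residuals).
    """
    means = {}
    for s in sorted(set(period)):
        values = [c for c, p in zip(conc, period) if p == s]
        means[s] = sum(values) / len(values)
    resid = [c - means[p] for c, p in zip(conc, period)]
    return means, resid


def durbin_watson(resid):
    """Sum of squared successive differences divided by the sum of squares."""
    num = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, len(resid)))
    return num / sum(e * e for e in resid)


def lag1_autocorrelation(resid):
    """Sum of products of successive residuals divided by the sum of squares."""
    num = sum(resid[i] * resid[i - 1] for i in range(1, len(resid)))
    return num / sum(e * e for e in resid)


# Hypothetical quarterly data for four years (not the data behind the SAS output).
conc = [6.9, 6.1, 5.0, 6.2, 6.5, 5.8, 5.3, 5.9,
        7.0, 6.2, 5.2, 6.0, 6.7, 6.0, 5.0, 6.1]
period = [1, 2, 3, 4] * 4

means, resid = seasonal_residuals(conc, period)
sse = sum(e * e for e in resid)
mse = sse / (len(conc) - len(means))   # error degrees of freedom = N - number of seasons
dw = durbin_watson(resid)
r1 = lag1_autocorrelation(resid)
print(means, round(mse, 3), round(dw, 3), round(r1, 3))
```

The Durbin-Watson statistic and the lag 1 correlation are closely related (D is roughly 2(1 - r) for long series), which is why a value of D near 2 goes with a serial correlation near zero.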

             As mentioned earlier, PROC GLM or PROC ANOVA can also be used to compute
the required statistical quantities.  The interested reader should refer to the SAS users manual for
more information.
       APPENDIX F:  DERIVATIONS AND EQUATIONS
              This appendix provides background for several equations presented in the
 document. This background is provided only for equations which cannot be easily verified
 in a standard statistical text. A simulation study provides the background for the sequential
 tests presented in Chapter 9. The simulation study was  supported by Westat. The last
 section of this appendix incorporates a technical paper prepared for publication which
 summarizes the simulations.
F.1          Derivation of Tables A.4 and A.5

              This  section outlines the derivation of Table A.4 for determining a
recommended  number of samples to take per year and  Table A.5 for obtaining variance
factors for use in determining sample size. Table A.4 is based on the assumption that the
number of samples per year will be chosen to minimize the total sampling costs while still
achieving the desired precision.  The assumptions on which the derivation is based are
explained below. The values in Table A.5  follow directly from the calculations used to
obtain Table A.4.

              For a fixed sample size test, the cost of the sampling program can be
approximated by:
                               C = E + (Y+nS)m                      (F.1)
where
              C = the total cost of the sampling program;
              E = the cost to establish the sampling program;
              Y = the yearly cost to maintain the program;
              S = the incremental cost to collect each sample;
              n = the number of samples per year; and
              m = the number of years of sampling.

This can also be written as:

                               C = E + S(R+n)m                      (F.2)
 where R = Y/S.  Since E and S are constants, the total sampling cost can be minimized by
 minimizing (R + n)m subject to the constraint that the choices of n and m achieve the
 desired precision. The total number of samples collected is:
                               N = nm                               (F.3)
              Consider the hypothesis test where a mean is being compared to a standard
and assume that 1) the measurements are independent and 2) a normal approximation can
be used. Then the following equation can be used to determine the required sample size:

              $N_{eff} = \sigma^2 \left[ \dfrac{z_{1-\beta} + z_{1-\alpha}}{Cs - \mu_1} \right]^2$                      (F.4)
where:
              $\sigma^2$ = variance of the individual measurements;
              Cs = the cleanup standard to which the mean is being compared;

              $\mu_1$ = the concentration on which the alternate hypothesis and $\beta$ are based;

              $\alpha$ = the probability of a false positive decision if the true mean is Cs;

              $\beta$ = the probability of a false negative decision if the true mean is $\mu_1$;

              $z_{1-\alpha}$ = the $1-\alpha$ percentile point of the normal distribution; and
              $N_{eff}$ = the required number of independent observations.

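              Noting that $\sqrt{\sigma^2 / N_{eff}}$ is the standard error of the mean based on independent
measurements, equation (F.4) can be rewritten as:

              $\sigma_{nm} = \left| \dfrac{Cs - \mu_1}{z_{1-\beta} + z_{1-\alpha}} \right|$                      (F.5)

where $\sigma_{nm}$ is the standard error of the mean of n samples per year collected over m years.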

              The problem is to select the combination of n and m such that equation (F.5)
is satisfied and the sampling costs are minimized.

              The values of n and m which satisfy equation (F.5) depend only slightly on
the values of $\alpha$, $\beta$, Cs, $\mu_1$, and $\sigma^2$.  For the purposes of estimating the values in Tables A.4
and A.5, the following assumptions were used: $\alpha$ = .10, $\beta$ = .10, Cs = 1, $\mu_1$ = .5, and
$\sigma^2$ = 1.0, resulting in $N_{eff}$ = 26.3.
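
The value of 26.3 can be checked with a few lines of arithmetic. This sketch assumes the usual normal-approximation form of the sample size equation (F.4), namely the variance multiplied by the square of the sum of the two normal percentile points divided by the difference between the cleanup standard and the alternative mean; the function name is hypothetical.

```python
from statistics import NormalDist


def required_independent_samples(variance, cleanup_standard, mu1, alpha, beta):
    """Normal-approximation sample size for comparing a mean to a standard.

    Assumed form: variance * ((z_{1-beta} + z_{1-alpha}) / (Cs - mu1)) ** 2
    """
    z = NormalDist().inv_cdf  # percentile point of the standard normal distribution
    return variance * ((z(1 - beta) + z(1 - alpha)) / (cleanup_standard - mu1)) ** 2


# Assumptions stated in the text: alpha = beta = .10, Cs = 1, mu1 = .5, variance = 1.0
n_eff = required_independent_samples(1.0, 1.0, 0.5, 0.10, 0.10)
print(round(n_eff, 1))
```

Halving the difference between Cs and the alternative mean quadruples the required number of independent observations, which is why the choice of the alternative mean dominates the sample size.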

              The following equation (derived in section F.2) can be used for $N_{eff}$ for the
mean of n observations per year collected over m years (N = nm) with a lag 1 serial
correlation of $\phi$:

              $N_{eff} = \dfrac{N}{\dfrac{1+\phi}{1-\phi} \left[ 1 - \dfrac{2\phi(1-\phi^N)}{N(1-\phi^2)} \right]}$                      (F.6)

             Note that the serial correlation in equation (F.6) is the serial correlation
between successive observations. As the number of observations per year changes, $\phi$ will
also change. If $\Phi$ is the serial correlation between monthly observations, then $\phi = \Phi^{12/n}$.

              The values in Tables A.4 and A.5 were calculated using the following
procedures:
              (1)    For selected values of $\Phi$ and n, calculate $\phi$ and use a successive
                     approximation procedure to determine m such that the criteria in
                     equation (F.6) are met.
              (2)    The values in Table A.5 are $N_{eff}/m$, or the effective number of samples
                     per year.
              (3)    For each calculation in step (1) and for selected values of R,
                     calculate the sampling cost using equation (F.2).

              (4)    Using all the sampling costs calculated for the selected values of $\Phi$,
                     n, and R, determine the value of n which has the minimum sampling
                     cost. Show this value in Table A.4.
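
The four steps above can be sketched in code. This is a simplified illustration under stated assumptions, not the calculation used to produce Tables A.4 and A.5: it assumes the standard effective-sample-size expression for a first-order autoregressive series, assumes the serial correlation between successive samples is the monthly correlation raised to the power 12/n, restricts m to whole years instead of using successive approximation, and compares only the relative cost (R + n)m. All function names and the list of candidate sampling frequencies are hypothetical.

```python
def n_eff_ar1(N, phi):
    """Effective number of independent observations for N AR(1) observations."""
    correction = 1.0 - 2.0 * phi * (1.0 - phi ** N) / (N * (1.0 - phi ** 2))
    return N / ((1.0 + phi) / (1.0 - phi) * correction)


def years_needed(n, monthly_corr, target, max_years=200):
    """Smallest whole number of years m for which n samples per year reach the target."""
    phi = monthly_corr ** (12.0 / n)   # assumed correlation between successive samples
    for m in range(1, max_years + 1):
        if n_eff_ar1(n * m, phi) >= target:
            return m
    return None


def best_samples_per_year(monthly_corr, R, target=26.3, candidates=(1, 2, 3, 4, 6, 12)):
    """Return (n, relative cost, m) minimizing (R + n) * m over the candidate n."""
    best = None
    for n in candidates:
        m = years_needed(n, monthly_corr, target)
        if m is None:
            continue
        cost = (R + n) * m
        if best is None or cost < best[1]:
            best = (n, cost, m)
    return best


# Example: monthly serial correlation 0.4, yearly cost equal to ten sample costs.
print(best_samples_per_year(0.4, 10.0))
```

The sketch shows the trade-off the tables capture: frequent sampling shortens the program but yields correlated, less informative samples, while a large fixed yearly cost (large R) favours finishing in fewer years.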

 F.2          Derivation of Equation  (F.6)

              A series of periodic ground water measurements following an
 autoregressive (AR(1)) process can be described by the following equation (see Box and
 Jenkins (1970) for details):

              $x_t = \mu + z_t$,  with  $z_t = \phi z_{t-1} + a_t$                    (F.7)

 where:
              $x_t$ = the measurement at time t;

              $\mu$ = the long-term (attainment) mean concentration;

              $\phi$ = the serial correlation between successive measurements;
              $a_t$ = a random change from the measurement at time t-1 to time t such that
                    $z_t - \phi z_{t-1} = a_t$.  The $a_t$ are assumed to be independent and have a
                    mean of zero and a variance of $\varepsilon^2$; and
              $z_t$ = the difference between the mean being estimated and the measurement
                    at time t. The values $z_t$ will have a mean of zero.

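              The mean of N successive observations is

              $\bar{x} = \dfrac{1}{N} \sum_{k=0}^{N-1} x_{t-k} = \mu + \dfrac{1}{N} \sum_{k=0}^{N-1} z_{t-k} = \mu + \bar{z}$            (F.8)

             The variances of $z_t$ and $\bar{z}$ are derived below. Note that the variances of $x_t$ and
$z_t$ are the same, written $V(x_t) = V(z_t)$; also, $V(\bar{x}) = V(\bar{z})$.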

             The following relationships are used in the derivation of the variance:

              $\dfrac{1}{1-\phi} = 1 + \phi + \phi^2 + \phi^3 + \ldots$                   (F.9)
and

              $1 + \phi + \phi^2 + \ldots + \phi^{N-1} = \dfrac{1-\phi^N}{1-\phi}$                   (F.10)

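F.2.1        Variance of $z_t$

             The variance of $z_t$ is:

              $V(z_t) = E\left[ (z_t - E[z_t])^2 \right]$                      (F.11)
             Here E[ ] indicates the expected value of the term inside the brackets.

             Since $E[z_t]$ is zero, the variance can be written as:

              $V(z_t) = E[z_t^2]$                        (F.12)

              $= E\left[ \left( \sum_{i=0}^{\infty} \phi^i a_{t-i} \right)^2 \right]$                      (F.13)

             Since the expected values of all the cross product terms are zero (i.e.,
$E[a_t a_{t-i}] = 0$ for $i \neq 0$), they have been dropped from the summation, leaving
$\sum_{i=0}^{\infty} \phi^{2i} E[a_{t-i}^2]$.
             Since $E[a_t^2] = \varepsilon^2$,

              $V(z_t) = \varepsilon^2 (1 + \phi^2 + \phi^4 + \ldots)$             (F.17)
             Using equation (F.9):
              $V(z_t) = \dfrac{\varepsilon^2}{1-\phi^2}$                     (F.18)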
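F.2.2        Variance of $\bar{z}$
Note that $\bar{z}$ can be expressed as

              $\bar{z} = \dfrac{1}{N} \sum_{k=0}^{N-1} z_{t-k} = \dfrac{1}{N} \sum_{k=0}^{N-1} \sum_{i=0}^{\infty} \phi^i a_{t-k-i}$                     (F.19)

              $= \sum_{j=0}^{N-1} \dfrac{1-\phi^{j+1}}{N(1-\phi)} a_{t-j} + \sum_{j=N}^{\infty} \dfrac{(1-\phi^N)\phi^{j-N+1}}{N(1-\phi)} a_{t-j}$                     (F.20)
             This last relationship is illustrated in Table F.1 for the case where N = 3.
             The variance of $\bar{z}$ is:
              $V(\bar{z}) = E\left[ (\bar{z} - E[\bar{z}])^2 \right]$                     (F.21)
              Since $E[\bar{z}]$ is zero, the variance can be written as:
              $V(\bar{z}) = E[\bar{z}^2]$                     (F.22)

              $= E\left[ \left( \sum_{j=0}^{N-1} \dfrac{1-\phi^{j+1}}{N(1-\phi)} a_{t-j} + \sum_{j=N}^{\infty} \dfrac{(1-\phi^N)\phi^{j-N+1}}{N(1-\phi)} a_{t-j} \right)^2 \right]$                     (F.23)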
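Table F.1     Coefficients for the terms $a_t$, $a_{t-1}$, etc., in the sum of three successive
             correlated observations

                                          term
observation   $a_t$    $a_{t-1}$    $a_{t-2}$    $a_{t-3}$    $a_{t-4}$    $a_{t-5}$
$z_t$         1        $\phi$       $\phi^2$     $\phi^3$     $\phi^4$     ...
$z_{t-1}$              1            $\phi$       $\phi^2$     $\phi^3$     ...
$z_{t-2}$                           1            $\phi$       $\phi^2$     ...
$\bar{z}$ =   $\frac{1}{3}$    $\frac{1-\phi^2}{3(1-\phi)}$    $\frac{1-\phi^3}{3(1-\phi)}$    $\frac{(1-\phi^3)\phi}{3(1-\phi)}$    $\frac{(1-\phi^3)\phi^2}{3(1-\phi)}$    $\frac{(1-\phi^3)\phi^3}{3(1-\phi)}$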
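             Since the expected values of all the cross product terms are zero (i.e.,
$E[a_t a_{t-i}] = 0$ for $i \neq 0$), they have been dropped from the summation.
             Since $E[a_t^2] = \varepsilon^2$,

              $V(\bar{z}) = \dfrac{\varepsilon^2}{N^2(1-\phi)^2} \left[ \sum_{j=1}^{N} (1-\phi^j)^2 + (1-\phi^N)^2 \sum_{i=1}^{\infty} \phi^{2i} \right]$                     (F.27)
             Using equations (F.9) and (F.10):

              $V(\bar{z}) = \dfrac{\varepsilon^2}{N^2(1-\phi)^2} \left[ N - \dfrac{2\phi(1-\phi^N)}{1-\phi} + \dfrac{\phi^2(1-\phi^{2N})}{1-\phi^2} + \dfrac{\phi^2(1-\phi^N)^2}{1-\phi^2} \right]$                     (F.28)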

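             This can be simplified to:

              $V(\bar{z}) = \dfrac{\varepsilon^2}{N(1-\phi)^2} \left[ 1 - \dfrac{2\phi(1-\phi^N)}{N(1-\phi^2)} \right]$                     (F.29)
             Combining equations (F.5), (F.18), and (F.29):

              $N_{eff} = \dfrac{V(z_t)}{V(\bar{z})} = \dfrac{N}{\dfrac{1+\phi}{1-\phi} \left[ 1 - \dfrac{2\phi(1-\phi^N)}{N(1-\phi^2)} \right]}$                     (F.30)
             Note that the denominator in equation (F.30) has the term $\dfrac{1+\phi}{1-\phi}$ multiplied
by a "correction term" which is usually close to 1.0 and approaches 1.0 as the sample size
increases.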
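
A closed-form result of this kind can be checked numerically against a direct calculation. The sketch below assumes the standard AR(1) property that the correlation between observations k steps apart is the lag 1 correlation raised to the power k, so that the variance of the mean is the sum of all pairwise correlations divided by N squared; the closed form is the standard effective-sample-size expression with the leading term and correction term described in the note above. Function names are hypothetical.

```python
def n_eff_closed_form(N, phi):
    """N divided by (1+phi)/(1-phi) times the correction term."""
    correction = 1.0 - 2.0 * phi * (1.0 - phi ** N) / (N * (1.0 - phi ** 2))
    return N / ((1.0 + phi) / (1.0 - phi) * correction)


def n_eff_direct(N, phi):
    """V(z_t) / V(mean), summing corr(z_i, z_j) = phi**|i-j| over all pairs."""
    total = sum(phi ** abs(i - j) for i in range(N) for j in range(N))
    return N * N / total


def correction_term(N, phi):
    """The bracketed correction term, which approaches 1.0 as N increases."""
    return 1.0 - 2.0 * phi * (1.0 - phi ** N) / (N * (1.0 - phi ** 2))


for N in (4, 16, 64):
    print(N, round(n_eff_closed_form(N, 0.4), 4), round(n_eff_direct(N, 0.4), 4),
          round(correction_term(N, 0.4), 4))
```

The two calculations agree to rounding error, and the printed correction term moves toward 1.0 as N grows, consistent with the note above.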
F.3          Derivation of the Sample Size Equation

             When the variance is known, the sample size for a hypothesis test of the
mean is shown in equation (F.4).  When the variance, $\sigma^2$, is to be estimated from the data,
use of the t statistic is recommended, as shown below, where $\hat{\sigma}^2$ is the estimate of $\sigma^2$:

              $N = \hat{\sigma}^2 \left[ \dfrac{t_{N-1;1-\beta} + t_{N-1;1-\alpha}}{Cs - \mu_1} \right]^2$                     (F.31)
             To use this equation, the recommended procedure is to substitute the normal
statistic for the t statistic (e.g., $z_{1-\beta}$ for $t_{N-1;1-\beta}$), calculate a preliminary sample size from
which the degrees of freedom can be estimated, and use this to determine t and a new
estimate of the sample size. For small sample sizes, a third or fourth estimate of the sample
size may be required.

             Using equation (F.31) the exact sample size satisfies the following equation:

              Sample size (t) = $N_t = \hat{\sigma}^2 \left[ \dfrac{t_{N_t-1;1-\beta} + t_{N_t-1;1-\alpha}}{Cs - \mu_1} \right]^2$                     (F.32)
              Using the conditions which satisfy equation (F.32), the calculated sample
size using (F.4) would be:

              Sample size (z) = $N_z = \hat{\sigma}^2 \left[ \dfrac{z_{1-\beta} + z_{1-\alpha}}{Cs - \mu_1} \right]^2$                     (F.33)
             The difference between these two sample size estimates where $\alpha$ = .10 and
$\beta$ = .10 is shown in Figure F.1.
Figure F.1    Differences in Sample Size Using Equations Based on a Normal Distribution
             (Known Variance) or a t Statistic, Assuming $\alpha$ = .10 and $\beta$ = .10
             [Figure not reproduced. Vertical axis: Sample size (t) - Sample size (z),
             marked from 0.5 to 2.5. Horizontal axis: Sample size (t), marked from 10 to 30.]
             Note that the difference in the sample sizes using equations (F.4) and
(F.31) is fairly constant over a wide range of possible sample sizes. This property can be
used to estimate the sample size based on equation (F.31) from equation (F.4). Thus:

              $N = \hat{\sigma}^2 \left[ \dfrac{z_{1-\beta} + z_{1-\alpha}}{Cs - \mu_1} \right]^2 + K$                     (F.34)

where K is a constant which will depend on $\alpha$ and $\beta$. Table F.2 tabulates K at a sample
size of 20, for selected values of $\alpha$ and $\beta$.

             The equations for sample size in the text use equation (F.34) with K = 2.
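
As a brief illustration of the adjustment just described, the sketch below adds the constant K to the normal-based sample size and rounds up to a whole number of samples. The normal-based term is assumed to take the usual form (estimated variance times the squared ratio of the summed normal percentile points to the difference between the cleanup standard and the alternative mean); rounding up and the function name are assumptions made for the example.

```python
import math
from statistics import NormalDist


def sample_size_with_k(variance_estimate, cleanup_standard, mu1, alpha, beta, K=2.0):
    """Normal-based sample size plus a constant K, rounded up to a whole sample."""
    z = NormalDist().inv_cdf
    n_normal = variance_estimate * ((z(1 - beta) + z(1 - alpha)) / (cleanup_standard - mu1)) ** 2
    return math.ceil(n_normal + K)


# With alpha = beta = .10, Cs = 1, alternative mean .5 and estimated variance 1.0,
# the normal-based size is about 26.3, so adding K = 2 and rounding up gives 29.
print(sample_size_with_k(1.0, 1.0, 0.5, 0.10, 0.10))
```

Using a single constant avoids the iterative lookup of t percentile points, at the price of a slightly conservative or slightly optimistic sample size when the true difference differs from 2.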
Table F.2    Differences between the calculated sample sizes using a t distribution and a
             normal distribution when the sample size based on the t distribution is 20,
             for selected values of $\alpha$ (Alpha) and $\beta$ (Beta)

                              Alpha
             Beta     .25     .10     .05     .025    .01
             .25      0.8     1.2     1.6     2.1     2.7
             .10      1.2     1.4     1.7     2.0     2.6
             .05      1.6     1.7     1.9     2.2     2.7
             .025     2.1     2.0     2.2     2.5     2.9
             .01      2.7     2.6     2.7     2.9     3.2
F.4          Effective Df for the Mean from an AR1 Process
             The following formula is appropriate for estimating the variance of the mean
of n observations from an AR1 series, assuming a large sample size:

              $s^2_{mean} = \dfrac{s^2}{n} \cdot \dfrac{1+\hat{\phi}}{1-\hat{\phi}}$                     (F.35)
             If the serial correlation is assumed to be zero, then $s^2$, the estimated variance
of the data, has a scaled chi-square distribution with n-1 degrees of freedom.  The mean of
a chi-square distribution is $\nu$, the degrees of freedom, with a variance of $2\nu$. Thus, the
coefficient of variation squared is $cv^2 = \dfrac{2\nu}{\nu^2} = \dfrac{2}{\nu}$.

             With zero serial correlation, $\hat{\phi}$ will have a mean of zero and variance of $\dfrac{1}{n}$
(Box and Jenkins, 1970).  The term $\dfrac{1+\hat{\phi}}{1-\hat{\phi}} \approx 1 + 2\hat{\phi}$ (for small $\hat{\phi}$) has a mean of roughly
1 and a variance of approximately $\dfrac{4}{n}$. The $cv^2$ is also approximately $\dfrac{4}{n}$ since the mean is approximately 1.

             Assuming a large sample size, the cv of the product of two estimates is
equal to the square root of the sum of the squares of the cv's for each term if the terms are
independent (which will be true if the serial correlation is zero). Thus, the $cv^2$ of $s^2_{mean}$ is
roughly the sum of two $cv^2$'s: 1) the chi-square distribution, and 2) the correction term
based on $\hat{\phi}$.  Thus the

              $cv^2(s^2_{mean}) \approx \dfrac{2}{n} + \dfrac{4}{n} = \dfrac{6}{n}$                     (F.36)
             Assuming that the distribution of $s^2_{mean}$ is roughly chi-square, then the
effective number of degrees of freedom for $s^2_{mean}$ is $\nu'$ where $\dfrac{2}{\nu'} \approx \dfrac{6}{n}$, or $\nu' \approx \dfrac{n}{3}$.
             Simulations appear to be consistent with this result when $\phi$ = 0, and suggest
that the number of degrees of freedom drops further when $\phi$ > 0.
F.5          Sequential Tests for Assessing Attainment

             The following paper, prepared by Westat, has been included in this
appendix as it was submitted for publication.
    Assessing Attainment of Ground Water Cleanup Standards Using
                            Modified  Sequential  t-Tests

                        By John Rogers, Westat, Rockville, Maryland

    Assessing the attainment of Superfund cleanup standards in ground water can be complex
    due to measurements with skewed distributions,  seasonal or  periodic patterns, high
    variability, serial correlations, and censoring of observations below the laboratory detection
    limit.  The attainment  decision is further complicated by trends and transient changes in the
    concentrations as a result of the cleanup effort. EPA  contracted Westat to prepare a
    guidance document recommending statistical procedures for assessing the attainment of
    ground water cleanup standards. The recommended statistical procedures were to require a
    minimum of statistical training. The recommended procedures included a sequential t-test
    based  on yearly average concentrations.

    Further research and simulations by Westat indicate that modifications of the sequential t-
    test have better performance and are easier to use than the originally proposed sequential t-
    test, particularly with highly skewed data. This paper presents three modified sequential
    tests with simulation results showing how the sequential t-test and the modifications
    perform under a variety of situations similar to those found in the field. The modified tests
    use an easy-to-calculate approximation for the log likelihood ratio and an adjustment to
    improve the power of the test for small sample sizes. Using the log transformed yearly
    averages improves the test performance with skewed data. Expected sample sizes and
    practical considerations for application of these tests are also discussed.

Key words: Sequential t-test,  Simulations, Ground water, Superfund.


1.     Introduction

EPA contracted Westat  to prepare a draft guidance document recommending sampling  and
statistical  methods for evaluating the attainment of ground-water cleanup standards  at Superfund
sites.  The  recommended statistical methods were to be applicable to  a  variety of site conditions and
be able to be implemented by technical  staff with a minimum of statistical training.

The draft  document included  an introduction to basic statistical procedures and recommended a
variety of  statistical methods including a sequential  t-test.  Although  the sequential t-test has several
advantages for testing ground  water, one significant disadvantage is the relative complexity of the
calculations, requiring use of the noncentral t distribution. Additional research was undertaken by
Westat to  find an alternative to the standard sequential t-test which is easier to implement. As  part
of this research, simulations have been used to  evaluate the  performance of the sequential t-test and
several modifications of it.

This paper presents these simulation results showing how the sequential t-test and the modified
tests perform under a variety of situations similar to  those found in the field.

    The Problem of Assessing  Ground Water at a Superfund Site

The history of contamination and cleanup at a Superfund site will result in ground water
contaminant concentrations which generally (1) increase during periods of contamination, (2)
decrease during remediation, and (3) settle into dynamic equilibrium with the surrounding
environment after remediation, at which point the success of the remediation can be determined.

¹This research was supported by Westat.
²EPA contract 68-01-7359.

Specifying the attainment objectives and assessing attainment of cleanup standards can be
complicated by many site specific factors, including: multiple wells, multiple contaminants, and
data which have seasonal patterns, serial correlations, significant lab measurement variation, non-
constant variance,  skewed  distributions, long-term trends,  and censored  values below the detection
limits.  The general characteristics  of ground water quality data have  been discussed by Loftis et  al.
(1986). All of these factors complicate the specification of an appropriate  statistical test. Figure 1
illustrates the variation which might be  found in monthly ground water measurements, using
simulated observations.

    The  Statistical  Problem to  be Discussed

The following statistical problem is addressed in this paper. Suppose remediation is complete and
any transient effects of the remediation on the ground water levels and flows have dissipated. We
then wish to determine if the mean concentration of a contaminant, μ, is less than the relevant
cleanup standard, μ0. The ground water will be judged to attain the cleanup standard if the null
hypothesis, H0: μ ≥ μ0, can be rejected based on a statistical test. The power of the test, the
probability of rejecting the null hypothesis, is to be α when μ = μ0. For a specified alternate
hypothesis, H1: μ = μ1 (0 < μ1 < μ0), the power is to be 1-β, where β is the probability of a false
negative decision (the probability of incorrectly accepting the null hypothesis).

The statistical tests considered in this paper are the sequential t-test for comparing means and
modifications of this test. Using a sequential procedure, a test of hypothesis is performed after
each sample, or set of samples, is collected. The test of hypothesis results in one of three possible
outcomes: (1) accept the null hypothesis, (2) reject the null hypothesis, or (3) continue sampling.
The hypothesis is tested based on the n ground water samples, x1 to xn, collected prior to the test
of hypothesis. The sample size at the termination of the test is a random variable. The power and
sample size distribution of the sequential tests were evaluated using Monte Carlo simulations. For
the simulations the following parameters were varied: the mean, standard deviation, detection
limit, proportion of the variation which is serially correlated versus independent, lag 1 serial
correlation, alpha and beta, distribution (normal or lognormal), and μ1. For all simulations μ0 is
set at 1.0. Unless otherwise noted, 1000 simulations were made for each set of parameters tested.
Simulations were performed using SAS version 6.
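
As an illustration only, the following Python sketch shows one way simulated measurements with the listed characteristics might be generated. It is not the SAS code used for the paper. The function name, the split of the variance into an AR(1) part and an independent part, and the choice to report censored values at the detection limit are all assumptions made for this example.

```python
import math
import random


def simulate_measurements(n, mean, sd, lag1_corr=0.0, corr_fraction=0.0,
                          detection_limit=None, lognormal=False, seed=None):
    """Return n simulated concentrations (hypothetical helper).

    corr_fraction is the share of the variance carried by an AR(1) component
    with lag 1 correlation lag1_corr; the remainder is independent noise.
    """
    rng = random.Random(seed)
    ar_sd = math.sqrt(corr_fraction)            # stationary sd of the AR(1) part
    noise_sd = math.sqrt(1.0 - corr_fraction)   # sd of the independent part
    innovation_sd = ar_sd * math.sqrt(1.0 - lag1_corr ** 2)
    if lognormal:
        # Log-scale parameters that give the requested mean and sd.
        log_sd = math.sqrt(math.log(1.0 + (sd / mean) ** 2))
        log_mean = math.log(mean) - 0.5 * log_sd ** 2
    ar = rng.gauss(0.0, ar_sd)                  # start in the stationary state
    values = []
    for _ in range(n):
        ar = lag1_corr * ar + rng.gauss(0.0, innovation_sd)
        u = ar + rng.gauss(0.0, noise_sd)       # deviate with unit variance
        x = math.exp(log_mean + log_sd * u) if lognormal else mean + sd * u
        if detection_limit is not None and x < detection_limit:
            x = detection_limit                 # assumption: censored values set to the limit
        values.append(x)
    return values


# Example: 48 monthly values, half of the variance serially correlated.
monthly = simulate_measurements(48, mean=0.5, sd=0.3, lag1_corr=0.4,
                                corr_fraction=0.5, detection_limit=0.2, seed=1)
```

Passing a seed makes a run repeatable, which is convenient when comparing several tests on the same simulated series.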

Section 2 reviews and compares the fixed sample size and sequential t-tests. Sections 3 and 4
discuss the performance of  the t-test and several modifications when applied to normally
distributed and independent observations. The performance of the sequential tests when applied to
simulated ground water data is evaluated in Section 5. Section 6 discusses the results and presents
the conclusions.


2.     Fixed Versus Sequential Tests

The fixed sample size test and sequential t-test are reviewed briefly below, emphasizing factors
which are relevant to the development of a modified test and for selecting a test for  assessing
ground water.


    Fixed Sample Size t-Test

 The fixed sample size t-test, familiar to many users of statistics, requires the following steps:

(1)    Estimate the variance of the future measurements, σ̂², based on available data;
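(2)    Determine sample size n, such that,

           $$ n = \hat{\sigma}^2 \left( \frac{t_{\alpha,n-1} + t_{\beta,n-1}}{\mu_0 - \mu_1} \right)^2 $$        eq.(1)

       where $t_{\alpha,n-1}$ is the α percentile of the t distribution with n-1 degrees of freedom.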
(3)    Collect n samples and measure the contaminant concentrations;
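(4)    Calculate the test statistic t, with n-1 degrees of freedom, where:

           $$ t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}, \qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s_{\bar{x}} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n(n-1)}} $$

(5)    Conclude that the ground water attains the cleanup standard if $t < t_{\alpha,n-1}$; otherwise, accept
       the null hypothesis that the ground water does not attain the cleanup standard.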
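
Steps (4) and (5) can be sketched in Python as follows. This is a minimal illustration, not part of the original procedure: the function name is hypothetical, the data are made up, and the critical value is assumed to be looked up by the user in a standard t table.

```python
import math
import statistics


def fixed_sample_t_test(samples, cleanup_standard, t_critical):
    """Return (t, attains) for the one-sided test of H0: mean >= cleanup_standard.

    t_critical is the alpha percentile of the t distribution with n-1 degrees
    of freedom (negative for small alpha), supplied by the caller.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed to estimate the variance")
    mean = statistics.fmean(samples)
    std_error = statistics.stdev(samples) / math.sqrt(n)  # standard error of the mean
    t = (mean - cleanup_standard) / std_error
    return t, t < t_critical


# Hypothetical concentrations with a cleanup standard of 1.0.
# -2.132 is the 0.05 percentile of t with 4 degrees of freedom (standard table value).
t, attains = fixed_sample_t_test([0.5, 0.6, 0.7, 0.4, 0.8], 1.0, -2.132)
```

Here the sample mean is 0.6 and t is roughly -5.66, well below the critical value, so the example data would be judged to attain the standard.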

The t-test does well to preserve the power of the test at the null hypothesis when the data have a
roughly normal distribution. However, the power at the alternate hypothesis depends on the
accuracy of the initial variance estimate, σ̂². Thus the fixed sample size test fixes α and n, leaving
β variable.

    Standard Sequential  t-Test

With normally distributed independent observations and known σ², an optimal sequential test is the
sequential probability ratio test (SPRT) (Wald 1947). When σ² is unknown, as here, one
approach is provided by the sequential t-test which states the null hypothesis in terms of the
unknown standard deviation (Rushton 1950, Ghosh 1970, and others). For testing hypotheses
about means, an alternative heuristic solution replaces the unknown variance by the sample
estimate at each step in the sequential test (Hall 1962, Hayre 1983). This second version of the
sequential t-test can be used to compare the mean to an established cleanup standard. Liebetrau
(1979) discussed the application of this test to water quality sampling.

The steps in  implementing the sequential t-test  for comparing the mean to a standard are:

(1)    Collect k-1 samples without testing the hypothesis.
(2)    Collect one additional sample for a total of n samples collected so far and calculate:

           $$ t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}, \qquad \delta_0 = 0, \qquad \delta_1 = \frac{\mu_1 - \mu_0}{s_{\bar{x}}} $$        eq.(2)

(3)    Calculate the likelihood ratio:

           $$ L = \frac{f_{n-1}(t \mid \delta = \delta_1)}{f_{n-1}(t \mid \delta = \delta_0)} $$        eq.(3)

       where $f_{n-1}(t \mid \delta)$ is the density of the noncentral t distribution with n-1 degrees of freedom
       and noncentrality parameter δ.
(4)    If $L > \frac{1-\beta}{\alpha}$, then reject the null hypothesis and conclude that the ground water attains the
       cleanup standard;
       if $L < \frac{\beta}{1-\alpha}$, then accept the null hypothesis that the ground water does not attain the cleanup
       standard;
       otherwise, return to step (2) and collect additional samples until a decision is reached.

Unlike the fixed sample size test, for the sequential test, α and β are fixed and n is variable.
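
The three-way decision in the final step of the sequential test can be sketched as below. The sketch assumes the usual Wald decision limits, (1-β)/α and β/(1-α), and takes the likelihood ratio L as an input; the function name and return labels are hypothetical.

```python
def sequential_decision(likelihood_ratio, alpha, beta):
    """Classify a likelihood ratio against the two decision limits."""
    upper = (1.0 - beta) / alpha     # above this: reject the null hypothesis
    lower = beta / (1.0 - alpha)     # below this: accept the null hypothesis
    if likelihood_ratio > upper:
        return "reject"              # ground water attains the cleanup standard
    if likelihood_ratio < lower:
        return "accept"              # ground water does not attain the standard
    return "continue"                # collect another sample


# With alpha = beta = 0.05 the limits are 19 and about 0.053.
outcome = sequential_decision(1.0, 0.05, 0.05)
```

With α = β = 0.05, a likelihood ratio of 1.0 falls between the two limits, so sampling would continue.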

   Comparison of the Sequential and Fixed Sample Size Test

Table 1 compares the sequential and fixed sample size tests based on several characteristics. The
choice of which  test to use depends  on the  circumstances in which the test is to be applied.
Table 1   Comparison of the fixed sample size and sequential t-test.

Characteristic    Sequential t-Test                     Fixed Sample Size t-Test
----------------  ------------------------------------  ------------------------------------
Power             Fixed at the null and alternate       Fixed at the null hypothesis. Power
                  hypothesis.                           at the alternate hypothesis depends
                                                        on the estimate of measurement
                                                        variance used for calculating
                                                        sample size.

Sample Size       Subject to variation, often less      Fixed.
                  than for a fixed sample size test
                  with the same power.

Sampling          Works well if the time between        Works well if the sample collection
                  collection of samples is long         period is short relative to the
                  relative to the analysis time.        analysis time.

Estimate of the   Biased.                               Unbiased.
mean

Ease of           Standard test requires tables of      Uses widely available tables.
Calculation       the non-central t distribution
                  which are not generally available.
                  Modified test reported here can be
                  easily calculated.

   Application of the Sequential Test to Ground Water Data

For testing contaminant concentrations against a cleanup standard, the sequential t-test has some
distinct advantages: (1) ground water sample collection is sequential with sample analysis time
often short compared to the sample collection period, (2) a good estimate of measurement variance
for calculating the sample size for the fixed test may not be available, (3) for assessing attainment,
the objective is to test a hypothesis rather than to obtain an unbiased estimate of the mean or
construct a confidence interval,  (4) reducing sample size can be important when the cost of
laboratory sample analysis is high, and (5) if the concentrations at the site are indeed below the
cleanup standard, maintaining the power at the alternate hypothesis can protect against incorrectly
concluding that additional costly cleanup is required.  For many users, the main disadvantage of
using the standard sequential t-test is the relative complexity of the calculations.

 3.    Power  and  Sample Sizes for the Sequential  t-Test  with  Normally
       Distributed  Data

For the purpose of describing the simulation results used to determine the power of the sequential
t-test, define the scale factor as the ratio of the standard deviation of the measurements to the
difference between the means for the null and alternate hypotheses:

     $$ \text{Scale factor} = \frac{\sigma}{\mu_0 - \mu_1} $$

Also let nfixed designate the sample size for a fixed sample size test with the same nominal power as
the sequential test being discussed, where nfixed
-------


          H0: μ = μ0 against H1: μ = μ1, power at μ0 = α, μ1 = 1-β (i.e., h0 = μ0);
          H0: μ = μ1 against H1: μ = μ0, power at μ0 = α, μ1 = 1-β (i.e., h0 = μ1).

Based on this symmetry, the nominal power of the sequential t-test is the same whether h0 = μ0 or
h0 = μ1. In practice, h0 serves as the zero point around which the parameters for the non-central
t distribution are calculated rather than the mean value at which the power is maintained, as in the
fixed sample size test. If the equations for the sequential test are modified to put the zero point
mid-way between μ0 and μ1, then (1) δ1 = -δ0, (2) only one non-central t distribution needs to be
evaluated, and (3) the power of the test is symmetric around h0 when α = β, i.e., the false positive
and false negative rates are equal. Although Rushton (1950) considered null hypotheses other than
zero and h0 = μ0, in this paper h0 is called the zero point rather than the null hypothesis. To avoid
confusion, the terms null and alternate hypothesis will be used as defined in Section 1, reflecting
the intentions of those performing the test.

Define the centered sequential t-test by replacing equations (2) by equations (4) and setting
the zero point for the calculations mid-way between μ0 and μ1, i.e.:

     $$ h_0 = \frac{\mu_0 + \mu_1}{2} $$        eq.(5)

This centered test is used in the following simulations to determine the relationship between power
and sample size.

   Changes in Power with Increasing Sample Size

Figure 3  shows the false decision rate  (false positive or false negative rate) and average sample size
for the centered sequential t-test with α and β set at 0.05 and the scale factor ranging from 0.4 to
3.6.  For this symmetric test, the false positive  and false negative rates are equal. The false
decision  rate at very low sample sizes is smaller than the nominal level of .05. As the scale factor
increases, resulting in increasing sample sizes, the false  decision rate increases to a maximum of
roughly three times the nominal level  and then  decreases slowly. The average sample size is
roughly half of that for the corresponding fixed sample size test except at very low sample sizes.
Similar patterns were seen in the false negative rates when the zero point was set at the  null
hypothesis.

The good performance of the test at low sample sizes is in part due to the discrete nature of the
sampling. From the sample just before the termination of the test to the sample which terminates
the test, the likelihood ratio jumps from inside the decision limits to outside. With  small sample
sizes, the likelihood ratio may be considerably beyond the decision limits on the last sample. This
is equivalent to having more information than is necessary to make the decision, resulting in
improved performance.


   Distribution of Sample  Sizes

Simulations were used to look at the distribution of sample sizes at the termination of the test, for
selected values of μ and scale factors of 1.0 and 3.0. Figure 4 shows the distribution of sample
sizes, using a log scale, when μ = μ1 and the scale factor equals 1.0. The sample sizes are
displayed separately for simulations which rejected the null hypothesis (correct decision) and those
which did not. For both decisions a relatively large proportion of the simulations terminate at a
sample size of two. The false decision rate is greater than the nominal value by roughly the
proportion of simulations terminating with only two samples. The modified sequential test, for
which the distribution of sample sizes is also shown in Figure 4, is discussed in the next section.


The general characteristics of the sample size distributions are the same regardless of the conditions
simulated. Sample sizes for the sequential t-test are highly skewed. For many simulations, the
test terminated with two samples.  For those simulations not terminating with two  or three
samples,  the distribution  of sample sizes was roughly  log-normal.


4.     Modifications to Simplify the Calculations and  Improve the Power

The poor performance of the centered sequential t-test at the alternate and null hypotheses and the
observation that many of the simulations which terminate at two samples contribute to the large
false decision rates suggest that a modification to the test might improve the performance. Other
authors have noted this problem and suggested alternate procedures. In particular, Hayre (1983)
suggested changing the test boundaries. Hayre's suggestion is equivalent to multiplying the
log likelihood ratio by the adjustment factor (n-d)/(n+c) where d < k and c > -d. Based on
heuristic arguments, Hayre concluded that k, the minimum number of samples, should be at least 5
if a large sample size is expected.

When small sample sizes are expected, requiring as many as 5 samples before the first test of
hypothesis can result in an overly conservative test.  In this research decision rules requiring a
minimum of 2, 3, or 4 samples were considered. In  addition, the performance of the centered
sequential t-test was simulated using adjustment factors of: 1, (n-1)/n, (n-2)/n, and (n-3)/n. The
simulations used α and β set at 0.10, 0.05, and 0.01.

The false decision rates for the four adjustment factors, with (α, β) = (0.05, 0.05), are shown in
Figure 5. All of the adjustment factors improved the performance of the test by reducing the
maximum probability of a false decision to values closer to the nominal value. The selection of an
optimal adjustment factor  requires specification of  the conditions  under which the test is to  be used.
One adjustment factor might be chosen if small sample sizes are expected, another if large sample
sizes are  expected. In all cases, the test is conservative for low sample sizes, possibly liberal for
intermediate sample sizes, and approaches the nominal values for large sample sizes. Over the
range of the scale factor considered in the simulations, the average false decision rate for the
adjustment factor (n-2)/n was closest to the nominal value. Therefore, this adjustment factor,
(n-2)/n, with k=3 was chosen for evaluation in subsequent simulations.

   Approximation for Non-central t

Calculation of the likelihood ratio using the noncentral t-distribution is difficult because the tables
are not generally available and are difficult to use. The use of the sequential t-test can therefore be
simplified by using an approximation to the log likelihood ratio of the two non-central t-
distributions. Rushton (1950) published three approximations for the log of the likelihood ratio.
Westat's analysis showed that the approximations  performed well, particularly when the zero point
for the test was set mid-way between the null and alternate hypotheses. Using Rushton's simplest
approximation and the adjustment factor selected above,  the equations for the modified
sequential t-test  become:

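     $$ h_0 = \frac{\mu_0 + \mu_1}{2}, \qquad t = \frac{\bar{x} - h_0}{s_{\bar{x}}}, \qquad \delta = \frac{\mu_1 - \mu_0}{s_{\bar{x}}} $$   and        eq. (5)

     $$ L = \exp\left( \delta \, \frac{n-2}{n} \, t \, \sqrt{\frac{n}{n-1+t^2}} \right) $$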
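
The modified test can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes the likelihood ratio takes the form L = exp(δ ((n-2)/n) t sqrt(n/(n-1+t²))), with the zero point mid-way between μ0 and μ1 and a minimum of k = 3 samples, and it uses the decision limits (1-β)/α and β/(1-α). The function name and the handling of a zero standard error are hypothetical.

```python
import math
import statistics


def modified_sequential_t_test(samples, mu0, mu1, alpha, beta, k=3):
    """Apply the test after each sample from the k-th on; return (decision, n)."""
    log_upper = math.log((1.0 - beta) / alpha)
    log_lower = math.log(beta / (1.0 - alpha))
    h0 = (mu0 + mu1) / 2.0                       # zero point mid-way between mu0 and mu1
    n = 0
    for n in range(k, len(samples) + 1):
        data = samples[:n]
        std_error = statistics.stdev(data) / math.sqrt(n)
        if std_error == 0.0:
            continue                             # assumption: no variance estimate yet, keep sampling
        t = (statistics.fmean(data) - h0) / std_error
        delta = (mu1 - mu0) / std_error
        # Assumed approximation to the log likelihood ratio, with the (n-2)/n adjustment.
        log_l = delta * (n - 2) / n * t * math.sqrt(n / (n - 1 + t * t))
        if log_l > log_upper:
            return "reject", n                   # attains the cleanup standard
        if log_l < log_lower:
            return "accept", n                   # does not attain the cleanup standard
    return "continue", n


# Hypothetical data well below a cleanup standard of 1.0.
decision, n_used = modified_sequential_t_test([0.5, 0.6, 0.4, 0.55, 0.45],
                                              mu0=1.0, mu1=0.5, alpha=0.05, beta=0.05)
```

Working on the log scale avoids overflow when the exponent is large. In the example the first three values already give a log likelihood ratio of about 4.75, above ln(19), so the test stops at n = 3.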


Figure 4 shows the distribution of sample sizes for the modified test compared to that for the
standard sequential t-test. Figure 6 shows the power curve and average sample sizes for the
modified test with α = β and scale factor = 1.6. Figure 6 can be compared directly with Figure 2
for the standard sequential t-test.

    Termination of the Test Before a Decision Has Been Reached

Figure 7 shows the distribution of sample sizes for selected values of μ, the mean of the simulated
measurements, using the modified test with scale factor of 1.6. As noted before, the distribution
of the sample sizes is roughly log-normal. The minimum sample size is 3 because a minimum of
three samples are required before the first test of hypothesis. The mean sample size is generally
similar to or less than nfixed. The 95th percentile of the sample sizes is less than three times nfixed
and, for values of μ close to the null and alternate hypothesis, is generally similar to or less than
Several authors, including Wald, have suggested that, for practical purposes, the sequential test
can be terminated after some fixed large number of samples if the test has not otherwise terminated,
with the decision going to whichever hypothesis is more favored at termination. Figure 7
suggests that a decision rule terminating the test with a maximum sample size of three times nfixed
is reasonable because very few tests would be terminated early when the true mean is close to the
null or alternate hypothesis. When the mean is mid-way between the null and alternate hypothesis,
acceptance of the null hypothesis is essentially random, and early termination will not affect the
power of the test.

Simulations were performed to evaluate different termination rules. One hundred simulations were
run for all combinations of: termination at 1, 2, 3, 4, and 5 times nfixed; four scale factors from 0.4
to 3.6; α = β = 0.1, 0.05, 0.01; and μ1 = 0.5. In addition, 100 simulations were run for all
combinations of: 11 values of μ from 0.35 to 1.15; termination at 1, 2, 3, and 4 times the fixed
sample size; scale factor = 1.6; and α = β = 0.05. The differences in the power due to early
termination were not statistically significant. Early termination resulted in a decrease in the average
sample size with μ mid-way between the null and alternate hypotheses; however, with μ at the null
or alternate hypothesis, changes in the average sample size were, practically speaking,
insignificant.

These results indicate that early termination of the sequential test will have little effect on the power
of the test. Because the fixed sample size is estimated from σ̂² based on data available before
sampling and is therefore subject to error, it is recommended that sequential tests not be terminated
until the sample size is at least twice the estimated sample size for an equivalent fixed sample size
test. For the simulations reported in other sections of this paper, the sequential tests were
terminated if the sample size exceeded 5 times nfixed.
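
A termination rule of this kind might be sketched as below. The sketch assumes that "more favored" means the sign of the log likelihood ratio (positive favors the alternate hypothesis), which is an interpretation made for this example; the function and parameter names are hypothetical.

```python
import math


def decide_with_termination(log_l, n, n_fixed, alpha, beta, max_multiple=5):
    """Three-way sequential decision with forced termination at large n."""
    if log_l > math.log((1.0 - beta) / alpha):
        return "reject"                          # attains the cleanup standard
    if log_l < math.log(beta / (1.0 - alpha)):
        return "accept"                          # does not attain the cleanup standard
    if n > max_multiple * n_fixed:
        # Assumption: the hypothesis favored at termination is given by the sign of log L.
        return "reject" if log_l > 0.0 else "accept"
    return "continue"


# No boundary crossed after 51 samples with n_fixed = 10: the test is forced to a decision.
forced = decide_with_termination(0.5, 51, 10, 0.05, 0.05)
```

Setting max_multiple to 2 or more follows the recommendation above not to terminate before twice the estimated fixed sample size.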


5.     Application to  Ground Water Data from Superfund Sites

The  modified sequential t-test performs  well  with normally distributed data, having  average sample
sizes below those for equivalent fixed sample  size tests and power close to the nominal power.
However, ground water measurements may be skewed, serially correlated, censored, and have
seasonal patterns. How well does the modified  test perform with ground water data? Simulations
were used to determine how four sequential tests  performed when assessing  ground-water data.

For all statistical  tests, the following sequential sample design is  assumed: m  ground water samples
are collected at  periodic intervals throughout the year, with  at least 4 samples per year. The
samples  are  analyzed and the test of hypothesis is performed once per year starting after three years
of data are  collected.  The number of years  of data collection is n.



The four statistical tests  evaluated using the simulations are:

1)   Standard sequential t-test described in  section 2 using the yearly averages;

2)   Modified sequential t-test using the yearly averages;

3)   Modified sequential t-test with adjustments for seasonal variation and serial correlation:

        Remove seasonal patterns from the data using one-way analysis of variance. Calculate
        the standard error, Se, and the lagl serial correlation of the residuals, r. Estimate the
        standard error of the mean as:
                                 9 1+r   • UT^
                               Se2 J7 withDf=
        The effective sample size is assumed to be one more than the number of degrees of
        freedom. Therefore:

                        L = exp


4)   Modified sequential t-test with an adjustment for skewness:

        Calculate y = ln(yearly average). Estimate the log transformed mean and its standard
        error using  the following equations:
                                                      t-1
        The test statistic for the sequential t-test uses:

            $$ h_0 = \frac{\ln(\mu_0) + \ln(\mu_1)}{2}, \qquad t = \frac{\ln(\bar{x}) - h_0}{s_{\ln(\bar{x})}}, \qquad \text{and} \qquad \delta = \frac{\ln(\mu_1) - \ln(\mu_0)}{s_{\ln(\bar{x})}} $$

The first, second and fourth tests use the yearly average concentrations, averaging across the
within year seasonal patterns.  The serial correlation between the yearly averages is less than
between individual observations, reducing the influence of correlation on the test results.  The third
test removes the seasonal patterns.  The standard error of the mean is adjusted by a factor which
accounts for the serial correlation,  assuming an AR(1) model and many observations per year.
Although this assumption may not  be correct, the lag 1 correlation is expected to dominate the
correlations for higher lags, making  the AR(1) model a reasonable approximation to the data. The
effective degrees of freedom for the standard error is based on asymptotic approximations.  The
fourth test is based on the assumption that the yearly averages have a log normal distribution. For
highly skewed data this assumption is more reasonable than assuming a normal distribution.  The
mean and standard error of the mean are first order approximations based on  a lognormal
distribution.
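
The first part of the third test, removing seasonal patterns and computing the lag 1 serial correlation of the residuals, can be sketched as follows. Subtracting each season's mean is what a one-way analysis of variance on season amounts to. The sketch stops at the residuals and r; it does not attempt the standard error or degrees of freedom adjustments. Function names are hypothetical.

```python
def deseasonalized_residuals(measurements, samples_per_year):
    """Subtract each season's mean from the measurements taken in that season."""
    m = samples_per_year
    n_years = len(measurements) // m
    data = measurements[:n_years * m]            # assumption: drop an incomplete final year
    season_means = [sum(data[s::m]) / n_years for s in range(m)]
    return [x - season_means[i % m] for i, x in enumerate(data)]


def lag1_correlation(residuals):
    """Lag 1 serial correlation of a series of residuals."""
    mean = sum(residuals) / len(residuals)
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        return 0.0                               # no variation left after removing seasons
    return sum(dev[i] * dev[i + 1] for i in range(len(dev) - 1)) / denom


# Made-up quarterly data with a seasonal pattern and a rising level.
resid = deseasonalized_residuals([1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6], 4)
r = lag1_correlation(resid)
```

In the example the seasonal means are 2, 3, 4, and 5, the residuals are -1, 0, and 1 within successive years, and r works out to 0.75.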
The second test was expected to perform well with data which has an approximately normal
distribution. The third test was expected to perform best with data with significant correlation and
little skewness. The fourth test was expected to perform best with highly skewed data.
Simulations were performed to test these assumptions.

-------
Figure 8  Range of False Positive Rates for Scale Factors from 1.6 to 4.8 for Four Sequential Tests, by Data Type

[Plot not reproduced. Data types shown: Normal; Base (cv = 0.5, 4 samp/yr); 12 samp/yr; Skewed (cv = 1.5); Censored (30%).]
-------
Figure 9  Range of False Negative Rates for Scale Factors from 1.6 to 4.8 for Four Sequential Tests, by Data Type

[Plot not reproduced. Vertical axis: false negative rate, 0 to 0.35. Data types shown: Normal; Base (cv = 0.5); 12 samp/yr; Skewed (cv = 1.5); Censored (30%); Correlated; Skewed & ...]

-------
                          APPENDIX G: GLOSSARY

Alpha (α)  In the context of a statistical test, α is the probability of a Type I error.

Alternative Hypothesis  See hypothesis.

Analysis Plan The plan that specifies how the data are to be analyzed once they have
          been collected. It includes what estimates are to be made from the data, how the
          estimates are to be calculated, and how the results of the analysis will be
          reported.

Autocorrelation See serial correlation.

Attainment This term by itself refers to the successful achievement of the attainment
          objectives. In brief, attainment means that site contamination has been reduced
          to or below the  level of the  cleanup standard.

Attainment Objectives The attainment objectives refer to a set of site descriptors and
          parameters together with standards as to what the desired level should be for the
          parameters. These are usually decided upon by the courts and the responsible
          parties. For example, these objectives usually include the chemicals to  be
          tested, the cleanup standards to be attained, the measures or parameters to  be
          compared to the cleanup standard, and the level of confidence required if the
          environment and human health are to be protected (Chapter 3).

Beta (β)  In the context of a statistical test, β is the probability of a Type II error.

Binomial Distribution A probability distribution used to describe the number of
          occurrences of a specified  event in n independent trials. In this manual, the
          binomial distribution is used to develop statistical tests concerned with testing
          the proportion of ground water samples that have excessive concentrations of a
          contaminant (see Chapters 8  and 9). For example, suppose the parameter of
          interest is the portion (or percent) of the ground water wells that exceed a level
          specified by the cleanup standard, Cs. Then one might estimate that portion by
          taking a sample of 10 wells and counting the number of wells that exceed the
          Cs. Such a sampling process results in a binomial distribution. For additional
          details  about  the binomial distribution, consult  Conover (1980).
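
The 10-well example can be worked numerically with a short Python sketch. The function names are hypothetical, and the assumed true exceedance proportion of 0.1 is made up for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k occurrences in n independent trials."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def prob_at_most(k, n, p):
    """Probability of k or fewer occurrences in n independent trials."""
    return sum(binomial_pmf(j, n, p) for j in range(k + 1))


# If 10 percent of wells truly exceed Cs, chance that none of 10 sampled wells does.
p_none = binomial_pmf(0, 10, 0.1)
```

With these assumed values the probability that none of the 10 wells exceeds the Cs is 0.9 raised to the tenth power, about 0.35.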

Central Limit Theorem  If X has a distribution with mean μ and variance σ², then
          the sample mean X̄, based on a random sample of size n, has an approximately
          normal distribution with mean μ and variance σ²/n. The approximation becomes
          increasingly good as n increases. In other words, no matter what the original
          distribution of X (so long as it has a finite mean and variance), the distribution
          of X̄ from a large sample can be approximated by a normal distribution. This
          fact is very important since knowing the approximate distribution of X̄ allows
          us to make corresponding approximate probabilistic estimates. For example,
          reasonably good estimates for confidence intervals on X̄ can frequently be given
          even though the underlying probabilistic structure of X is unknown.

Chain of Custody Procedures Procedures for documenting who has custody of and
          the condition of samples from the point of collection to the analysis at the
          laboratory.  Chain of custody procedures are used to insure that the samples are
          not lost, tampered with, or improperly stored or handled.

Clean Attains the cleanup standard. That is, a judgment has been made that the site has
          been cleaned or processed to the point that the attainment objectives, as
          defined above, have been met.

Cleanup Standard (Cs) The criterion set by EPA against which the measured
          concentrations are compared to determine whether the ground water at the
          Superfund site is acceptable or not (Sections 2.2.4 and 3.4). For example, the
          Cs might be set at 5 parts per million (5 ppm) for a site chemical. Hence, any
          water that tests out at greater than 5 ppm is not acceptable.
Coefficient of Determination (R²) A descriptive statistic, R² = 1 - SSE/Syy and 0 ≤ R² ≤ 1,
          that provides a rough measure of the overall fit of the model. A perfect fit,
          i.e., all of the observed data points fall on the fitted regression line, would be
          indicated by an R² equal to 1. Low values of R² can indicate either a relatively
          poor fit of the model or no relationship between the concentration levels and
          time. R² is just the square of the well-known correlation coefficient. For more
          information, see any standard textbook.
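
As an illustration, R² for a simple linear regression of concentration on time can be computed as in the following sketch, which assumes SSE is the sum of squared residuals about the fitted line and Syy is the sum of squared deviations of y about its mean. The function name and data are made up.

```python
def coefficient_of_determination(x, y):
    """R^2 = 1 - SSE/Syy for a least-squares straight line fitted to (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("x and y must each take more than one distinct value")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - sse / syy


# Made-up concentrations at four sampling times.
r_squared = coefficient_of_determination([1, 2, 3, 4], [2, 4, 5, 4])
```

For these values R² is about 0.52, which equals the square of the correlation coefficient between x and y.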

Coefficient of Variation (cv) The ratio of the standard deviation to the mean (σ/μ) for a
          set of data or distribution. For data which can only have positive values, such
          as concentration measurements, the coefficient of variation provides a crude
          measure of skewness. Data with larger cv's usually are more skewed to the
          right. The cv provides a relative measure of variation (i.e., relative with respect
          to the mean). As such, it can be used as a rough measure of precision. It is
          useful to know if the cv is relatively constant over the range of the variable of
          interest.
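
A sample cv can be computed as in this brief sketch, which uses the sample standard deviation and sample mean as estimates. The function name and data are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    mean = statistics.fmean(values)
    if mean <= 0.0:
        raise ValueError("the cv is only meaningful for data with a positive mean")
    return statistics.stdev(values) / mean


# Made-up concentration measurements.
cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
```

The example data have a mean of 5 and a cv of roughly 0.43.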

Comparison-wise Alpha For an individual statistical decision on one compound or
          well, the maximum probability of a false positive decision.

Compositing Physically mixing several samples into one larger sample, called a
          composite sample. Then either the entire composite is measured or one or more
          random subsamples from the composite are measured. Generally the individual
          samples  which are composited must be the  same size or volume, and the
          composite sample must be  completely mixed. Composite samples  can be useful
          for estimating  the mean concentration.  If appropriate,  compositing  can result in
          substantial savings where the cost of analyzing individual samples is high.

Confidence Interval A sample-based estimate of a population parameter which is
          expressed as a range or interval of values which will include the true parameter
          value with a known probability or confidence. For example, instead of giving
          an estimate of the population mean, say x̄ = 15.3, we can give a 95 percent
          confidence interval, say [x̄-3, x̄+3] or [12.3 to 18.3], that we are 95 percent
          confident contains the population mean.

Confidence  Level The degree of confidence associated with an  interval  estimate. For
          example, with a 95 percent confidence interval, we would be 95 percent certain
          that the interval contains the true value being estimated.  By this, we mean that
          95 percent of independent 95 percent confidence intervals will contain  the
          population mean. In the context of a statistical test,  the confidence level  is equal
          to 1 minus the Type I error (false positive rate). In this case, the confidence
          level represents the probability of correctly concluding that the null hypothesis
          is true.

Conservative Test A statistical test for which the Type I error rate (false positive rate) is
           actually less than that specified for the test.  For a conservative test there will be
           a greater tendency to accept the null hypothesis when it is not true than for a
           non-conservative test. In the context of this volume, a conservative test errs on
           the side of protecting the public health.  That is to say, the probability of
           making the mistake (i.e., error) of wrongly deciding that the site is clean will
           be less than the stated Type I error rate.

Contaminated A site is called contaminated if it does not attain the cleanup standards. In
           other words, the contamination level on the site is higher than that allowed by
           the  cleanup standard.

Degrees of Freedom (Df) The degrees of freedom of an estimate of variance,  standard
           deviation, or standard error is a measure of the amount of information on which
           the estimate is based or the precision of the estimate. Usually, high degrees of
           freedom are associated with a large sample size and a corresponding increase in
           accuracy of an estimation.

Dependent Variable ($y_i$) An outcome whose variation is explained by the influence of
           independent variables. For  example, the contamination level in ground water
           (i.e., the dependent variable y) may depend on the distance  (i.e., the
           independent  variable x) from the site incinerator.

Detection Limit The level below which concentration measurements cannot be reliably
           determined (see Section 2.3.7).  Technically, the lowest concentration of a
           specified contaminant which  is  unlikely to be obtained when analyzing  a  sample
           with none of the contaminant.

Distribution The frequencies (either relative or absolute) with  which measurements in a
           data set fall within specified classes. A graphical display of a distribution is
           referred to as a histogram. Formally, a distribution is defined in terms of the
           underlying probability function. For example, the distribution of x, say $F_x(t)$,
           may be defined as the probability that x is less than t (i.e., $P(x < t)$).
-------
                           APPENDIX G: GLOSSARY
          test is "statistically" large then the decision rule is to declare that we do not
          believe that serial correlation is present If  
-------
                           APPENDIX G: GLOSSARY
Independent Variable ($x_i$) The characteristic being observed or measured that is
          hypothesized to influence an event (the dependent variable) within the defined
          area of relationships under  study. The  independent  variable is not influenced  by
          the event but may cause it or contribute to its variation.

Inference The process of generalizing (extrapolating) results from a sample to a larger
          population. More generally, statistical inference is the art of evaluating
          information (such as samples) in order to draw reliable conclusions about the
          phenomena under study. This usually means drawing conclusions about the
          distribution of some variable.

Interquartile Range The difference between the 75th and 25th percentiles of the
          distribution.
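
          A minimal sketch of the calculation, using Python's standard library. Several
          conventions exist for estimating percentiles from a sample; the "inclusive"
          interpolation method used here is one choice, assumed for illustration, and is
          not prescribed by this guidance.

```python
from statistics import quantiles


def interquartile_range(data):
    """Difference between the 75th and 25th percentiles of the data."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1


iqr = interquartile_range([1, 2, 3, 4, 5, 6, 7, 8])   # 6.25 - 2.75 = 3.5
```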

Judgment Sample A sample of data selected according to non-probabilistic methods;
          usually based on expert judgment.

Kriging Kriging is the name given to the least squares prediction of spatial processes. It
          is a form of curve fitting using a variety of techniques from regression and time
          series. Statistically, kriging is best linear unbiased estimation using generalized
          least squares. This statistical technique can be  used to model the contours of
          water and contaminant levels across wells at given points in  time (see Chapter 7
          of this guidance and Volume I, Chapter 10). Kriging is not appropriate for
          assessing attainment in ground water.

Laboratory Error See measurement error.

Lag 1 Serial Correlation See serial correlation.

Least Squares Estimates This is a  common estimation  technique. In regression, the
          purpose is to find estimates for the  regression curve fit. The estimates are
          chosen so that the regression curve is  "close" to the plotted sample data in the
          sense that the square of their distances is minimized (i.e., the least). For
          example, the estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ of the y-intercept $\beta_0$ and the slope $\beta_1$ are least
          squares estimates (see Section 6.1.2).
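
          For a straight-line model, the least squares estimates have a well-known closed
          form. The sketch below applies it to hypothetical monthly concentration data;
          the function and variable names are illustrative only.

```python
def least_squares_line(x, y):
    """Intercept and slope minimizing the sum of squared vertical distances."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: sampling month and measured concentration.
months = [1, 2, 3, 4, 5]
concentrations = [10.0, 8.1, 5.9, 4.2, 1.8]
intercept, slope = least_squares_line(months, concentrations)
# intercept is about 12.09 and slope is about -2.03
```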

Less-than-Detection Limit A concentration value that is reported to be below the
          detection limit with no measured concentration provided by the lab. It is
          generally recommended that these values be included in the analysis as values at
          the detection limit.

Lognormal  Distribution A family  of positive-valued,  skewed distributions commonly
          used in environmental work. See Gilbert (1987) for a detailed discussion of
          lognormal distributions.

Mean  The arithmetic average of a set of data values. Specifically, the mean of a data set,
          $x_1, x_2, \ldots, x_n$, is defined by $\bar{x} = \sum_{i=1}^{n} x_i / n$.

Mean Square Error  (MSE) The  sum  of squares due to error divided by the
          appropriate degrees of freedom which provides an estimate of the variance
          about the regression.

Measurement Error Error or variation in laboratory measurements resulting from
          unknown factors  in  the handling and  laboratory  analysis  procedures.

Median The value which separates the lowest 50 percent of the observations from the
          upper 50 percent of the observations.  Equivalently, the "middle" value of a set
          of data, after the values have been arranged in ascending order.  If the number
          of data points is even, the median is defined to be the average of the two middle
          values.

Mode The value with the greatest probability,  i.e., the value which occurs more often
          than any other.

Model  A mathematical description of the process or phenomenon by which the data are
          generated  and  collected.

Non-Central t-Distribution Similar to the t-distribution with the exception that the
          numerator is a normal variate with  mean equal to something other than zero (see
          also t-distribution).

Nonparametric Test A test based  on relatively few assumptions about the underlying
          process generating the data. In particular, no assumptions are made about the
          exact form of the underlying probability distribution. As a consequence,
          nonparametric tests are valid for a fairly broad class of distributions.

Normal Distribution A family of "bell-shaped" distributions described by the mean and
           variance, $\mu$ and $\sigma^2$. Refer to a statistical text (e.g., Sokal and Rohlf, 1973) for a
           formal definition. The standard normal distribution has $\mu = 0$ and $\sigma^2 = 1$.

Normal Probability Plot A plot of the ordered residuals against their expected values
           under normality (see Section 5.6.2).

Normality See normal distribution (see also Section 5.6).

Null Hypothesis  See hypothesis.

Outlier Measurements that are (1) very large or small relative to the rest of the data, or (2)
           suspected of being unrepresentative of the true concentration at the sample
           location.

Overall Alpha When multiple chemicals or wells are being assessed, the probability that
           all chemicals in all wells are judged to attain the cleanup standard when  in
           reality,  the concentrations for at least one well or chemical  do not attain the
           cleanup  standard.

Parameter A statistical property or characteristic of a population of values.  Statistical
           quantities such as means, standard deviations, percentiles, etc. are parameters if
           they refer to a population of values, rather than to a sample of values.

Parameters of the Model See regression coefficients.

Parametric Test A test based on assumptions about the underlying process generating
           the data. For example,  most parametric tests assume that the underlying  data
           are normally distributed. Although parametric tests are strictly not valid unless
           the underlying assumptions are met, in many cases parametric tests perform
           well over  a range of conditions found in  the  field. In particular, with
           reasonably large sample  sizes  the distribution of the  mean will be approximately
           normal. See robust test, and Central Limit Theorem.

Percentile The specific value of a distribution that divides the set of measurements in
           such a way that P percent of the measurements fall below (or equal) this value,
           and (100 - P) percent of the measurements exceed this value. For specificity, a
           percentile is described by the value of P (expressed as a percentage). For
           example, the 95th percentile (P = 95) is that value X such that 95 percent of the
           data have values less than or equal to X, and 5 percent have values exceeding X.
           By definition, the median is the 50th percentile.

Physical Sample A portion of ground water collected from a well at the waste site and
          used to make measurements. This may also be  called a water sample. A
          water sample may be mixed, subsampled, or otherwise handled to obtain the lab
          sample of ground water which is sent for laboratory analysis.

Point Estimate See estimate.

Population The totality of ground water  samples in a well for which inferences
          regarding attainment of cleanup standards  are to be  made.

Population Mean Concentration The concentration which is the arithmetic average
          for the totality of ground water units (see  also mean and population).

Population Parameters See parameter.

Power  The probability that a statistical test will result in rejecting the null hypothesis
          when the null hypothesis is false.  Power = $1 - \beta$, where $\beta$ is the Type II error
          rate associated with the  test.  The term "power  function" is more accurate
          because it reflects the fact that power is a function of a particular value of the
          parameter of interest under the alternative hypothesis.
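
          The idea of a power function can be sketched for a simple case. The example
          below assumes a one-sided z-test with a known standard deviation, where the
          null hypothesis is that the mean is at or above the cleanup standard (the well
          is contaminated). This simplified test and all numerical values are assumptions
          for illustration; they are not the procedures given in the body of this guidance.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(true_mean, cleanup_standard, sigma, n, alpha=0.05):
    """Power of a z-test of H0: mean >= cleanup_standard vs. H1: mean < cleanup_standard."""
    std_norm = NormalDist()
    std_err = sigma / sqrt(n)
    z_crit = std_norm.inv_cdf(1.0 - alpha)
    # H0 is rejected when (x_bar - cleanup_standard) / std_err < -z_crit.
    return std_norm.cdf((cleanup_standard - true_mean) / std_err - z_crit)


# Power evaluated at several hypothetical true means (cleanup standard = 5.0).
power_curve = {mu: power_one_sided_z(mu, 5.0, sigma=2.0, n=16)
               for mu in (5.0, 4.5, 4.0, 3.5, 3.0)}
```

          At a true mean equal to the cleanup standard the power equals alpha, and it
          rises toward 1 as the true mean falls further below the standard.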

Precision Precision refers to the degree to which repeated measurements are similar to
          one another. It measures the agreement (reproducibility) among individual
          measurements, obtained under prescribed similar conditions. Measurements
          which are precise are in  close agreement. To use an analogy from archery,
          precise archers have all of their arrows land very close together. However, the
          arrows of a precise archer  may or may not land on  (or even near) the bull's-eye.

Predicted Value In regression analysis, the calculated value of $y_i$, under the estimated
          regression line, for a particular value of $x_i$.

Proportion The number of ground water samples in a set of ground water samples that
          have a specified characteristic, divided by the total number of ground water
          samples  in the set.

Random Error ($\epsilon_i$)  Represents  "random" fluctuations of the  observed chemical
          measurements around the hypothesized mean or regression model.

Random Sample A sample of ground water units selected using the simple random
          sampling procedures described in Section 4.1.

Range The difference between the maximum and minimum values of measurements in a
          data set.

Regression Analysis The process of finding the "best"  mathematical model (within
          some restricted class of models) to describe the dependent variable, $y_i$, as a
          function of the independent variable, $x_i$, or to predict $y_i$ from $x_i$. The most
          common form is the linear model.

Regression Coefficients  The constants $\beta_0$ and $\beta_1$ in the simple linear regression
          model which represent the y-intercept and slope of the model.

Residual  In  regression analysis, the difference between the observed value of the
          concentration measurement $y_i$ and the corresponding fitted (predicted) value, $\hat{y}_i$,
          from the estimated regression line.

Response Variable  See dependent variable.

Robust Test A statistical test which is approximately valid under a wide range of
          conditions.

Sample Any collection of ground water samples taken from a well.

Sample Design The procedures used to select the ground water samples.

Sample Mean See mean.

Sample Residual See residual.

Sample  Size The number of lab samples (i.e., the size of the statistical sample). Thus, a
          sample of size 10  consists  of the measurements taken  on 10 ground water
          samples or composite samples.

Sample Standard Deviation See standard  deviation.

Sample Statistics Numerical quantities which summarize the properties of a data set.

Sampling Error Variability in sample statistics between different samples that is used to
          characterize the precision of sample-based estimates.

Sampling Frequency (n) The number of samples to be taken per year or seasonal
          period.

Sampling Plan See sample design.

Sampling Variability See sampling  error.

Sequential Test A statistical test in which the decision to accept or reject the null
          hypothesis is made in a sequential fashion. Sequential tests are described in
          Chapters 4, 8, and 9 of this manual.

Serial Correlation A measure of the extent to which successive observations are
          related.
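
          One common way to estimate the lag 1 serial correlation is sketched below. The
          particular estimator (products of successive deviations from the mean, divided
          by the sum of squared deviations) is an assumption for illustration; the
          estimator used in the body of this guidance may differ in detail.

```python
def lag1_serial_correlation(values):
    """Estimate the correlation between successive observations."""
    n = len(values)
    center = sum(values) / n
    deviations = [v - center for v in values]
    numerator = sum(deviations[t] * deviations[t - 1] for t in range(1, n))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator


# A steadily increasing series shows positive serial correlation.
r_trend = lag1_serial_correlation([1.0, 2.0, 3.0, 4.0, 5.0])        # 0.4

# A series that alternates high and low shows negative serial correlation.
r_alternating = lag1_serial_correlation([1.0, 5.0, 1.0, 5.0, 1.0, 5.0])
```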

Significance Level The probability of a Type I error associated with a statistical test.
          In the context of the statistical tests presented in this manual, it  is the probability
          that the ground water from a well or group of wells is  declared to  be clean when
          it is contaminated. The significance level is often denoted by the symbol $\alpha$
          (Greek letter  alpha).

Simple Linear Regression A regression analysis where there is only one independent
          variable and the equation for the model is of the form $y_i = \beta_0 + \beta_1 x_i$, where $\beta_0$
          is the intercept and $\beta_1$ is the slope of the regression (see Section 6.1).

Simple Linear Regression Model  A linear model relating the concentration
          measurements (or  some other parameter) to  time (see  Section 6.1).

Size of the Physical Sample The  volume of a physical ground water sample.

Skewness A measure of the extent to which a distribution is symmetric or asymmetric.

Skewed Distribution Any asymmetric distribution.

Standard Deviation  A measure of dispersion of a set of data. Specifically, given a set
          of measurements, $x_1, x_2, \ldots, x_n$, the standard deviation is defined to be the
          quantity $s = \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1)}$, where $\bar{x}$ is the sample mean.
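
          A short sketch of the calculation follows. It assumes the usual sample form
          with an n - 1 divisor and compares the result with the standard library's
          implementation; the data values are hypothetical.

```python
from math import sqrt
from statistics import stdev


def sample_standard_deviation(data):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s = sample_standard_deviation(data)      # sqrt(32 / 7), about 2.14
s_library = stdev(data)                  # same quantity from the standard library
```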

Standard Error A measure of the variability (or precision) of a sample estimate.
          Standard  errors are often used to construct  confidence intervals.

Statistical Sample A collection of chemical concentration measurements reported by the
          lab for one or more lab samples where the lab samples were collected using
          statistical sampling methods. Collection of a statistical sample allows estimation
          of precision  and confidence intervals.

Statistical Test A formal statistical procedure and decision rule for deciding whether the
          ground water  in a well attains the specified cleanup standard.

Steady State A state at  which the residual effects of the treatment process (or any other
          temporary intervention) on general ground water characteristics appear to be
          negligible (see Section 7.1).

Sum of Squares Due to Error (SSE) A measure of how well the model fits the data
          necessary for assessing the adequacy of the model.  If the SSE is small, the fit
          is good; if it is large, the fit is poor.
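
          The sketch below computes the SSE for a given fitted straight line, together
          with the corresponding mean square error. It assumes a simple linear regression,
          so that the error degrees of freedom are n - 2; the data and coefficients are
          hypothetical.

```python
def sum_of_squares_error(x, y, intercept, slope):
    """Sum of squared residuals (observed minus fitted values)."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def mean_square_error(x, y, intercept, slope):
    """SSE divided by n - 2, since two coefficients are estimated."""
    return sum_of_squares_error(x, y, intercept, slope) / (len(x) - 2)


x = [1, 2, 3, 4]
y = [2.0, 4.5, 5.5, 8.0]
sse = sum_of_squares_error(x, y, intercept=0.0, slope=2.0)   # residuals 0, 0.5, -0.5, 0
mse = mean_square_error(x, y, intercept=0.0, slope=2.0)
```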

Symmetric Distribution A distribution of measurements for which the two sides of its
          overall shape  are mirror images of each other about a center line.

Systematic  Sample  Ground water samples  that are collected at equally-spaced intervals
          of time.

t-Distribution The distribution of a quotient of independent random variables, the
          numerator of which  is a standardized normal variate with mean equal to zero
          and variance equal to one, and the denominator of which is the positive square
          root of the quotient of a chi-square distributed variate and its  number of degrees
          of freedom.  For additional details about the t-distribution, consult Resnikoff
          and Lieberman (1957) and Locks, Alexander, and Byars (1943).
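
          The construction in this definition can be illustrated by simulation. The sketch
          below builds each draw as a standard normal variate divided by the square root of
          a chi-square variate over its degrees of freedom, generating the chi-square
          variate as a sum of squared standard normals. The degrees of freedom, number of
          draws, and seed are arbitrary illustrative choices.

```python
import random
from math import sqrt


def draw_t(df, rng):
    """One draw from a t-distribution with df degrees of freedom."""
    z = rng.gauss(0.0, 1.0)                                       # numerator
    chi_square = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / sqrt(chi_square / df)                              # ratio form


rng = random.Random(0)
df = 10
draws = [draw_t(df, rng) for _ in range(20000)]

# The distribution is symmetric about zero, so the mean square estimates the
# variance, which is df / (df - 2) = 1.25 in theory for df = 10.
simulated_variance = sum(d * d for d in draws) / len(draws)
```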

Tolerance Interval A confidence interval around a percentile of a distribution of
          concentrations.

Transformation  A manipulation of either the dependent or independent variable, or
          both, to normalize a distribution or linearize a model. Useful transformations
          include  logarithmic,  inverse,  square root,  etc.

Trends A general increase or decrease  in  concentrations  over time  which is persistent and
          unlikely to be due to random variation.

True Population Mean The actual, unknown arithmetic average contaminant level for
          all ground water samples in  the population (see also mean and population).

Type I Error The error made when the ground water in a well is declared to be clean
          based on a statistical test when it is actually contaminated. This is also referred
          to as a false positive.

Type II  Error The error made when the ground water in a well is declared to be
          contaminated  when it is actually clean. This  is also referred to as a false
          negative.

Variance The square of the standard deviation.

Waste Site The entire area being investigated for contamination.

Z Value Percentage point of a standard normal distribution. Z values are tabulated in
          Table A.2 of Appendix A.
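
          Where a table is not at hand, Z values can also be computed. The sketch below
          uses Python's standard library and assumes the value wanted is the point with a
          given upper-tail probability; the layout of Table A.2 is not reproduced here.

```python
from statistics import NormalDist


def z_value(upper_tail_probability):
    """Point on the standard normal distribution with the given upper-tail area."""
    return NormalDist().inv_cdf(1.0 - upper_tail_probability)


z_05 = z_value(0.05)     # about 1.645
z_025 = z_value(0.025)   # about 1.960
```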

-------