EPA
United States
Environmental Protection
Agency
                          Office of Water
                          Washington, DC 20460
EPA-821-B-01-014
December 2001
         Statistical Analysis of Abandoned
         Mine Drainage in the Assessment of
         Pollution Load ("The Griffiths Report")

-------

-------
                                   EPA-821-B-01-014
       Statistical Analysis of
Abandoned Mine Drainage in the

  Assessment of Pollution Load

        ("The Griffiths Report")



             Prepared for:

     U.S. Environmental Protection Agency
              Office of Water
       Office of Science and Technology
       Engineering and Analysis Division

              Prepared by:

             John C. Griffiths
    Roger J. Hornberger, Pennsylvania DEP
        Ken Miller, DynCorp I & ET
     Michael W. Smith, Pennsylvania DEP
            December 2001

-------

-------
                                    DEDICATION

   John C. Griffiths was one of the early leaders in the use of statistics in the geological
sciences. As a testament to his world-class stature, he was the first recipient, in 1977, of the
International Association of Mathematical Geology's William C. Krumbein Award, named
after one of his contemporaries. Griffiths, Krumbein, Felix Chayes and a few others introduced
geologists and geological students to statistical methods in sampling, experimental design,
petrology, mineralogy, sedimentology, stratigraphy and other aspects of the geosciences
throughout the 1950's, 1960's and 1970's, as documented in approximately 100 scientific papers
and several textbooks.

   John Griffiths was born on February 29, 1912 in Wales. He earned 3  degrees from the
University of Wales including a PhD in 1937 in glacial geology and petrography, a Diploma of
Imperial College at the Royal College of Science in London, and a second PhD from the
University of London in 1940. He was employed as a research petrographer on oil well drilling
projects from 1940 to 1947 in Trinidad, where he was married on July  26, 1941. He was a
professor in geosciences at the Pennsylvania State University from 1947  to his retirement in
1977, and thereafter was a Professor Emeritus until  his death on June 2, 1992 in State College,
PA. The day before his death at age 80, he was conducting research in  the Earth and Mineral
Sciences Library at Penn State. During his many years as a professor, he served as the Head of
the Department of Mineralogy (and Geochemistry) from 1955 to 1966, and as the Director of
Planning Research for the entire University from 1969 to 1971.

   Dr. Griffiths was an excellent teacher who instilled scientific rigor  and an appreciation for
proper sampling and the use  of statistics in the minds of many students. While at Pennsylvania
State University, he taught univariate statistics, bivariate statistics, and multivariate statistics;
periodically, he also taught a course in time series analysis. New graduate students, relying upon
the foundation of their undergraduate studies, would be confronted by  this feisty Welshman,
armed with more than 20 years of data on a local stratified gravel deposit from previous classes,
saying things like "You call yourselves geologists; you can't even tell  me how many layers there
are in this gravel deposit." Students soon learned that Dr. Griffiths was challenging them to use
statistical analysis as a guide to the unknown in a scientific method for solving problems  in the
geosciences.

   J.C. Griffiths approached teaching, research and much of life in general, with a blend of
humor, history, and lessons learned from other sciences, observations from current events, and a
strong foundation of scientific rigor and ethics. With the advances in computer science in the
1950's and 1960's, Griffiths  expanded his areas of interest into related fields of computer
modeling, operations research and cybernetics. In the 1960's and 1970's, he proposed drilling
the entire United States on a  20-mile grid spacing, wherein approximately 7500 drill holes each
10,000 to 15,000 feet deep would almost certainly result in the discovery of billions of dollars
worth of oil, gold, uranium, zinc, copper and other valuable minerals overlooked by
conventional "hit-and-miss"  type of exploration. In the early  days of research on the correlation
between cigarette smoking and the incidence of lung cancer, Griffiths was requested to meet
with a famous  statistical researcher for dinner the evening before his cancer research speech at
the University. Griffiths was a smoker at that time, and his recollection of the evening was, "I
took one look at that man's statistics and I knew that I had 2 choices: I either had to give up
cigarettes or give up statistics."

   Following his retirement from the full time faculty in 1977, J. C. Griffiths worked with the
U. S. Geological Survey in Reston VA and continued his research with graduate students on
quantifying the geology of the world by country for mineral resource assessment purposes. He
served as a consultant to DER (now DEP) and EPA from 1984 to 1988 on a cooperative project
to support development of Pennsylvania's Coal Remining regulatory package.
      Beyond his many professional accomplishments, John C. Griffiths was a great
      person. This document was prepared in his honor and with great respect for
      his accomplishments as a geostatistician, a teacher, and a major contributor to
      our understanding of sedimentary and geochemical processes.

-------
                                 DISCLAIMER

The statements in this document are intended solely as guidance. This document is not intended,
nor can it be relied upon, to create any rights enforceable by any party in litigation with the
United States. EPA may decide to follow the guidance provided in this document, or to act at
variance with the guidance, based on its analysis of the specific facts presented. This guidance is
being issued in connection with amendments to the Coal Mining Point Source Category.

-------

-------
                          TABLE OF CONTENTS
LIST OF FIGURES 	i
LIST OF TABLES	 vii
Chapter 1    Introduction	  1-1

Chapter 2    Statistical Analysis of Mine Drainage Data 	  2-1

Chapter 3    Mine Drainage Analysis Algorithm 	  3-1

Chapter 4    Analysis of Data from the Arnot Site	  4-1

Chapter 5    Analysis of Data from the Clarion Site  	  5-1

Chapter 6    Analysis of Data from the Ernest Site  	  6-1

Chapter 7    Analysis of Data from the Fisher Site	  7-1

Chapter 8    Analysis of Data from the Markson Site	  8-1

Chapter 9    Statistical Summary and Review of Quality Control Limits 	  9-1
APPENDIX A: Hamilton Discharge Data  	A-1
APPENDIX B: Arnot Discharge Data	B-1
APPENDIX C: Clarion Discharge Data	C-1
APPENDIX D: Ernest Discharge Data	D-1
APPENDIX E: Fisher Discharge Data  	E-1
APPENDIX F: Markson Discharge Data	 F-1
REFERENCES  	R-1

-------

-------
                       Statistical Analysis of Abandoned Mine Drainage in the Assessment of Pollution Load


LIST OF FIGURES
                                                                                Page

Chapter 1.0  Introduction

       Figure 1.1:    Map of Pennsylvania Counties and Mine Sites  	  1-3


Chapter 2.0  Statistical Analysis of Mine Drainage Data

       Figure 2.1:    Example of Acid Load Variation Before, During, and After
                    Remediation  	  2-1
       Figure 2.2:    Example of Normal Distribution	  2-5
       Figure 2.3:    Example of a Quality Control Graph	  2-6
       Figure 2.4:    Stem-and-leaf of Discharge 	  2-7
       Figure 2.5:    Net Alkalinity Boxplot for Fisher Mine Site Discharge	  2-9


Chapter 3.0  Mine Drainage Data Analysis Algorithm

       Figure 3.1:    Algorithm for Analysis of Mine Drainage Discharge Data	3-3


Chapter 4.0  Analysis of Data from the Arnot Site

       Figure 4.1:    Map of Arnot Site	  4-2
       Figure 4.2:    Log Flow vs.  Time (Arnot 001, 003, and 004)	  4-3
       Figure 4.3:    Stem-and-leaf of Sulfate 	  4-9
       Figure 4.4:    Stem-and-leaf of Discharge	  4-9
       Figure 4.5:    Stem-and-leaf of pH	  4-10
       Figure 4.6:    Stem-and-leaf of Sulfate	  4-10
       Figure 4.7:    Stem-and-leaf of Acidity	  4-11
       Figure 4.8:    Plot of Manganese vs. Log Flow	  4-11
       Figure 4.9:    Plot of Acidity vs. Flow  	  4-12
       Figure 4.10:   Plot of Manganese vs. Flow	  4-12
       Figure 4.11a:  Plot of Discharge vs. Time 	  4-14
       Figure 4.11b:  Plot of pH vs. Time  	  4-14
       Figure 4.11c:  Plot of Acidity vs. Time  	  4-15
       Figure 4.11d:  Plot of Total Iron vs. Time 	  4-15
       Figure 4.11e:  Plot of Sulfate vs. Time	  4-15
       Figure 4.11f:  Plot of Aluminum vs. Time	  4-16
       Figure 4.12a:  Autocorrelation Function of Discharge	  4-19
       Figure 4.12b:  Autocorrelation Function of Calcium 	  4-19
       Figure 4.12c:  Autocorrelation Function of Aluminum 	  4-20

       Figure 4.12d: Autocorrelation Function of Total Iron	 4-20
       Figure 4.13a: Autocorrelation Function of pH	 4-21
       Figure 4.13b: Partial Autocorrelation Function of pH	 4-22
       Figure 4.13c: Autocorrelation Function of Log Discharge	 4-22
       Figure 4.13d: Partial Autocorrelation Function of Log Discharge  	 4-23
       Figure 4.13e: Autocorrelation Function of Ferric Iron 	4-23
       Figure 4.13f: Partial Autocorrelation Function of Ferric Iron	 4-24
Chapter 5.0  Analysis of Data from the Clarion Site

       Figure 5.1:    Map of Clarion Mine Site	 5-2
       Figure 5.2:    Stem-and-leaf of Sulfate  	 5-4
       Figure 5.3a:   Stem-and-leaf of Discharge  	 5-5
       Figure 5.3b:   Stem-and-Leaf of Log Discharge	 5-5
       Figure 5.4a:   Stem-and-leaf of Acid	 5-6
       Figure 5.4b:   Stem-and-Leaf of Log Acid  	 5-6
       Figure 5.5a:   Cross Correlation Function for pH and Discharge  	 5-8
       Figure 5.5b:   Cross Correlation Function for Acidity and Discharge	 5-9
       Figure 5.5c:   Cross Correlation between Sulfate and Discharge  	 5-9
       Figure 5.6a:   Plot of pH vs. Acidity  	 5-10
       Figure 5.6b:   Plot of Acidity vs. Sulfate	 5-10
       Figure 5.6c:   Plot of Iron vs. Sulfate	 5-11
       Figure 5.7a:   Plot of pH vs. Time  	 5-12
       Figure 5.7b:   Plot of Acidity vs. Time  	 5-13
       Figure 5.7c:   Plot of Iron vs. Time  	 5-13
       Figure 5.7d:   Plot of Sulfate vs. Time	 5-13
       Figure 5.8a:   Autocorrelation Function of pH	 5-14
       Figure 5.8b:   Autocorrelation Function of Acid  	 5-15
       Figure 5.8c:   Autocorrelation Function of Total Iron	 5-15
       Figure 5.8d:   Autocorrelation Function of Sulfate 	 5-16
       Figure 5.9:    Projections of Sulfate Data	 5-18
Chapter 6.0  Analysis of Data from the Ernest Site

       Figure 6.0:    Map of Ernest Site	 6-2
       Figure 6.1a:   Histogram of pH (n=174) 	 6-5
       Figure 6.1b:   Histogram of Acid Load (n=174)  	 6-5
       Figure 6.1c:   Histogram of Log Acid Load (n=174)	 6-5
       Figure 6.1d:   Histogram of Total Iron (n=174)	 6-6
       Figure 6.1e:   Histogram of Log Total Iron (n=174)	 6-6
       Figure 6.1f:   Histogram of SO4 (n=174)  	 6-6
       Figure 6.1g:   Histogram of Log SO4 (n=174) 	 6-7
       Figure 6.2a:   Cross Correlation Function of pH vs. Flow 	 6-9

       Figure 6.2b:   Cross Correlation Function of pH vs. Log Acid	  6-9
       Figure 6.2c:   Cross Correlation Function of pH vs. Log Acid Load 	  6-10
       Figure 6.2d:   Cross Correlation Function of pH vs. Log Iron  	6-10
       Figure 6.3a:   Plot of pH vs. Time (Days)	  6-11
       Figure 6.3b:   Plot of Log Flow vs. Time (Days)	  6-11
       Figure 6.3c:   Plot of Log Acid vs. Time (Days)  	  6-12
       Figure 6.4:    Plot of Acid vs. Acid Load	  6-12
       Figure 6.5a:   Plot of Iron Load vs. Acid Loading	  6-13
       Figure 6.5b:   Plot of Acid Load vs. Sulfate Loading  	  6-13
       Figure 6.5c:   Plot of Iron vs. Sulfate Loading	  6-14
       Figure 6.6a:   Bivariate Plot of Log Acidity vs. Log Flow	  6-14
       Figure 6.6b:   Bivariate Plot of Log Sulfate vs. Log Flow  	  6-15
       Figure 6.6c:   Bivariate Plot of Log Acidity vs. Log Acid Load 	  6-15
       Figure 6.6d:   Bivariate Plot of Log Sulfate vs. Log Acid  	  6-16
       Figure 6.7a:   Time Series Plot of pH	  6-17
       Figure 6.7b:   Time Series Plot of Flow	  6-17
       Figure 6.7c:   Time Series Plot of Acidity	  6-18
       Figure 6.7d:   Time Series Plot of Iron Load 	  6-18
       Figure 6.7e:   Time Series Plot of Acid Load	  6-19
       Figure 6.7f:   Time Series Plot of Sulfate Load	  6-19
       Figure 6.8a:   Autocorrelation Function of pH	  6-22
       Figure 6.8b:   Autocorrelation Function of Iron	  6-22
       Figure 6.8c:   Autocorrelation Function of Flow	  6-23
       Figure 6.8d:   Partial Autocorrelation Function of Flow	  6-23
       Figure 6.8e:   Autocorrelation Function of Acidity	  6-24
       Figure 6.8f:   Partial Autocorrelation Function of Acid	  6-24
       Figure 6.8g:   Autocorrelation Function of Acid Load  	  6-25
       Figure 6.8h:   Partial Autocorrelation Function of Acid Load	  6-25
Chapter 7.0  Analysis of Data from the Fisher Site

       Figure 7.0:    Map of Fisher Site  	  7-2
       Figure 7.1a:   Histogram of Log Flow	  7-6
       Figure 7.1b:   Histogram of Log Acid  	  7-6
       Figure 7.1c:   Histogram of Log Iron	  7-6
       Figure 7.1d:   Histogram of Log Manganese  	  7-7
       Figure 7.1e:   Histogram of Log Aluminum   	  7-7
       Figure 7.2a:   Log Flow vs. Time (Days) 	  7-9
       Figure 7.2b:   Log Acidity vs. Time (Days)  	 7-10
       Figure 7.2c:   Log Iron vs. Time (Days)  	 7-11
       Figure 7.2d:   Log Manganese vs.  Time (Days)	 7-12
       Figure 7.2e:   Log Aluminum vs. Time (Days)  	 7-13
       Figure 7.3:    Plot of Log Iron vs.  Acidity 	 7-17
       Figure 7.4a:   Collection Dates vs. Observation Number (First Differences)	 7-19

       Figure 7.4b:   Plot of Log Flow vs. Time 	 7-19
       Figure 7.4c:   Plot of Log Acidity vs. Time  	 7-20
       Figure 7.4d:   Plot of Log Sulfate vs. Time	 7-20
       Figure 7.4e:   Plot of Log Iron vs. Time 	 7-20
       Figure 7.4f:   Plot of Log Manganese vs. Time	 7-21
       Figure 7.4g:   Plot of Log Aluminum vs. Time  	 7-21
       Figure 7.5a:   Autocorrelation Function of Days	 7-23
       Figure 7.5b:   Partial Autocorrelation Function of Days	 7-23
       Figure 7.5c:   Autocorrelation Function of Flow	 7-23
       Figure 7.5d:   Partial Autocorrelation Function of Flow	 7-24
       Figure 7.5e:   Autocorrelation Function of Acid  	 7-24
       Figure 7.5f:   Partial Autocorrelation Function of Acid	 7-24
       Figure 7.5g:   Autocorrelation Function of Sulfate  	 7-25
       Figure 7.5h:   Partial Autocorrelation Function of Sulfate 	 7-25
       Figure 7.5i:   Autocorrelation Function of Iron	 7-25
       Figure 7.5j:   Partial Autocorrelation Function of Iron	 7-26
       Figure 7.5k:   Autocorrelation Function of Manganese	 7-26
       Figure 7.5l:   Partial Autocorrelation Function of Manganese  	 7-26
       Figure 7.5m:  Autocorrelation Function of Aluminum	 7-27
       Figure 7.5n:   Partial Autocorrelation Function of Aluminum	 7-27
Chapter 8.0  Analysis of Data from the Markson Site

       Figure 8.0:    Map of Markson Site	 8-2
       Figure 8.1a:   Histogram of Interval (n=106)	 8-5
       Figure 8.1b:   Histogram of Log Flow (n=107)  	 8-6
       Figure 8.1c:   Histogram of Log Acidity (n=252)  	 8-6
       Figure 8.1d:   Histogram of Log Acidity (n=107)  	 8-6
       Figure 8.1e:   Histogram of Iron (n=253) 	 8-7
       Figure 8.1f:   Histogram of Manganese (n=249)	 8-7
       Figure 8.1g:   Histogram of Log Aluminum (n=246)  	 8-7
       Figure 8.1h:   Histogram of Sulfate (n=253)   	 8-8
       Figure 8.1i:   Histogram of Log Ferrous Iron (n=241)	 8-8
       Figure 8.2a:   Plot of Sulfate vs. Log Flow	 8-10
       Figure 8.2b:   Plot of Iron vs. Log Flow 	 8-10
       Figure 8.2c:   Plot of Manganese vs. Log Aluminum  	 8-11
       Figure 8.2d:   Plot of Log Aluminum vs. Sulfate	 8-11
       Figure 8.2e:   Plot of Manganese vs. Log Flow	 8-11
       Figure 8.2f:   Plot of Log Acidity vs. Log Flow 	 8-12
       Figure 8.2g:   Plot of Log Acid vs. Iron	8-12

       Figure 8.3a:   Time Series Plot of Flow	 8-14
       Figure 8.3b:   Time Series Plot of Manganese 	 8-15
       Figure 8.3c:   Time Series Plot of Sulfate	 8-16


       Figure 8.3d:  Time Series Plot of Iron  	  8-17
       Figure 8.3e:  Time Series Plot of Aluminum	  8-18
       Figure 8.3f:   Time Series Plot of Acidity	  8-19
       Figure 8.4a:  Autocorrelation Function of Flow	  8-22
       Figure 8.4b:  Autocorrelation Function of Iron	  8-23
       Figure 8.4c:  Partial Autocorrelation Function of Iron	  8-23
       Figure 8.4d:  Autocorrelation Function of Aluminum	  8-24
       Figure 8.4e:  Partial Autocorrelation Function of Aluminum	  8-24
       Figure 8.4f:   Autocorrelation Function of Intervals	  8-25
       Figure 8.4g:  Autocorrelation Function of pH	  8-25
       Figure 8.4h:  Partial Autocorrelation Function of pH	  8-26
       Figure 8.4i:   Autocorrelation Function of Manganese	  8-26
       Figure 8.4j:   Partial Autocorrelation Function of Manganese  	  8-27
       Figure 8.4k:  Autocorrelation Function of Sulfate 	  8-27
       Figure 8.4l:   Partial Autocorrelation Function of Sulfate 	  8-28
Chapter 9.0   Statistical Summary and Review of Quality Control Limits

       Figure 9.1:    Example graph Log Discharge versus Days	  9-5
       Figure 9.2:    Flow Chart for Box-Jenkins Time Series Analysis	  9-9
APPENDIX A:      Hamilton Discharge Data

       Figure A.1:    Map of Hamilton Site	  A-2

-------

LIST OF TABLES


Chapter 1.0  Introduction


Chapter 2.0  Statistical Analysis of Mine Drainage Data


Chapter 3.0  Mine Drainage Data Analysis Algorithm

       Table 3.1:     Summary Statistics for S3CLAR (N=96)	 3-5


Chapter 4.0  Analysis of Data from the Arnot Site

       Table 4.1a:    Summary Statistics for Arnot 001 Data 	 4-5
       Table 4.1b:    Summary Statistics for Arnot 003 Data 	 4-6
       Table 4.1c:    Summary Statistics of Arnot 004 Data 	 4-7
       Table 4.2:     Coefficient of Variation (%)	 4-8
       Table 4.3:     Comparison of Confidence Belts Around Mean and Median 	 4-17
       Table  4.4:    Observations Falling Beyond Confidence Limits of 2 Standard
                    Deviations Around the Mean Beyond the 1.58* H-Spread  	 4-17
       Table 4.5:     Comparison of Total Iron and Ferrous Iron 	 4-21


Chapter 5.0  Analysis of Data from the Clarion Site

       Table 5.1:     Summary Statistics for S3CLAR (n=96) 	 5-3
       Table 5.2:     Summary Statistics for S3CLAR Adjusted Data Deck (n=79) 	 5-7
       Table 5.3:     Summary of Time Series Models for pH, Clarion Mine	 5-17
       Table 5.4:     Summary Statistics for Time Series Models of SO4 from Clarion Site 5-17
       Table 5.5:     Summary of Time Series Models for Ferrous Iron Clarion Site	 5-19
       Table 5.6:     Comparison of Statistics Used to Calculate the QC Limits	 5-20
       Table 5.7:     Comparison of QC Limits (Spreads) around Mean and Median	 5-21
       Table 5.8:     Comparison of QC Limits Around the Median Using Various Forms
                    of Spread	 5-22


Chapter 6.0  Analysis of Data from the Ernest Site

       Table 6.1:     Summary Statistics of Data (n=174)	 6-3
       Table 6.2:     Summary Statistics of Log-transformed Data (n=174)	 6-4
       Table 6.3:     Correlation Coefficients for 9 Parameters (n=174)	 6-8
       Table 6.4:     Base Data for Calculation of Quality Control Limits of Ernest Data	 6-20

       Table 6.5:     Two Measures of Quality Control	  6-20
Chapter 7.0  Analysis of Data from the Fisher Site

       Table 7.1:     Summary Statistics for 79 Log Transformed Observations	  7-3
       Table 7.2a:    Summary statistics for 57 Log Transformed Observations
                    (pre-remining)	  7-4
       Table 7.2b:    Summary statistics for 19 Log Transformed Observations (during
                    remining)	  7-5
       Table 7.3:     Correlations of Zero Order Among the Seven Variables	  7-15
       Table 7.4:     Summary of the Important Cross-Correlations (CCF) Among
                    Seven Variables  	  7-16
       Table 7.5:     Equations of Models Fitted to Variables from the Fisher Deep Mine  7-28


Chapter 8.0  Analysis of Data from the Markson Site

       Table 8.1:     Summary Statistics (n=107)	  8-3
       Table 8.2:     Summary Statistics (n = 253)	  8-4
       Table 8.3:     Correlations Among Variables (n = 107)  	  8-9
       Table 8.4:     Correlations Among Variables ( n = 253)	  8-9
       Table 8.5:     Cross-correlations Among the Variables  	  8-13
       Table 8.6:     Quality Control Limits: X̄ ± 2σ	  8-20
       Table 8.7:     Quality Control Limits: Md ± [1.96(1.25 × H-spread)/(1.35√n)]  	 8-20
       Table 8.8:     Tests of the Different Models for Each Parameter  	  8-29
       Table 8.9:     Model Equations for the  Variables (see Table 8.8)	  8-31


Chapter 9.0  Statistical Summary and Review of Quality Control Limits

       Table 9.1:     Guidance and Protocols for Water Sample Collection	  9-1
       Table 9.2:     Acf of the Residuals from Fitting an MA Model to the Original
                    Observations After Taking a First Difference:  SO4	  9-12
       Table 9.3:     Pacf of the Residuals from Fitting an MA Model to the Original
                    Observations After Taking a First Difference:  SO4	  9-12
       Table 9.4:     Arrangement of Significant Deviations of the Residuals
                    (> 2 Standard Error = 66.02) 	  9-12


Appendix A

       Table A.1:     Alternate Models Fitted to Log Data	A-4
viii

-------

Chapter 1: Introduction

From 1984 through 1988, the U.S. Environmental Protection Agency (EPA) and the
Pennsylvania Department of Environmental Resources (PA DER, now PA DEP) studied the
water quality of long-term pre-existing discharges from abandoned mine lands throughout
Pennsylvania as part of a cooperative project on remining. Water quality data from these
discharges were examined using univariate, bivariate, and time series statistical  analyses to
assess coal mine drainage discharge behavior. The results of the statistical analyses were
included in a series of eight unpublished reports prepared for PA DEP and EPA by Dr. J. C.
Griffiths of the Pennsylvania State University in 1987 and 1988.

This report compiles the work of Dr. Griffiths and his co-authors. It was prepared by PA DEP
and EPA to support the proposal of the Coal Remining Subcategory under existing Coal
Mining industry regulations (40 CFR part 434). This report specifically supports statistical
procedures provided in EPA's Coal Remining Statistical Support Document (EPA-B-001-001),
and is intended to be a companion to that document.  Chapter 1 of the Coal Remining Statistical
Support Document contains a description of the remining program history in Pennsylvania from
1984 to  1999, including the development of the REMINE computer program and permitting
procedures used in issuing approximately 300 remining permits during that time period. Chapter
1 of the  Statistical Support Document also contains the results of an evaluation of state remining
programs in  20 states that was completed by the Interstate Mining Compact Commission
(IMCC).

Several publications described and documented the mining engineering and treatment costing
components of the original cooperative remining project of EPA and PA DER (these are listed and
briefly described in Chapter 1 of the Coal Remining Statistical Support Document), but the
statistical work of Dr. J.C. Griffiths and co-authors was not published or widely disseminated,
and John C. Griffiths died at age 80 in June 1992. This report was compiled, edited, and
completed by his co-authors and DynCorp I & ET. J.C. Griffiths is listed posthumously as the
major author because this document contains his original work and is a tribute to him and his work.

There are several additional connections between this report and the Coal Remining Statistical
Support Document.
•  Chapter 2 of the Coal Remining Statistical Support Document contains descriptions of the
   three fundamental acid mine drainage discharge types and their respective behaviors (flow
   and water quality relationships) that are based on work done in the statistical studies of the
   Arnot, Ernest, and Markson discharges featured in Chapters 4, 6, and 8 of this report.
•  Chapter 5 of the Coal Remining Statistical Support Document includes numerous figures and
   tables depicting various options in baseline pollution load development (e.g., Table 5.1a) that
   are based upon the data sets in Chapters 4 through 8 and Appendices A through F of this
   report.
•  Chapter 5 of the Coal Remining Statistical Support Document contains additional data from
    1988 to 1999 of the Fisher and Markson sites, providing excellent additional information on
   the long term variations in these discharges.
                                                                                    1-1

-------
Chapter 1

The establishment of the baseline pollution load for a coal remining permit requires the proper
sampling and chemical analysis of the abandoned mine drainage discharges, and the appropriate
statistical analysis of the flow, water quality and pollution load data. In this report, the term
"proper sampling" is used in two senses:  (1) following the recommended procedures for
collection of surface- and ground-water samples, (including measurements of flow and water
quality parameters, and fixing,  storing and transporting the samples to the laboratory for
chemical analyses), and (2)  collecting a sufficient number of samples over an adequate duration
and sampling interval in order to be representative of the variations in flow and water quality of
the discharges throughout the year.

Guidelines and protocols for water sample collection from EPA, the U.S. Geological Survey
(USGS), and other sources, are compiled in Table 9.1.  These 14 manuals and related
publications represent some of the most recent technical guidance disseminated by Federal
agencies on water sampling.  Much of this information is founded on common sense and earlier
publications on this subject and includes, for example, recommendations for sampling streams and
major mine discharges at approximately mid-stream and mid-depth to avoid unrepresentative
effects of surface debris, bottom sediments, chemical stratification, or lack of mixing near stream
banks. Water sampling procedures are as important as the laboratory analysis and the statistical
analysis of the discharge data.  If the water sampling procedures are flawed or unrepresentative,
the laboratory analyses, regardless of their accuracy and precision, may be
meaningless. Similarly, the most rigorous statistical analysis may be worthless if it is based
upon faulty laboratory analyses or flawed sampling procedures.

The statistical aspects of proper sampling are summarized in Chapter 9 of this report and are
discussed in numerous other references  including Griffiths (1967) and Griffiths and Ondrick
(1968) concerning the proper sampling of geologic populations. In statistical analyses, it is
always important to work with  samples  that are representative of the population from which they
are drawn (see Chapter 2 of this report).  Since most of the abandoned mine discharges included
in this report flow continuously, there is an almost infinite number of samples that could be
drawn throughout the water year. For example, one sample collected every hour equals 720
samples per 30-day month or 8,760 samples per year. Representative sample collection should be
assessed with regard to practicality, feasibility, and cost.
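
The sampling arithmetic above can be checked with a short calculation. The following Python
sketch is illustrative only: the function name and the daily and weekly schedules are
assumptions added here, and a 30-day month and 365-day year are assumed, as in the hourly
example in the text.

```python
HOURS_PER_DAY = 24


def samples_in_period(interval_hours: int, period_days: int) -> int:
    """Return the number of whole sampling intervals that fit in the period."""
    return (period_days * HOURS_PER_DAY) // interval_hours


# Hypothetical sampling schedules, expressed as hours between samples.
schedules = {"hourly": 1, "daily": 24, "weekly": 24 * 7}

for name, interval in schedules.items():
    per_month = samples_in_period(interval, 30)   # assumed 30-day month
    per_year = samples_in_period(interval, 365)   # assumed 365-day year
    print(f"{name:>7}: {per_month:>4} per month, {per_year:>5} per year")
```

The hourly row reproduces the 720 and 8,760 figures quoted above. The coarser schedules show
how quickly the number of samples, and therefore the sampling cost, falls as the interval grows.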

Chapter 2 of this report provides an introduction to the statistical methods that may be employed
in establishing baseline pollution load, and Chapter 3 describes the data analysis  algorithm that
was developed for evaluating mine drainage discharges (see Figure 1.2a of the Coal Remining
Statistical Support Document and Figure 3.1 of this report). Abandoned mine drainage
discharge data from six sites in Pennsylvania are statistically analyzed and presented in graphs
and tables in Chapters 4 through 8 and Appendices A through F of this report.  The locations of
these  sites are shown in Figure 1.1. More detailed site maps and descriptions of the site
characteristics are included in the beginning of each chapter. Chapter 9 is a summary of the
statistical analyses presented throughout this report, with emphasis on the interpretations of time
series analysis and quality control limits.
1-2

-------
Figure 1.1:   Map of Pennsylvania Counties and Mine Sites
                                                                                                    1-3


-------

Chapter 2:  Statistical Analysis of Mine Drainage Data

If discharges from abandoned mines did not vary in flow or water quality parameters through
time, it would not be necessary to use statistics to determine the baseline pollution loads of a
remining site. In fact, the baseline determination would involve little effort, in terms of
representative sampling and chemical and data analyses.  A mine operator or regulatory agency
could simply collect one sample to initially establish the baseline flow and water quality, and
then collect a second sample at some later time before remining commences to document that the
flow and water quality parameters do not vary through time. However, abandoned mine
discharges typically vary significantly in flow and/or quality throughout the water year, and it is
necessary to use statistics to quantify and explain these variations.  Data representing this
variation and the statistical  analysis of such variation are presented in the succeeding chapters of
this report.  This chapter provides an introduction to the statistical methods that may be
employed in determining the baseline pollution load.

Variation
The fundamental problem to be addressed in determining baseline pollution load is how to
statistically  summarize the natural variations in flow and water quality parameters before
remining commences, in order to enable the separation of mining-induced changes in pollution
load from natural seasonal variations in pollution load during and following remining operations.
This problem is depicted in Figure 2.1, which  shows hypothetical variations in acid load of an
abandoned mine discharge before (pre), during, and after remining.  It is important to note that
Figure 2.1 is presented for graphical description of statistical triggers only and that the  after-
mining scenario represented in this figure is atypical. In  almost all cases, remining will improve
water quality. Whatever the case, water quality data should be plotted and statistically analyzed
to determine whether adverse effects have occurred.

Figure 2.1:   Example of Acid Load Variation Before, During, and After Remining
[Figure: hypothetical variations in water quality. Acid load (vertical axis, 0 to 1,000) is plotted against time for the pre-mining period and the during-and-after-mining period, with horizontal lines marking the upper and lower 95% confidence limits.]

In Figure 2.1, the acid load varied greatly before remining commenced, ranging from nearly zero
pounds of acidity per day (lbs/day) to nearly 1,000 lbs/day. Observe that before remining, the
acid load usually varies somewhat symmetrically above and below the central value of 500
lbs/day (the central tendency) and that the variations are generally contained between the values of
50 lbs/day and 950 lbs/day, which have been labeled the lower and upper control levels. Also
note that the acid load was higher than the upper control level on one or two occasions before
remining commenced.  However, during remining the acid load was above the upper control
level much more frequently, although it still varied somewhat symmetrically above
and below the central tendency value for at least the first two years of remining. Finally,
during the last three years of remining and following the completion of remining, the acid load
still varies above and below the central tendency value, but there appears to be a trend of
increasing acidity between the central tendency value and the upper control level.

In order to determine baseline pollution load, it is necessary to statistically analyze the data to
find a measure of central tendency (e.g., mean or median) and a measure of the patterns of
variation or the dispersion of the individual observations (i.e., samples) around the central
tendency, as shown in Figure 2.1. In order to separate mining-induced changes in pollution load
from natural seasonal variations, it is necessary to develop a statistical mechanism to determine
when variations in the pollution load are out of control; that is, when significant deviations from
the pre-remining baseline have occurred which can be attributed to factors other than natural
seasonal variations (e.g., problems within remining operations, unrepresentative baseline,
inappropriate monitoring).

There are two types of variation in pollution load which are of interest in evaluating monitoring
data during and after remining in order to determine whether the variations are out of control
from the established baseline conditions.

•  Dramatic Trigger - The first and most obvious pattern of variation occurs when there is a
   series of extreme events which consistently exceed the upper control level, as shown in
   Figure 2.1 during the first two years of remining.  During this time, the variation pattern
   indicates a sudden and dramatic increase in pollution load which may be attributed to
   remining, and which is referred to as the dramatic trigger.

•  Subtle Trigger - The second pattern of variation of concern is a trend of gradually increasing
   pollution load (as shown on the right side of Figure 2.1), where the general pattern of acid
   load observations is increasing above the baseline central tendency value for several years
   without ever exceeding the upper control level.  In this case, when the central tendency
   values are calculated for each water year during remining, a corresponding gradual increase
   in central tendency values will be detected until a significant difference exists between the
   baseline central tendency and a central tendency calculated for a water year after remining
   has commenced.  As this second pattern of variation is much less dramatic than the first, and
   takes much more time and effort to detect, it is referred to as the subtle trigger.
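
As a minimal sketch of how these two patterns might be screened for, the following Python fragment counts excursions above an upper control level in each water year and computes each year's median load. The data, the 950 lbs/day upper level, the 500 lbs/day baseline value, and the function names are hypothetical; the sketch does not decide how many excursions, or how large a shift in central tendency, should set off a trigger.

```python
from statistics import median

def exceedance_counts(loads_by_year, upper_level):
    """Count observations above the upper control level in each water year."""
    return {year: sum(1 for load in loads if load > upper_level)
            for year, loads in loads_by_year.items()}

def yearly_medians(loads_by_year):
    """Median load for each water year, as a simple measure of central tendency."""
    return {year: median(loads) for year, loads in loads_by_year.items()}

# Hypothetical baseline summary (lbs/day of acidity); assumed values for illustration.
baseline_median = 500.0
upper_level = 950.0

# Hypothetical monitoring data (lbs/day) for two water years during remining.
monitoring = {
    "year 1": [480, 990, 310, 510, 975, 120, 1020, 505],  # repeated extreme events
    "year 4": [520, 560, 610, 640, 700, 720],              # gradual upward drift
}

counts = exceedance_counts(monitoring, upper_level)
medians = yearly_medians(monitoring)

for year in monitoring:
    print(year, "excursions above upper level:", counts[year],
          "| median load:", medians[year])
```

In this hypothetical data set, the first year shows several excursions above the upper level while its median stays near the baseline value (the dramatic pattern), and the later year shows no excursions but a median well above the baseline (the subtle pattern).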

The reason that these two patterns of variation are referred to as triggers is that they can be used
to set off or initiate the requirement for a mine operator to treat a pre-existing discharge to a
numerical effluent limitation. In issuing a remining permit, the regulatory authority makes a
determination that the site can be mined without causing additional pollution, and that the
pollution abatement plan in the permit application demonstrates that the existing baseline
pollution load will be reduced. The mine operator and the regulatory authority anticipate
environmental improvement through remining without the need to treat the pre-existing
discharge.  However, the possibility exists that degradation of the discharge may occur,
temporarily or permanently, as the result of remining if the pollution abatement plan is not
implemented as required or if unforeseen circumstances develop.

If fair and reasonable consideration is to be given to both the concerns of the mine operator and the
protection of the environment, the treatment triggers must be carefully established so that they are
neither (a) set off prematurely or erroneously, adversely affecting the mine operator, nor (b) set off too late,
resulting in additional mine drainage pollution without treatment.  Even the most thorough
representative sampling program of a given water year may not capture the most extreme events,
because the worst storm (flood) and the most severe drought are rare events and do not occur in
every water year. Although it is unreasonable to require a mine operator to collect baseline
water samples until the 100-year storm event or a significant drought is captured, it is also
unreasonable to require the mine operator to commence treatment the first time that the extreme
event or upper control level of the baseline is exceeded.  The reagent costs alone for treating
some pre-existing pollutional discharges can be several hundred dollars per day, and the total cost
of building a treatment plant can be more than one million dollars. Treatment costs for some
worst-case post-mining discharges in Pennsylvania were as high as $700/day
(hydrated lime), and construction costs for the associated treatment facilities were greater than $2.1 million.

Conversely, the regulatory authority is not fulfilling its environmental protection mandate if the
upper control level and extreme events  of baseline are routinely being exceeded and the
additional mine drainage pollution effects are obvious, but treatment has not yet been required
because statistical analysis of the water year has not been completed. In light of these concerns,
problems that need to be resolved statistically with respect to the dramatic and subtle triggers
are:

•   Dramatic trigger -  how high should the upper control level  or tolerance level be, and how
    many excursions above this upper level are tolerable before it is determined that the system
    is out of control and treatment of the discharge must be initiated.

•   Subtle trigger - how much deviation from the baseline central tendency value is tolerable in
    succeeding water years before it can be determined that a significant difference exists.

Both of these problems may be addressed statistically with a relatively simple quality control
approach to the data.

Normal Distribution
The quality control approach used in this report, like much statistical work in general, is
dependent upon the frequency distribution of the sample data. It is important to collect
representative samples, because it is usually impossible or impractical to measure and analyze
the entire population of the parameter being studied. Whether the samples represent variation in
a single point through time (e.g., seasonal variations in the acidity of an abandoned mine
discharge) or spatial variations in a parameter of interest (e.g., variations in the mean acidity of
surface mine discharges from the lower Kittanning coal seam of 200 sites in western
Pennsylvania), one of the first steps of statistical analysis, typically, is to plot the frequency
distribution of the data.  According to Sir Ronald A. Fisher (1970), the founder of many
important statistical advances since the 1920's:

     "The idea of an infinite population distributed in a frequency distribution in respect of
     one or more characters is fundamental to all statistical work.  From a limited experience,
     for example, of individuals of a species, or of the weather of a locality,  we may obtain
     some idea of the infinite hypothetical population from which  a sample is drawn, and so of
     the probable nature of future samples to which our conclusions are to be applied. If a
     second sample belies this expectation we infer that it is, in the language of statistics, drawn
     from a different population; that the treatment to which the second sample of organisms
     had been exposed did in fact make a material difference, or that the climate (or the methods
     of measuring it) had  materially altered. Critical  tests of this kind may be called tests of
     significance, and when such tests are available we may discover whether a second sample
     is or is not significantly different from the first." (p. 41)

Fisher (1970) also states:

     "Statistics may be regarded as (i)  the study of populations, (ii)  the study of variation,
     (iii) the study of methods of the reduction of data (p. 1)... [and] ... A statistic is a value
     calculated from an observed sample with a view to characterizing the population from
     which it is drawn."  (p. 41)

The frequency distribution is a graphical summary of the sample data, and its shape and
accompanying summary statistics enable a greater understanding of how a variate behaves. This
understanding is gained through comparison of the frequency distribution of observed  data to the
shape and characteristics of a known mathematical or theoretical distribution, such as the normal
distribution or binomial distribution.  The normal distribution shown in Figure 2.2 is the most
widely known and most useful frequency distribution. It is also known as the Gaussian error
curve or bell-shaped curve.

The key statistical parameters of a normal frequency distribution are the mean, as the measure of
central tendency (shown as X̄ in Figure 2.2), and the standard deviation or the variance, as
the measure of variation or dispersion (the standard deviation is shown as σ in Figure 2.2).
The mean is the arithmetic average of the data, which is computed by dividing the sum of all of
the observations by the total number of observations.  The variance is the sum of the squares of
the deviations of all of the observations from the mean, divided by the number of observations
(or by one less than that number when the variance is estimated from a sample).  The standard
deviation is the positive square root of the variance.
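
The definitions above can be illustrated with a short Python sketch. The acid load values and function names are hypothetical, and the divisor of N - 1 is an assumption (the usual choice when the variance is estimated from a sample rather than computed for an entire population).

```python
from math import sqrt

def mean(values):
    """Arithmetic average: sum of the observations divided by their number."""
    return sum(values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by N - 1 (assumed divisor)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)

def sample_std_dev(values):
    """Positive square root of the variance."""
    return sqrt(sample_variance(values))

# Hypothetical acid loads in lbs/day.
acid_loads = [420.0, 515.0, 610.0, 480.0, 475.0]

print("mean:", mean(acid_loads))
print("variance:", sample_variance(acid_loads))
print("standard deviation:", round(sample_std_dev(acid_loads), 2))
```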

According to Fisher (1970) and Griffiths (1967), the sample mean (X̄) and standard deviation
(σ̂) determined from a random sample are best estimators of the corresponding population
mean (μ) and standard deviation (σ).

[Figure 2.2: Normal frequency distribution, showing the areas under the normal curve and the parameters of the normal population (from Griffiths, 1967, p. 259).]

In addition, probability statements, which are used in significance testing, quality control
techniques and other statistical methods, are frequently based upon some special properties of
the normal distribution (see Griffiths, 1967, pp. 263 - 267). The area under the curve of the
normal distribution in the interval between the mean minus one standard deviation and the mean
plus one standard deviation is approximately 68.27% (see Figure 2.2, from Griffiths, 1967, p. 259),
while 95.46% of the area of the normal distribution is contained in the interval of the mean plus
and minus two standard deviations (i.e., X̄ ± 2σ). Therefore, from the table of areas of the
normal distribution it may be stated that 95% of the area of the distribution will be contained in
the interval of X̄ ± 1.96σ (Griffiths, 1967, p. 265).
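
The areas under the normal curve can also be evaluated directly rather than read from a table. The sketch below uses the error function from the Python standard library; the function name area_within is an illustrative choice, not part of any cited method.

```python
from math import erf, sqrt

def area_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return erf(k / sqrt(2.0))

# Multiples of the standard deviation discussed in the text.
for k in (1.0, 1.96, 2.0, 3.0):
    print(f"mean +/- {k} standard deviations: {100 * area_within(k):.2f}% of the area")
```

Because the calculation depends only on the number of standard deviations, the same fractions apply to any normally distributed variate regardless of its mean or units.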

Quality Control - Normal Distribution
The type of statistical analysis known as quality control was largely developed by Shewhart
(1931, 1939) and others to evaluate tolerable amounts of variation in manufacturing processes.
Since then, the quality control approach has been applied to many other fields of study. Many of
the variates studied in very large samples, such as the number of defective light bulbs produced
by a manufacturing process, were empirically shown to closely approximate a normal
distribution.  Consequently, the most typical applications of quality control statistics involve a
normal distribution.

The frequency distribution of the data is essentially arranged  along the vertical axis of the quality
control graph as shown in Figure 2.3. The actual histogram of value classes is typically omitted
from the graph.  The mean of the data set, or grand mean of the means of sets of observations, is
usually plotted as the measure of central tendency.  Quality control levels, known as confidence
intervals, are established at plus and minus two or three standard deviations from the mean.
Individual observations through time, or comparisons of sets  of data representing variations in
operator performance, are then plotted along the horizontal axis in order to evaluate the patterns
of variation in these observations with respect to the confidence intervals around the mean.  As
95.46% of the area of the normal frequency distribution is contained in the interval of the mean
+/- two standard deviations,  it is expected that approximately 95 out of 100 observations will
occur within  the confidence  intervals.
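
A minimal sketch of this construction, assuming approximately normal data, is given below. The baseline and monitoring values are hypothetical, the limits are set at the mean plus and minus two sample standard deviations, and the function names are illustrative only.

```python
from statistics import mean, stdev

def control_limits(baseline, k=2.0):
    """Return (lower, center, upper) limits at the mean -/+ k sample standard deviations."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - k * spread, center, center + k * spread

def outside_limits(observations, lower, upper):
    """Observations falling below the lower limit or above the upper limit."""
    return [x for x in observations if x < lower or x > upper]

# Hypothetical baseline acid loads (lbs/day).
baseline = [420.0, 515.0, 610.0, 480.0, 475.0, 530.0, 390.0, 580.0]
lower, center, upper = control_limits(baseline)

# Hypothetical later observations plotted against the baseline limits.
monitoring = [505.0, 640.0, 700.0, 330.0, 495.0]
flagged = outside_limits(monitoring, lower, upper)

print(f"limits: {lower:.1f} to {upper:.1f} around a mean of {center:.1f}")
print("observations outside the limits:", flagged)
```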

Figure 2.3:    Example of a Quality Control Graph (Griffiths, 1967, p. 318)

According to Griffiths (1967):

     If the observations are in control, they will fluctuate randomly around the mean value, and
     some 5 in 100 will fall outside the 2 

In the normal frequency distribution, the values are symmetrically distributed around the mean,
and the mean and standard deviation are best statistical estimators of the population.  In a highly
skewed frequency distribution, the mean may not be the best estimator of central tendency, and
the standard deviation may not be the best measure of dispersion.

For example, in a positively skewed distribution, the few extreme values bias the mean toward the
high values, and 95% of the area of the curve is not contained within ± 2 standard deviations from the mean. In cases where the
frequency distribution is not normal, the concept of the quality control approach may still be
pursued, but data analysis adjustments must be made to either:  (a) transform the observed
frequency distribution to approximate normality, or (b) employ different statistics (e.g., use of
the median instead of the mean) in the quality control technique.

The logarithmic transformation of the data is usually the most effective transformation to reduce
positive skewness in the frequency distribution. The lognormal distribution has been
extensively described by Aitchison and Brown (1973), and examples of lognormal behavior of
variates are found in Griffiths (1967), Krumbein and Graybill (1965) and other sources.
However, a logarithmic transformation of the raw data will not solve all problems of asymmetry
or other conditions of non-normality of the frequency distribution. Additional information on
transformations of data is described  later in this chapter and in Box and Cox (1964), Griffiths
(1967, p. 306), and Krumbein and Graybill (1965, p. 216).
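
The effect of a logarithmic transformation on positive skewness can be sketched as follows. The flow values are hypothetical and deliberately idealized (each is double the previous one), and the moment coefficient of skewness is used here only as one convenient measure of asymmetry; real discharge data will rarely become this symmetric.

```python
from math import log10

def skewness(values):
    """Moment coefficient of skewness (zero for a symmetric distribution)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical, strongly right-skewed flow measurements.
flows = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0]
log_flows = [log10(x) for x in flows]

print("skewness of raw values:", round(skewness(flows), 2))
print("skewness of log-transformed values:", round(skewness(log_flows), 2))
```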

In order to evaluate different statistics that may be applicable to the quality control approach, it
is necessary to explain and differentiate nonparametric statistics, distribution free statistics, and
order statistics. It is also necessary to compare exploratory data analysis with confirmatory data
analysis.

•  Conventional Parametric Statistical Analysis - statistical estimators, such as the mean and
   standard deviation,  are used to approximate the corresponding parameters of the population,
   the true mean and variance.
•  Nonparametric Statistical Analysis  - tests of significance are performed without depending
   on the constraints of a known frequency distribution and the parameters of that known
   frequency distribution (e.g., the mean and variance of the normal distribution).
   Nonparametric statistical tests are also used where the scale level of the data is only
   nominal or ordinal, rather than the interval or ratio scales used in more rigorous statistical
   analyses. However, if the data conform to a known frequency distribution, there are
   parameters for that distribution.
•  Distribution-free Statistics - a term used to describe statistical analyses where parameters are
   estimated independently of the shape of the frequency distribution, such as the use of the
   chi-square statistic to test the class by class departure from the expected value.
•  Order Statistics  - a term applied to statistical analyses where the shape of the frequency
   distribution is important, but is evaluated less rigorously than in conventional parametric
   statistical analyses. In order statistics, the median is typically used as the measure of central
   tendency instead of the mean, and quartiles or related values are typically used to measure
   dispersion, the spread of values about the median, or the shape of the distribution. The
   position of the median in an ordered set of observations is the middlemost position. For
   example, when 15 values are ordered from low to high, the depth (position) of the median is
   at the (N+1)/2 position = 8th position. The position of the two quartiles (Q1, Q3) in this
   ordered set is halfway between the median and the extremes (e.g., the depth of the lower
   quartile (Q1) is (8+1)/2 = 4.5, midway between the 4th and 5th observations). Each quartile is
   found by counting in from its own extreme to the 4th and 5th observations and taking the
   value midway between them.  The quartiles essentially
   divide the frequency distribution into fourths, so that half of the values in the distribution are
   contained in the interval between the lower quartile and the upper quartile as shown in Figure
   2.5 (i.e., within box). Other values of spread or dispersion are similarly determined based
   upon their rank or order in the frequency distribution.
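
The depth calculations for the median and quartiles can be sketched in Python as follows. The 15 values are hypothetical, the function names are illustrative, and a depth ending in one half is interpreted here as the average of the two neighboring observations, which is an assumption about how the midway position is resolved.

```python
import math

def median_depth(n):
    """Depth (1-based position) of the median in n ordered values: (N + 1) / 2."""
    return (n + 1) / 2

def quartile_depth(n):
    """Depth of a quartile: halfway between the median position and the extreme."""
    return (math.floor(median_depth(n)) + 1) / 2

def value_at_depth(ordered, depth):
    """Value at a 1-based depth; a depth ending in .5 averages the two neighbors."""
    lower = math.floor(depth)
    upper = math.ceil(depth)
    return (ordered[lower - 1] + ordered[upper - 1]) / 2

# Hypothetical set of 15 observations, ordered from low to high.
values = sorted([12, 47, 5, 33, 21, 60, 18, 9, 54, 27, 39, 15, 45, 30, 24])
n = len(values)

med = value_at_depth(values, median_depth(n))
q1 = value_at_depth(values, quartile_depth(n))        # counted in from the low extreme
q3 = value_at_depth(values[::-1], quartile_depth(n))  # counted in from the high extreme

print("median depth:", median_depth(n), "quartile depth:", quartile_depth(n))
print("Q1:", q1, "median:", med, "Q3:", q3)
```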

Figure 2.5:   Net Alkalinity Boxplot for Fisher Mine Site Discharge (from U.S. EPA Coal
              Remining Statistical Support Document, March 2000, EPA-821-B-00-001)
[Figure: boxplots of net alkalinity (vertical axis, -200 to 100) for the Premining, During, and Postmining periods.]

As a final note on the relationships of the various frequency distributions discussed herein and
elsewhere (see Fisher (1970), Fisher (1973), Griffiths (1967), Krumbein and Graybill (1965) and
Tukey (1977)), regardless of the shape of the frequency distributions in samples of water quality
parameters or almost any other variable of interest, the distribution of the means of sample sets
or means of repeated sampling efforts tends to be normal.  Generally, frequency
distributions tend toward normality as the number of observations in the sample set becomes
very large (i.e., greater than 1 million observations).  However, most samples of mine drainage
data used in remining permitting and monitoring will contain a relatively small number of
observations (i.e., fewer than 30).
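
The tendency of sample means toward a more symmetric distribution can be illustrated with a small simulation. The sketch below draws values from a lognormal distribution, chosen here only as a convenient positively skewed example, and compares the skewness of the individual values with the skewness of the means of repeated samples of 30. The seed, sample sizes, and distribution parameters are arbitrary assumptions.

```python
import random

def skewness(values):
    """Moment coefficient of skewness (zero for a symmetric distribution)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Individual observations from a positively skewed (lognormal) population.
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Means of 2,000 repeated samples of 30 observations each.
sample_means = []
for _ in range(2000):
    sample = [rng.lognormvariate(0.0, 1.0) for _ in range(30)]
    sample_means.append(sum(sample) / len(sample))

print("skewness of individual values:", round(skewness(raw), 2))
print("skewness of sample means:", round(skewness(sample_means), 2))
```

With samples of only 30 observations, the means are noticeably less skewed than the individual values but are not expected to be perfectly symmetric.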

Exploratory and Confirmatory Data Analyses
Most of the statistical analyses discussed thus far, especially significance tests, can be included
in the realm of confirmatory data analysis rather than exploratory data analysis.

According to Tukey (1977):

     The principles and procedures of what we call confirmatory data analysis are both
     widely used and one of the great intellectual products of our century. In their
     simplest form, these principles and procedures look at a sample — and at what that
     sample has told us about the population from which it came — and assess the
     precision with which our inference from sample to population is made.  We can no
     longer get along without confirmatory data analysis. But we need not start with
     it....(p. vi)

     Once upon a time, statisticians only explored. Then they learned to confirm exactly -
     to confirm a few things exactly,  each under very specific circumstances. As they
     emphasized exact confirmation,  their techniques inevitably became less flexible.  The
     connection of the most used techniques with past insights was weakened. Anything
     to which a confirmatory procedure was not explicitly attached was described as
     "mere descriptive statistics," no  matter how much we had learned from it (p. vii).

     Exploratory data analysis is detective work... Confirmatory data analysis is
     judicial or quasi-judicial in character.... Unless the detective finds the clues, judge or
     jury has nothing to consider. Unless exploratory data analysis uncovers
     indications, usually quantitative ones, there is likely to be nothing for
     confirmatory data analysis to consider. (p. 1).

From the preceding discussion of statistical analyses, it is apparent that there are many
statistical methods and approaches to  analyzing data. In order to establish the statistical methods
to be used in analyzing abandoned mine discharge data for remining permitting and monitoring,
it is necessary to consider the relationship between the characteristics of the sample data and the
types of questions to be addressed in determining the baseline pollution load of the discharges.
Sometimes, the characteristics of the available data do not lend themselves well to the type of
statistical analysis which would be most appropriate to solve the problem.  The type of statistical
analysis which is:  (1) appropriate to  apply to a specific data set, and (2) desired or necessary to
answer specific questions about the data depends upon numerous factors.  These factors include:

•  the sampling method,
•  the number of observations included in the sample,
•  the interval between observations  in time (or space),
•  the number of measurements performed (e.g., analyzing a water sample for 12 chemical
   constituents),
•  the scale level of the data (i.e., nominal, ordinal, interval, ratio), and
•  the frequency distribution of the data.

Univariate/Bivariate and Multivariate Analysis
Statistical analyses which evaluate a single variable are referred to as univariate analyses, while
bivariate analyses evaluate the relationship between two variables. Multivariate statistical
analyses concurrently evaluate the relationships among more than two variables.  Statistical
methods involving the frequency distribution of a variable (e.g., chi-square "goodness of fit" test,
t-test of the significance of means, F-test of variance ratios) are examples of univariate
statistical analyses. Linear regression and correlation (e.g., correlation coefficient (r) and
coefficient of determination (r²)) are examples of bivariate analyses, while multiple regression,
factor analysis, principal components analysis, and cluster analysis are examples of multivariate
analyses.
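
As a brief illustration of a bivariate analysis, the sketch below computes the correlation coefficient (r) and the coefficient of determination (r²) for two variates. The paired sulfate and acidity values are hypothetical, and the function name is illustrative only.

```python
from math import sqrt

def correlation(x, y):
    """Pearson correlation coefficient (r) between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired concentrations (mg/L) from five water samples.
sulfate = [200.0, 400.0, 600.0, 800.0, 1000.0]
acidity = [55.0, 95.0, 160.0, 190.0, 250.0]

r = correlation(sulfate, acidity)
print("correlation coefficient r:", round(r, 4))
print("coefficient of determination r^2:", round(r * r, 4))
```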

It is obvious that it will be very difficult, if not impossible, to use a univariate statistical method
to solve a multivariate problem.  For example, assume  a mine drainage data set contains 100
water samples (i.e., number of observations, N = 100) which have been analyzed for 20 chemical
constituents (i.e., number of parameters, p = 20), an N  x p data matrix of 100 x 20 results, within
which some of the parameters may be highly correlated or dependent upon each other (e.g.,
acidity, sulfate, and iron may vary in a closely associated pattern).  If the problem to be solved is
"how many independent sources of information are contained in the data matrix," a multivariate
or "p-dimensional" problem exists that should be addressed with a multivariate statistical method
such as principal components analysis or factor analysis.  The evaluation of the shape of the
frequency distribution of any or all of the 20 variates, in a univariate statistical context, may be
an important part of the data analysis process, but it would not solve the multivariate problem.

As the level of sophistication and rigor of the statistical analysis increases from univariate
through bivariate and multivariate to include some very powerful statistical methods such as
time-series  analysis, the requirements placed upon the  quality of the data set increase in a
corresponding manner. As described earlier, many parametric, univariate statistical methods are
based upon the assumption that the sample data are normally distributed. Many bivariate
statistical methods, such as linear regression which uses a least-squares method to determine a
best fitting regression line, assume that the scatter of data points (when the two variates are
plotted together) occurs in a uniform pattern, known as homoscedasticity. In general terms for
correlation and regression analyses, this means that:  (a) the scatter of the data points does not
increase as the data values of the two variates increase, and (b) the data  are normally distributed
orthogonal to the regression line (i.e., within sections drawn perpendicular to the regression line
at equal intervals along the line). Many multivariate statistical methods are based upon the
assumption of joint normality of the data matrix (i.e., that all of the variates are normally
distributed). Most multivariate statistical analyses are  also greatly impeded by missing data
(e.g., where 75 of 100 water samples were analyzed for 20 parameters, and the remaining 25
samples were analyzed for 12 parameters), as adjustments are made to the data matrix in order to
enable the use of the matrix algebra necessary to mathematically solve the problem. The proper
use of time  series analysis generally requires a very large number of observations, equally
spaced in time (i.e., equal intervals between observations), with no missing data.

Time Series Analysis
As stated earlier, the fundamental statistical problem to be addressed in determining baseline
pollution load for remining permitting and monitoring purposes is how to summarize the natural
variations in flow and water quality parameters before remining commences, in order to enable
the separation of mining-induced changes in pollution load from natural seasonal variations in
pollution load during and following remining operations.  Conceptually, this is the type of
statistical problem which is ideally solved by time-series analysis or a specialized area of time-
series analysis, known as intervention analysis.  However, the data quality requirements for these
types of statistical analyses will exceed the available data for most remining cases, and to require
remining permit applicants to collect sufficient data for these analyses would be an onerous and
expensive task. The principles of time-series analysis are briefly introduced here, and are more
fully explained in later chapters.

The use of time series analysis in this report is chiefly for research purposes where adequate data
exist. The results of research with time series analyses of relatively large mine drainage
databases provide a better understanding of the behavior of abandoned mine discharges as they
vary through time, and facilitate the application of a relatively simple quality control approach to
the statistical analysis of the smaller sets of discharge data typically used in computing baseline
pollution load in remining permits.

According to Vandaele (1983):

     A time series is a collection of observations generated sequentially through time. The
     special features of a time series are that the data are ordered with respect to time, and
     that successive observations are usually expected to be dependent.  Indeed, it is this
     dependence from one time period to another which will be exploited in making
     reliable forecasts....  It also will be useful to distinguish between a time series process
     and a time series realization.  The observed time series is an actual realization of an
     underlying time series process.  By a realization we mean a sequence of observed
     data points and not just a single observation.  The objective of time series analysis is
     to describe succinctly this theoretical process in the form of an observable  model that
     has similar properties to those of the process itself. (p. 3).... A time series model
     consisting of just one variable is appropriately called a univariate time series model.
     A univariate time series model will use only current and past data on one variable....
     A time series model which makes explicit use of other variables to describe the
     behavior of the desired series is called a multiple time series model. The model
     expressing the dynamic relationship between these variables is called a transfer
     function model. The terms transfer function model and multiple time series model
     are used interchangeably. (p. 8).... Finally, a special form of transfer function model
     is the intervention  model.  The special characteristic of such a model is not the
     number of variables in the model, but that one of the explanatory variables captures
     the effect of an intervention, a policy change, or a new law. (p. 9).

The use of intervention analysis to evaluate remining discharge data might be particularly
appropriate provided that adequate data quality exists. One of the seminal works in intervention
analysis is described in Tiao, Box, and Hamming (1973) and Box and Tiao (1975), in which
photochemical smog data from Los Angeles were analyzed in order to evaluate the effect of a
new law requiring the reduction of reactive hydrocarbons upon the oxidant pollution level in the
city.  This is analogous to analyzing abandoned mine drainage pollution load data collected
before, during and after remining in order to determine the effect of remining upon the level of
the baseline pollution load in the presence of significant seasonal variations.  An example of the
use of intervention analysis of abandoned mine drainage data is the study by Duffield (1985) of
the Arnot discharges, which are also featured in Chapter 4 of this report.

According to Box and  Tiao (1975, p. 70):

      Data of potential value in the formulation of public and private policy frequently
      occur in the form of time  series. Questions of the following kind often arise: "Given
      a known intervention, is there evidence that change in the series of the kind expected
      actually  occurred, and, if so, what can be said of the nature and magnitude of the
      change?" ...  In the examples quoted, however, the data are in the form of time
      series, in which successive observations are usually serially dependent and often
      nonstationary and there may be strong seasonal effects. Thus, the ordinary
      parametric or nonparametric statistical procedures which rely on independence or
      special symmetry in the distribution function, are not available nor are the blessings
      endowed by randomization.

Intervention analysis and other methods of time series analysis are very powerful statistical tools
which would be desirable and useful in evaluating baseline pollution load data, but these types of
statistical analyses will usually be inappropriate for remining permitting due to inadequate data
availability and data quality.  Therefore, it was necessary to develop a data analysis algorithm
which recognized or allowed for the use of time-series analyses, but did not require the routine
use of these statistical methods in order to answer the desired questions about the remining
discharge data.

A flow chart outlining the data analysis algorithm for determining the baseline pollution load is
shown in Figure 3.1. The algorithm includes evaluations of data quality, univariate statistical
analyses, bivariate statistical analyses and time series analysis methods to establish quality
control limits.  The algorithm includes steps to evaluate the normality of the frequency
distribution and transform the data if the distribution is not normal (i.e., positively skewed);
however, the use of the statistical methods in the algorithm does not require the  distribution  to be
normal.  The algorithm contains elements of parametric statistical analysis, but it is primarily
based upon order statistics and  non-parametric statistics.

Chapter 3:  Mine Drainage Data Analysis Algorithm

A flow chart outlining the data analysis algorithm for determining baseline pollution load is
shown in Figure 3.1.  The algorithm includes evaluations of data quality, univariate statistical
analyses, bivariate statistical analyses and time series analyses. The algorithm also includes
steps to evaluate the normality of the frequency distribution and to logarithmically transform the
data if the distribution is not normal (i.e., positively skewed); however, the use of the statistical
methods in the algorithm does not require the distribution to be normal.

All of the statistical analyses included in the algorithm are contained in the MINITAB¹ computer
software package, which was used to assess the data presented in this report. The analyses
contained in MINITAB were incorporated into the REMINE² computer software package
developed by EPA, PA DEP, and Pennsylvania State University.  Other software packages used
included Statistical Analysis Software (SAS) and Stat Graphics. A significant feature of the
algorithm and the MINITAB program in general is that a user with limited statistical analysis
experience can perform the rudiments of the baseline pollution load analysis without
encountering  too much difficulty, while the user with greater statistical training can expand the
statistical analysis to include a much greater array of statistical methods if desired.  The
remainder of this chapter is devoted to explaining the elements of this remining data analysis
algorithm.

Data from six study sites were submitted to the standard procedures shown in Figure 3.1.  (These
data are described in detail in Chapters 4 through 8.)  There are twelve steps in the complete
analysis,  and  it should be emphasized that only the first nine are needed for routine remining
permits. Steps 11 and 12 are for research purposes only. The most important step is initial
examination of the data (Step 1, Figure 3.1). Following this examination, missing values are
identified and adjusted.  Additionally, any extreme outliers (Step  3) should be examined to see if
they are real observations or errors of entry at some stage in the data collection procedure.

The next step (Step 4) is to graph discharge (flow) versus days (ordered observations).
Frequently, it is advisable to plot log discharge in order to reduce extreme variations. This
procedure also helps to reduce extreme positive skewness (if present in the data). This reduction
of asymmetry improves the subsequent analysis of the data and makes the probability statements
more reliable. Because extreme observations may result from unusual events (such as heavy
downpours or snowmelt), the reduction of variation should be used with discretion. In many cases,
these unusual extreme data values may indicate events of considerable importance in the study of
the natural variation in the data.

       ¹MINITAB is a commercial software package from Minitab, Inc. ©1986, 3081 Enterprise Drive,
State College, PA 16801.

       ²REMINE is a computer software package developed by EPA, PA DEP and the
Pennsylvania State University, Version 1.0 (November 1988) and Version 2.0 (April 1992), page
R-2.

Step 5 is also crucial: it checks the regularity of the sampling and further identifies the larger gaps
in the data.  The plot of discharge versus days prepared in Step 4 is one way of seeing this aspect
of behavior in the data.  Another useful procedure is to take "first differences" of days (or order
of observation). This procedure leads to a frequency distribution  and a histogram in which the
intervals between observations are clearly displayed.
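
The "first differences" check in Step 5 can be sketched in a few lines of Python. This is only an illustration: the day numbers are hypothetical, the function names are invented for this sketch, and the 20-day threshold simply mirrors the one used later for the Arnot plots.

```python
from collections import Counter


def sampling_intervals(days):
    """Return first differences of the sampling days (days between visits)."""
    return [later - earlier for earlier, later in zip(days, days[1:])]


def interval_histogram(intervals):
    """Print a text histogram with one row per distinct interval length."""
    counts = Counter(intervals)
    for length in sorted(counts):
        print(f"{length:>4} days | {'*' * counts[length]}")
    return counts


# Hypothetical day numbers for a nominal 14-day sampling schedule.
days = [0, 14, 29, 42, 56, 91, 105, 118, 132]
intervals = sampling_intervals(days)
counts = interval_histogram(intervals)

# Flag the larger gaps (20 days or more is an assumed threshold).
large_gaps = [d for d in intervals if d >= 20]
```

The histogram makes the regular intervals and the occasional large gap visible at a glance, which is the purpose of this step.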

Figure 3.1  Algorithm for Analysis of Mine Drainage Discharge Data
Raw Data
    1. Initial examination.
    2. Adjust missing values to *.
    3. Examine for unusual values.
    4. Graph discharge or log discharge vs. days.
    5. Check for unequal intervals, missing data, and extremes.

MINITAB
    6. Univariate statistics.
         DESCRIBE - Summary Statistics.
         HISTOGRAM: symmetry? (Yes / No)
    7. Describe Histogram.
    8. Examine and Edit Outliers.
    9. Bivariate analysis.
         Var. (x) vs. pH
         Var. (x) vs. Disch. or Log Disch.
         Association r²
         Cross-correlation.
         Regression if required.
   10. Time series plots (TSPLOT) for each variable.
         1. Search for missing values.
         2. Periodicity?
         3. Outliers.
         4. Quality Control Graphs.
   11. Box - Jenkins Time Series Analysis
         1. Identification: Acf, Pacf, Acf, Pacf
         2. Estimation of model parameters.
         3. Residuals to check for outliers.
         4. Forecasts (where required).
   12. Sampling by Simulation
         1. Choose samples (18 for example) according to recommended procedure.
         2. Test by quality control graphs.
         3. TSPLOTs using mean, median and various multiples of standard deviation.
   13. Adopt procedure for routine analysis in the field.

Ideally, all the intervals between the observations should be equal; in practice, this is rarely
achieved.  One or two days on either side of the ideal date is adequate for fourteen-day intervals.
Many gaps of five or six days make the subsequent analysis much less exact and larger intervals
(e.g., 90 days) make the analysis more difficult to interpret correctly.  Large gaps in the data
preclude rigorous time series analysis which requires a very close approximation to  equal
intervals between observations.  In general, the more sophisticated the statistical analysis, the
more sensitive it is to data gaps.

As a general recommendation, it is helpful to insert a missing data symbol (e.g., *) where there
are data gaps (i.e., a few missing flow measurements or a few missing values for water quality
parameters) and produce the mean, median, standard deviation, etc. of the truncated data set. If
the frequency distribution of the variable is reasonably representative (e.g., symmetric), or has
been made so by log transformation, then the mean may be substituted for each missing data
symbol (*) and the frequency distribution and summary statistics (mean, median, standard
deviation) rerun on the more complete data. Of course, insertion of the mean does not gain
information; it only makes subsequent analysis more correct. If the data are asymmetric, the
median is a more representative estimate of the "central tendency" and should be used rather
than the mean.
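
A minimal sketch of this substitution rule follows. It assumes that missing values are stored as None (standing in for the * symbol) and that the analyst has already judged whether the distribution is symmetric; the function name and the flow values are hypothetical.

```python
import statistics


def fill_missing(values, symmetric=True):
    """Replace None (the '*' missing-data symbol) with the mean or the median.

    symmetric=True  -> use the mean (distribution judged roughly symmetric)
    symmetric=False -> use the median (distribution judged skewed)
    """
    observed = [v for v in values if v is not None]
    if symmetric:
        centre = statistics.mean(observed)
    else:
        centre = statistics.median(observed)
    return [centre if v is None else v for v in values]


# Hypothetical, positively skewed flow record with two gaps.
flow = [3.0, 4.0, None, 5.0, 40.0, None, 6.0]
filled_with_median = fill_missing(flow, symmetric=False)
filled_with_mean = fill_missing(flow, symmetric=True)
```

In this skewed example the single large value pulls the mean well above the median, which illustrates why the median is the safer substitute for asymmetric data.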

This entire procedure (Steps 1-5, Figure 3.1) is aimed at "massaging" the data into a form
suitable for statistical analysis.  If there are only a few observations (18 or so, for example) it is
somewhat arbitrary whether or not one wishes to smooth the data, because very little extended
analysis will be appropriate.

Univariate Analysis (Algorithm Steps 6,  7, and 8)

In Step 6,  the data are analyzed and plotted to obtain the summary statistics and to examine
graphical displays of the data to determine the presence of skewness and extreme values.  Stem
and Leaf plots can be used in place of histograms of frequency distributions as shown in Figure
2.4.

This procedure includes calculating statistics for each individual variable (univariate statistics).
An example of this procedure is displayed in Table 3.1 of the summary statistics for the analysis
of the data from the  Clarion site (discussed in Chapter 5). In this example, there are seven
parameters and eleven summary statistics that were calculated using the REMINE program.
There are N = 96 observations (column 1 of Table 3.1); N* (column 2) is the number of missing
observations (19 for the discharge variable). Columns 3 and 4 list the means and medians
respectively.  Column 5 is a special kind of mean, called by Tukey (1977, p. 46) the "trimmed
mean." Columns 6 and 7 contain the standard deviation (STDEV) and the standard error of the mean
(SEMEAN) as measures of spread. Columns 8 and 9 list the two extremes (min and
max), yielding the range of the values.  Columns 10 and 11 contain the quartiles (Q1 and Q3),
yielding a measure of spread around the central tendency (mean or median); this spread is less
sensitive to the extremes and so is often preferred in distributions which are irregular (e.g.,
strongly skewed). The coefficient of variation (Column 12), usually expressed in percent
(CV%), is defined as the ratio of the standard deviation to the mean multiplied by 100.  This is a
useful approximate guide to the degree of variation in a parameter. In general, a CV < 30%
represents a stable (in control) parameter. Most of these parameters, however, show much larger
variation, principally because of the large effects of extreme events.
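
The summary statistics described above can be reproduced approximately with the Python standard library. The sketch below is not the MINITAB or REMINE implementation: the 5% trimming fraction and the quartile convention are assumptions, the function name is invented, and the data are hypothetical.

```python
import math
import statistics


def summarize(values, trim=0.05):
    """Summary statistics for one parameter; None marks a missing value."""
    x = sorted(v for v in values if v is not None)
    n = len(x)
    k = int(n * trim)                 # observations trimmed from each end
    trimmed = x[k:n - k]
    mean = statistics.mean(x)
    stdev = statistics.stdev(x)       # sample standard deviation (n - 1)
    # One of several quartile conventions; other software may differ slightly.
    q1, _, q3 = statistics.quantiles(x, n=4)
    return {
        "N": n,
        "N*": len(values) - n,
        "MEAN": mean,
        "MEDIAN": statistics.median(x),
        "TRMEAN": statistics.mean(trimmed),
        "STDEV": stdev,
        "SEMEAN": stdev / math.sqrt(n),
        "MIN": x[0],
        "MAX": x[-1],
        "Q1": q1,
        "Q3": q3,
        "CV%": 100.0 * stdev / mean,
    }


# Hypothetical acidity record with one missing observation.
acidity = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0, None]
stats = summarize(acidity)
in_control = stats["CV%"] < 30.0      # the 30% guide quoted in the text
```

As a check against the definition, the rounded pH entries of Table 3.1 give 100 * 0.985 / 3.696, or about 26.65, in line with the tabulated coefficient of variation of 26.6.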

Table 3.1:     Summary Statistics for S3CLAR (N=96)

                                                Trimmed   Standard   Standard Error
Parameter        N    N*      Mean    Median       Mean  Deviation      of the Mean
pH              96     0     3.696     3.195      3.612      0.985            0.101
Discharge       77    19     12.58      6.30       9.00      22.66             2.58
Acidity         96     0     522.4     483.5      505.6      346.4             35.4
Total Iron      96     0     82.40     75.00      79.31      51.01             5.21
Ferrous Iron    96     0     54.84     39.50      47.44      66.99             6.84
SO4             96     0     1528.    1569.0     1525.9      566.0             57.8
Ferric Iron     96     0     27.56     23.60      31.01      70.58              7.2

                                          First      Third   Coefficient of
Parameter       Minimum    Maximum     Quartile   Quartile        Variation
pH                2.670      6.430        3.002      4.455             26.6
Discharge          0.05     172.00         3.59      12.54            188.1
Acidity             1.0     1546.0        232.5      737.7             66.3
Total Iron         8.70     257.00        39.70     110.25             61.9
Ferrous Iron       0.90     612.18        25.12      68.60            122.2
SO4               296.0     3241.0       1181.5     1878.2             37.0
Ferric Iron     -581.68     152.00         5.00      55.00            256.1

A second series of statistics referred to as letter values (e.g., H-spread) is sometimes calculated
to identify various measures of spread. These spreads can be used to set limits for water quality
(see Tables 8.6 and 8.7, (Q3 - Q1)). These letter values (LVALS) were first defined by Tukey
(1977, p. 22) and are mentioned in the MINITAB Reference Manual (p. 168). These values are
best described in Velleman and Hoaglin (1981, p. 33).

If the data are positively skewed (i.e., skewed towards the high end of the values on the variable
scale) the data should be logarithmically transformed and the univariate analysis repeated (Step
7, Figure 3.1). The log transformation tends to make the histogram more symmetrical, although
there is a tendency to over-correct in some cases and introduce negative skewness.

It is possible to use another transformation such as the square root of the variable, which may
well suffice to avoid over-correction that came from the logarithmic change.  The use of various
transformations is reviewed in Tukey (1977, Chapter 3) and specifically for symmetry, in the
MINITAB Handbook (p. 72 - 76) and the MINITAB Reference Manual (p. 50 - 52).  It is also
discussed in Velleman and Hoaglin (1981, p. 46 - 49) and Box and Cox, (1964).
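
The choice between transformations can be explored by comparing the skewness of the raw, square-root and log-transformed values. The sketch below uses the moment coefficient of skewness, which is an assumption (the report judges symmetry from histograms), and a hypothetical flow record.

```python
import math


def skewness(x):
    """Moment coefficient of skewness (0 for a perfectly symmetric sample)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


# Hypothetical, strongly positively skewed discharge values.
flow = [1.0, 2.0, 2.0, 3.0, 4.0, 6.0, 9.0, 15.0, 40.0, 170.0]

candidates = {
    "raw": flow,
    "sqrt": [math.sqrt(v) for v in flow],
    "log10": [math.log10(v) for v in flow],
}
skew = {name: skewness(vals) for name, vals in candidates.items()}

# Pick the transformation whose skewness is closest to zero.
best = min(skew, key=lambda name: abs(skew[name]))
```

For these values the log transformation gives the most symmetric result. For a less skewed record the same comparison could show the log transformation over-correcting into negative skewness, in which case the square root may be the better choice.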

In evaluating the statistics produced for the transformed data, the user should be cautious of the
coefficient of variation values. Use of the coefficient of variation with log transformed data may
result in extreme distortion because the transformation can lead to a mean close to zero. The
denominator of the ratio is then small, and the resulting CV is inflated.
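
A small numerical illustration of this distortion is given below. The flow values are hypothetical and were chosen so that their logarithms straddle zero; the helper name is invented for this sketch.

```python
import math
import statistics


def cv_percent(x):
    """Coefficient of variation in percent: 100 * standard deviation / mean."""
    return 100.0 * statistics.stdev(x) / statistics.mean(x)


# Hypothetical flows near 1 unit, so log10(flow) has a mean close to zero.
flow = [0.5, 0.8, 1.0, 1.3, 2.0]
log_flow = [math.log10(v) for v in flow]

cv_raw = cv_percent(flow)
cv_log = cv_percent(log_flow)
```

Here the CV of the raw flows is of ordinary size, while the CV of the logged flows is larger by orders of magnitude, simply because the mean of the logs is nearly zero.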

Step 8 in Figure 3.1 is used to check and accept or modify outliers. Outliers tend to inflate the
variance or spread of the data and make  the statistical tests less sensitive.  For this reason,
outliers should be reduced only after deciding that such extreme values are not "real" or when it
is specifically desired to make the statistical testing more sensitive. As mentioned earlier, some
outliers are indicators of unusual events  (e.g., floods, storms) and should not be removed or even
subdued, but instead should be used to reflect the occasional unusual events.

Bivariate Analysis

The next step in data analysis (see Step 9) concerns the relationship between pairs of variables
(bivariate analysis).  If two variables are closely associated (e.g., a correlation coefficient, r >
0.8), both may be reflecting the same source of variation and one may be considered redundant.
It is possible to use this kind of feature to select the simpler test (or less expensive analyte) and
ignore the other parameter in subsequent studies. Sometimes several variables reflect the effects
of the same events.

One expects, for example, pH to  decline with increasing acidity and sulfate.  In the case of
calcium and manganese, on the other hand, one expects sympathetic variation.  If examination of
the data shows that this expected relationship is not present, the reason for its absence should be
sought.

The correlation coefficient (r) is usually used to represent the (linear) relationship between any
pair of variables.  The coefficient of determination (r²) is, however, a better measure of the
intensity of the association between a pair of variables; for example, r = 0.7 looks large because
the range of r is from -1 to +1, but it means that r² = 0.49 or 49% of the variation is common to
the two variables and there is 51% of the variation "unexplained" by the association. It is
necessary, therefore, to realize that one needs r > 0.8 to claim that a strong association exists
(i.e., > 64% in common).

Another feature which is illuminated by using r² as well as r is the statistical test which
accompanies a specific value of r. For a sample size of N = 174 (Table 6.3), a value of r > 0.124
is significantly different from zero at the five percent probability level.  This should be
accompanied by the corresponding value of r². In Table 6.3, the correlation coefficient between
pH and acidity is r = -0.365.  This value comfortably exceeds the r = (±) 0.124, thus it is
statistically significant. Nevertheless, the corresponding r² = 0.133 means that only 13.3% of the
variation is common to both variables.
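
The distinction between statistical significance and strength of association can be checked numerically. The report does not say how the critical value of 0.124 was obtained; the Fisher z approximation used below is an assumption made for this sketch, and it gives a similar value (roughly 0.125) for a one-sided test at the five percent level.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Linear (Pearson) correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_critical_r(n, alpha=0.05, two_sided=True):
    """Approximate smallest |r| significant at level alpha (Fisher z method)."""
    p = 1 - alpha / 2 if two_sided else 1 - alpha
    z = NormalDist().inv_cdf(p)
    return math.tanh(z / math.sqrt(n - 3))


# Values quoted in the text for pH versus acidity (Table 6.3, N = 174).
r = -0.365
r_squared = r ** 2
r_crit = approx_critical_r(174, two_sided=False)
significant = abs(r) > r_crit
strong = abs(r) > 0.8          # the report's guide for a strong association
```

The result is significant but not strong: the correlation clears the critical value easily, yet only about 13% of the variation is shared.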

In the graphs presented in Figures 6.5a and 6.5c, the variation of both parameters increases as
their values increase. This phenomenon is called heteroscedasticity. In general, it is advisable to
plot the logs of the variables, which tends to make the variables homoscedastic.  Since
heteroscedastic variables show a difference in variability with changes in values of the
parameter, no probability statement should be made without transformation so that the variables
are homoscedastic.  Peculiarly, the change from heteroscedasticity to homoscedasticity does not
lead to a major change in the value of r, but does make the probability statements more reliable.

One more avenue should be explored in bivariate analysis, and that is to determine whether there
is any lag in correlations between pairs of variables.  Cross-correlation analysis is performed to
see if a weak relationship at zero lag may be much stronger at greater lags.  This could result
from a delayed effect. For example, suppose discharge increases and sometime later, pH drops.
Correlation at zero lag may be quite low, but at some higher lag it may increase showing that it
takes time for the effect of changes in discharge to affect pH or some other variable. The
cross-correlation function (CCF) is the measure used for this purpose. For example, suppose that an
event occurs and affects one variable immediately but only affects another variable five
observations later. In this case, the linear correlation coefficient at zero lag may be quite low but
may show a strong association at a lag of five observations. The cross-correlation function calculates the
linear association between observations 0 to t lags apart and so gives a picture of when the
association is strongest.  The range of t is from -(sqrt(N) + 10) to (sqrt(N) + 10), where N is the
number of observations in the series.  In most of the examples presented in this report, there did
not appear to be any lag in the effects.
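
The five-observation example above can be sketched as follows. The estimator shown is the usual sample cross-correlation; the two series are artificial (a few pulses and the same pulses delayed by five observations), and the function names are invented for this illustration.

```python
import math


def cross_correlation(x, y, lag):
    """Sample correlation between x[t] and y[t + lag] (lag may be negative)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    if lag >= 0:
        pairs = zip(x[:n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[:n + lag])
    return sum((a - mx) * (b - my) for a, b in pairs) / (sx * sy)


def ccf(x, y):
    """Cross-correlations for lags from -(sqrt(N) + 10) to +(sqrt(N) + 10)."""
    max_lag = min(int(math.sqrt(len(x))) + 10, len(x) - 1)
    return {k: cross_correlation(x, y, k) for k in range(-max_lag, max_lag + 1)}


# Artificial series: pulses in x, and the same pulses five observations later in y.
n = 40
x = [1.0 if t in (3, 11, 20, 28) else 0.0 for t in range(n)]
y = [x[t - 5] if t >= 5 else 0.0 for t in range(n)]

result = ccf(x, y)
best_lag = max(result, key=lambda k: abs(result[k]))
```

At zero lag the two series are almost uncorrelated, while the correlation at lag 5 is close to 1, which is the pattern a delayed effect would produce.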

Time Series Analysis

The remaining steps (10 through 12, Figure 3.1) were used to set up baseline behavior based on
relatively long data records. In this way, the expected behavior of various parameters is
established for comparison with the shorter data records that are commonly used in routine
remining permitting. The likelihood of unusual events is then displayed, and the frequency of a
single or a few unusual observations may be used to judge how often these events occur.  In this
way, these events can be distinguished from other departures that lead to warnings, triggers, or
exceedances in pollution load and therefore, would be less likely to result in false alarms.

One procedure which is readily available as part of the full Box-Jenkins treatment, but was not
used in these studies, was Transfer Function analysis. This analysis would be a most attractive
way to correct variation in some parameter (e.g., Fe) for variation in flow and then proceed to
analyze the residual variation  in the parameter after the effects of flow were removed. This
would  also be an alternative way of looking at the "load" variable in place of concentration.

Similarly, there is a procedure in Box-Jenkins analysis called "intervention  analysis" which may
be used to compare and contrast variation in a parameter before and after treatment is applied.
This has obvious applications to remining operations. Needless to say, use  of these procedures
requires an extensive set of observations taken at equal intervals, with few data gaps.

Variation  in many of the parameters,  from the different locations,  appears to follow a common
pattern. There is usually some type of gradient present in the data which may be increasing or
decreasing over time.  This results in a typical autocorrelation function (Acf) pattern and a large
spike at lag 1 in the partial autocorrelation function (Pacf).  This trend should be removed before
fitting a model. This is best done in nearly all the examples in this particular series of
investigations by taking first differences of the variable of interest. The subsequent model-fitting
usually leads to a moving average model. In Box-Jenkins notation this is an IMA (0,1,1) model.
It is essentially a random walk after first differences are taken.
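
The effect of first differencing on the autocorrelation function can be illustrated with a short sketch. The series below is artificial (a linear gradient plus a small deterministic disturbance) and is not data from this report; no model is fitted, only the Acf before and after differencing is compared.

```python
def acf(x, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]


def first_differences(x):
    """Return x[t + 1] - x[t] for each consecutive pair of observations."""
    return [b - a for a, b in zip(x, x[1:])]


# Artificial series: steady gradient plus a bounded, repeating disturbance.
disturbance = [((t * 37) % 11 - 5) / 5.0 for t in range(100)]
series = [0.5 * t + disturbance[t] for t in range(100)]

raw_acf = acf(series, 5)
diff_acf = acf(first_differences(series), 5)
```

The gradient dominates the raw series, so its Acf starts near 1 and decays slowly. After differencing, the gradient is gone and the lag-1 autocorrelation is negative, the kind of pattern that usually points toward a moving average term in the identification stage.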

Quality Control (QC) Limits

Step 10 of the algorithm on time series plots of the variables (Figure 3.1) includes an item (# 4)
on quality control graphs.  Items 2 and 3 of Step 12 (sampling by simulation) also refer to quality
control graphs.  The final step of the algorithm (Step 13) is a procedure for routine statistical
analysis of data contained in remining permits. From the discussion of quality control
throughout Chapter 2, it is obvious that the development of a relatively simple quality control
approach for mine drainage data analysis is a major objective of this report and a significant
component of the routine procedure in Step 13 of the algorithm.  Chapters 4  through 8 contain
further discussion, tables and plots of various examples of quality control limits.  Examples from
the six mine drainage case study sites lead to the statistical summary and review of quality
control limits in Chapter 9, wherein options for the routine use of quality control limits are
presented.

Throughout this report the conventional quality control limits based upon the mean and standard
deviation of the normal frequency distribution are compared to another set of non-parametric
quality control limits based upon the median and  other order statistics (e.g., quartiles, H-spreads,
C-spreads), which  may be more applicable to mine drainage data that frequently do not conform
to normality. The  quality control options in Chapter 9 of this report are a component of the
routine procedures for establishing baseline pollution load and monitoring in remining permits.
These procedures are related to the recommended statistical procedures set forth in Chapter 3
and Appendix A of EPA's Coal Remining Statistical Support Document. However, the user of
these routine procedures should be ever mindful that no single set of quality  control limits or
specific statistical test will be perfectly applicable to all mine drainage sets or even to all
discharge parameters within the same data set.  The user should carefully examine the data and
follow the fundamental steps of the algorithm in order to properly use the statistical tools that are
most applicable to the characteristics of the data.

In the following chapters, more than one equation was used to calculate QC  interval spreads.
These equations were chosen based on the distributions of the parameters collected in the given
data sets (i.e., number of results, amount of variability, lack of normality, etc.)

The first equation (X̄ ± 2σ) is based on the typical confidence interval for a mean under the
normal distribution. However, unlike the typical equation for a confidence interval around a
mean, the standard deviation was not divided by the square root of the number of results (N).
The exact interpretation of the usual confidence interval is that the true mean of all post-
remining results for the given site will fall into the calculated interval with 0.95 probability. For
the purpose of quality control, however, this  interval may be extremely tight, given the large
number of results collected for each dataset. For baseline permit pollution load data sets, the
number of results collected would likely be much less, and therefore would produce wider
intervals.  A different value of N (N') could be used in the equation, reflecting the number of
results likely to be collected and used to calculate the mean that will be compared to the interval.
For example, if monthly samples are collected for a year, N' would equal 12. However, if the
purpose of the interval is to evaluate individual results  rather than a mean, then N' should equal
1. This is what was done in Chapters 4, 6, 7 and 8, where the above equation is used.
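
The role of N' can be seen in a short sketch of the mean-based limits. The baseline values and the function name are hypothetical; the multiplier of 2 and the division by the square root of N' follow the description above.

```python
import math
import statistics


def mean_limits(baseline, n_prime=1, multiplier=2.0):
    """Limits of the form mean ± multiplier * s / sqrt(N').

    n_prime=1 evaluates individual results; n_prime=12 would evaluate the
    mean of twelve monthly samples.
    """
    mean = statistics.mean(baseline)
    s = statistics.stdev(baseline)
    half_width = multiplier * s / math.sqrt(n_prime)
    return mean - half_width, mean + half_width


# Hypothetical baseline loads.
baseline = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

lo_single, hi_single = mean_limits(baseline, n_prime=1)    # individual results
lo_annual, hi_annual = mean_limits(baseline, n_prime=12)   # mean of 12 samples
```

The limits for an annual mean are narrower than those for individual results by a factor of sqrt(12), so the choice of N' should match what will actually be compared against the interval.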

The two other equations that are used in quality  control tables in the following chapters are non-
parametric, in that they do not require that the collected data follow a normal distribution. They
are based on the non-parametric equivalent of the mean (the median) and the non-parametric
statistic for variability (the interquartile range). The first interval,

     Md ± 1.96 * (1.25 * H-spr.) / (1.35 * sqrt(N')),

is discussed in McGill, Tukey and Larsen (1978). This interval is used to assess whether a
median comes from the same population as the baseline pollution load data, and the spread term is
therefore divided by the square root of N', where N' is the expected number of remining results. The chosen
multiplier, 1.96, is appropriate when it is assumed that  the variability of the baseline data and the
remining data are approximately equal. However, if the variability of the baseline data and
remining data are different, a smaller multiplier (1.39) is appropriate. When it is not known
whether the two variances will differ, the midpoint of 1.39 and 1.96, (i.e., 1.7) could be used.
The above equation is used in Chapters 6, 8 and 9.

A second equation, Md ± 1.58 * (H-spr.), was used in Chapter 4. In this second equation, the
value of 1.58 was chosen by using the midpoint  multiplier (1.7) and simplifying the equation by
multiplying by 1.25 and dividing by 1.35. The purpose of this equation differs from the previous
one, in that it is designed to evaluate individual results, rather than the remining median.
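
Both non-parametric intervals can be sketched as follows. The quartile difference from the standard library is used here as a stand-in for the H-spread, which is an assumption (Tukey's hinges can differ slightly from quartiles); the baseline values and function names are hypothetical.

```python
import math
import statistics


def h_spread(x):
    """Interquartile range, used here as an approximation to the H-spread."""
    q1, _, q3 = statistics.quantiles(x, n=4)
    return q3 - q1


def median_limits(baseline, n_prime, multiplier=1.96):
    """Md ± multiplier * (1.25 * H-spr.) / (1.35 * sqrt(N')), for a median."""
    md = statistics.median(baseline)
    half = multiplier * 1.25 * h_spread(baseline) / (1.35 * math.sqrt(n_prime))
    return md - half, md + half


def individual_limits(baseline):
    """Md ± 1.58 * (H-spr.), for evaluating individual results."""
    md = statistics.median(baseline)
    half = 1.58 * h_spread(baseline)
    return md - half, md + half


# Hypothetical baseline loads.
baseline = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]

med_lo, med_hi = median_limits(baseline, n_prime=12)
ind_lo, ind_hi = individual_limits(baseline)

# The 1.58 multiplier is close to the midpoint multiplier times 1.25 / 1.35.
simplified = 1.7 * 1.25 / 1.35
```

The interval for a median of twelve results is much narrower than the interval for individual results, as expected from the division by the square root of N'.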

Chapter 4:  Analysis of Data from the Arnot 001, 003 and 004 Discharges
The Arnot mine site is located in Tioga County, Pennsylvania in the northeastern portion of the
bituminous coal region.  The Arnot discharges are from an abandoned underground mine on the
Bloss (B) coal seam, which is the subject of a hydrogeologic study by Duffield (1985).  The
relationships between flow and water quality parameters of the Arnot mine site are also
described in Smith (1988) and Hornberger et al. (1990).  A map of the Arnot site is shown in
Figure 4.1.  The data set for the Arnot site contains 82 samples from each of the 3 mine drainage
discharges for the time period from January 28, 1980 to August 14, 1983.

It is advisable to examine the distribution of the missing values because they will lead to
difficulties as the analytical (statistical) tools get more sophisticated. In particular, time series
analysis demands observations at regular time intervals.  On the other hand, it is impractical to
expect that there will be no missing values, because during a storm  event, the sampling location
may become inaccessible for various time intervals. It is best, therefore, to recommend time
interval limits for the period in which a sample may be taken, and which, for statistical analysis,
will be considered to be within the time interval (e.g., any time within a two week period will be
assigned as an observation taken 2 weeks apart at the mid-point of the time interval). In any case
it is advisable to examine the data carefully before attempting a quantitative analysis. Therefore,
it is recommended that a graph of discharge (and/or other variables) against time be prepared and
examined carefully to determine the distribution of missing values,  position of the extremes, etc.

A typical example is illustrated in Figure 4.2 which is a plot of log (base 10) of flow versus time
for all three point sources from the same mine.  The flow for Arnot 001 is usually the largest
followed by Arnot 004, then Arnot 003. All three show the same general pattern of variation.

The samples were supposed to be taken at 14 day intervals but, in practice, the intervals vary
from 1 day up to 40 days.  All intervals equal to or exceeding 20 days are accented in Figure 4.2.
These longer intervals include, of course, many missing values. When a time series model is
fitted to these data, they are "forced" into equal interval status. The effect of these departures
from equal intervals is to suppress any seasonal periodicity that may be present.

It may be observed that in 1980 the runoff occurred in March, April and May; in 1981 in March
and April; in 1982 in March and June; and in 1983  in April and May.  These variations tend to
suppress any seasonal effect in the occurrence of extreme values.

Figure 4.1:    Map of Arnot Mine Site
[Map: enlarged portion of the Blossburg and Cherry Flats USGS 7.5 Minute Quadrangles; scale bar marked at 1000, 2000 and 3000 feet.]

Figure 4.2:   Log Flow vs. Time (Arnot 001, 003, and 004)
[Plot: log flow against sample, January 1980 to April 1983, for Arnot 001, Arnot 003 and Arnot 004. Vertical lines indicate more than 20 days between sampling events.]

Procedures to Adjust the Data Set for Missing Data

In some cases, it will be advantageous to insert some suitable value in place of the missing
observation and the procedures for selection of a suitable value can differ.  One such approach is
to insert the mean value for the series or, if the frequency distribution is somewhat skewed
(asymmetric), the median may be more representative.

There are also smoothing procedures, ranging from simple ones, such as the average of the pair of
values on either side of the missing observation, to running averages computed over any of several
larger sets of neighboring values. These smoothing procedures are described in Velleman and
Hoaglin (1981, Chapter 6) and Cleveland (1979).
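
The simple substitutions described above can be sketched in a few lines of Python. This is only an
illustration: the function name `fill_missing`, the use of `None` to mark a missing observation,
and the sample values are assumptions made for the example and are not taken from the Arnot data.

```python
from statistics import mean, median


def fill_missing(series, method="mean"):
    """Return a copy of series with None entries replaced.

    method "mean" or "median" uses all observed values in the series;
    method "neighbors" averages the nearest observed value on each side.
    """
    observed = [x for x in series if x is not None]
    filled = list(series)
    for i, x in enumerate(series):
        if x is not None:
            continue
        if method == "mean":
            filled[i] = mean(observed)
        elif method == "median":
            filled[i] = median(observed)
        elif method == "neighbors":
            left = next((v for v in reversed(series[:i]) if v is not None), None)
            right = next((v for v in series[i + 1:] if v is not None), None)
            filled[i] = mean([v for v in (left, right) if v is not None])
        else:
            raise ValueError(f"unknown method: {method}")
    return filled


# Hypothetical flow values with one gap and one extreme observation.
flow = [0.2, 0.3, None, 0.5, 5.0]
for method in ("mean", "median", "neighbors"):
    print(method, fill_missing(flow, method)[2])
```

With a skewed series such as this one, the mean is pulled toward the extreme value, which is why
the median or a neighbor average may be the more representative substitute.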

A typical, but rather elaborate, example specifically designed for time series analysis is
described by Damsleth (1986).  This example begins with "simple linear interpolation between
observations preceding and following the gap," then identifies and estimates a univariate time
series model for the "adjusted series" which, in turn, yields "optimal estimators using the
model."  The new series is used to build a transfer function model between two series (such as
acidity and flow) and to calculate new optimal values, which are in turn used to estimate new
model parameters (Damsleth, p. 46-47). The conclusions reached by Damsleth (p. 47) are:  "The
various steps in the process gave only small changes in the estimates for missing values, and the
model and parameter estimates were almost unaffected....".

It should be clear that missing observations can be a very difficult problem.  Another aspect of
this "data massaging" procedure arises when attempts are made to reduce the magnitude of the
residual error when fitting a time series model.  In a series of flow observations, for example,
there may be some extremely large values that arise from unusual events (e.g., heavy rainfall,
perhaps persisting for several days, or sudden water run-off from snow melt). These "natural"
events of limited duration can increase the residual error quite seriously and usually do not
represent persistent increased contamination.  In the series of mine drainage data examined in
this chapter, these unusually large values are often associated with missing data. This means that
if one inserts a very small value (near zero) for the missing value, the entire range in parameter
values occurs within a short period.  It is advisable to reduce this wide range, first by not using
low values for zero or missing values but by using one of the procedures described above.
Secondly, the extreme high values should be smoothed out, because they produce a large variance
and wide confidence limits, which tend to be insensitive to large departures in the data. The
effects of these adjustments may be estimated by running the series, after removing the zero
values, both with the original extreme values and with the extremes adjusted by some form of
smoothing.

In comparing the results of the more sophisticated smoothing technique described by Damsleth
with other "quick and dirty" techniques, it was found that the results were not very different.
Therefore, it was concluded that elaborate smoothing procedures are unnecessary for mine
drainage data sets.

Univariate Analysis

The analysis commences with the summary statistics displayed in Tables 4.1a to 4.1c. The
number of samples (N) for each variable is listed first, followed by the number of missing values
(N*).  The statistical summary then follows with values for the arithmetic mean, median,
trimmed (10%) mean, standard deviation, standard error of the mean, minimum, maximum, and
the first and third quartiles.  A convenient procedure for comparing variabilities among different
variables, and among the same variables from different sources, is by means of the Coefficient of
Variation (CV), where CV% = (standard deviation / mean) * 100, expressed in percent.  The
values are displayed in Table 4.2 for convenient comparisons. For all three Arnot sources, pH
has the smallest variability (around 4%), whereas discharge has the largest variability (Arnot 001:
CV=112%, Arnot 003: CV=70.0%, Arnot 004: CV=78.1%).
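
As a check on the definition, the CV can be recomputed from the means and standard deviations
listed for discharge in Tables 4.1a to 4.1c. The sketch below is illustrative only; the function
name `coefficient_of_variation` is an assumption. Because the inputs are the rounded values
printed in the tables, the results agree with Table 4.2 only to within about half a percentage
point.

```python
def coefficient_of_variation(std_dev, mean):
    """CV% = (standard deviation / mean) * 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100.0


# (mean, standard deviation) of discharge as printed in Tables 4.1a to 4.1c.
discharge = {
    "Arnot 001": (0.7961, 0.8955),
    "Arnot 003": (0.2157, 0.1509),
    "Arnot 004": (0.5307, 0.4143),
}

for site, (m, s) in discharge.items():
    print(f"{site}: CV = {coefficient_of_variation(s, m):.1f}%")
```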
Table 4.1a:    Summary Statistics for Arnot 001 Data

                                                 Trimmed    Standard    Standard Error
Variable         N    N*      Mean     Median       Mean   Deviation       of the Mean
pH              81     0    4.8505     4.8400     4.8479      0.2221            0.0247
Temperature     67    14     9.448      9.100      9.403       1.424             0.174
Discharge       81     0    0.7961     0.5000     0.6747      0.8955            0.0995
Acidity         81     0     20.04      16.00      19.42       11.26              1.25
Alkalinity      81     0     6.457      5.000      5.918       5.480             0.609
Total Iron      81     0   0.21111    0.20000    0.21096     0.07583           0.00843
Ferrous Iron    81     0   0.11728    0.10000    0.11507     0.07872           0.00875
SO4             81     0    173.23     177.00     173.22       44.05              4.89
Ca              75     6    109.52     111.00     109.73       22.76              2.63
Mg              75     6     86.03      82.00      85.76       24.87              2.87
Mn              75     6    1.7104     1.6200     1.6776      0.6666            0.0770
Al              72     9     1.425      1.045      1.384       0.982             0.116

                                          First      Third    Coefficient of
Variable        Minimum    Maximum     Quartile   Quartile         Variation
pH               4.2000     5.4500       4.6800     5.0200               4.5
Temperature       7.000     12.900        8.400     10.000              15.1
Discharge        0.0100     5.0910       0.2300     0.8615             112.0
Acidity            3.00      64.00        11.00      28.00              56.2
Alkalinity        0.000     37.000        3.000      8.000              84.8
Total Iron      0.00000    0.40000      0.20000    0.25000              35.9
Ferrous Iron    0.00000    0.30000      0.10000    0.20000              67.1
SO4               66.00     277.00       140.50     201.50              25.4
Ca                66.00    152.000        93.00     127.00              20.8
Mg                31.00    145.000        69.00     104.00              28.9
Mn               0.5400     3.9500       1.2800     2.0300              39.0
Al                0.100      3.640        0.602      2.277              68.9
Table 4.1b:    Summary Statistics for Arnot 003 Data

                                                 Trimmed    Standard    Standard Error
Variable         N    N*      Mean     Median       Mean   Deviation       of the Mean
pH              82     0    3.2782      3.265     3.2727      0.1095            0.0121
Temperature     67    15     8.551      8.600      8.548       0.916             0.112
Discharge       82     0    0.2157     0.1610     0.2671      0.1509            0.0167
Acidity         82     0     86.37      84.50       85.7       22.55              2.49
Total Iron      82     0    1.0963     1.1000     1.0919      0.2843            0.0314
Ferrous Iron    82     0    0.3610     0.3000     0.3405      0.2340            0.0258
SO4             82     0    168.99     165.00     168.66       43.79              4.84
Ca              75     7     59.75      61.00      59.52       11.69              1.35
Mg              75     7     73.60      70.00      72.49       23.00              2.66
Mn              77     5     3.203      2.760      3.110       1.338             0.152
Al              73     9     5.079      4.680      5.060       2.213             0.259

                                          First      Third    Coefficient of
Variable        Minimum    Maximum     Quartile   Quartile         Variation
pH               3.0400     3.7000       3.2100     3.3325               3.3
Temperature       6.200     11.700        8.100      9.000              10.7
Discharge          0.04     0.5650       0.1010     0.3282              70.0
Acidity           42.00     151.00        67.75     104.00              26.1
Total Iron       0.3000     2.0000       0.9000     1.2000              25.9
Ferrous Iron     0.0000     1.5000       0.2000     0.4000              64.8
SO4               85.00     262.00       134.00     211.25              25.9
Ca                38.00      90.00        49.00      69.00              19.5
Mg                38.00     142.00        55.00      89.00              31.2
Mn                1.540      6.900        2.040      4.350              41.7
Al                0.700      9.440        3.400      6.960              43.6
Table 4.1c:    Summary Statistics of Arnot 004 Data

                                                  Trimmed    Standard    Standard Error
Variable          N    N*      Mean     Median       Mean   Deviation       of the Mean
pH               81     0    3.2794     3.2800     3.2675      0.1409            0.0157
Temperature      67    14     8.466      8.600      8.487       0.906             0.111
Discharge        81     0    0.5307     0.4030     0.4887      0.4143            0.0460
Acidity          81     0     96.99      96.00      95.85       26.61              2.96
Total Iron       81     0    1.2630      1.200      1.243       0.418            0.0464
Ferrous Iron     81     0    0.4198     0.3000     0.3973      0.2638            0.0293
SO4              80     1    171.80     166.50     170.79       39.04              4.36
Ca               75     6    54.293     54.000     54.164       8.022             0.926
Mg               75     6     67.68      65.00      67.27       18.52              2.14
Mn               75     5     2.714      2.445      2.637       0.979             0.112
Al               73     8     6.453      5.900      6.317       2.590             0.303
Log Discharge    81     0   -0.3954    -0.3947    -0.4024      0.3266            0.0363
Ferric Iron      81     0     0.843      0.800      0.845       0.382             0.043

                                           First      Third    Coefficient of
Variable         Minimum    Maximum     Quartile   Quartile         Variation
pH                3.0000     3.9400       3.1900     3.3350               4.3
Temperature        6.100     10.700        8.100      9.000              10.7
Discharge         0.1220     1.8380       0.2090     0.7365              78.1
Acidity            62.00     168.00        73.00     121.00              27.4
Total Iron         0.600        2.8        0.900      1.500              33.1
Ferrous Iron      0.0000        1.4       0.2500     0.5000              62.8
SO4                86.00     268.00       143.00     200.00              22.7
Ca                39.000     79.000       49.000     60.000              14.8
Mg                 17.00     110.00        54.00      75.00              27.4
Mn                 1.200      6.500        1.987      3.247              36.1
Al                 0.710     13.560        4.325      8.350              40.1
Log Discharge    -0.9136     0.2643      -0.6799    -0.1330              82.6
Ferric Iron        0.000      1.700        0.600      1.100               5.1
Table 4.2:     Coefficient of Variation (%)

Variable           Arnot 001    Arnot 003    Arnot 004
pH                    4.5          3.3          4.3
Temperature          15.1         10.7         10.7
Flow                112.0         70.0         78.1
Log (Discharge)        -            -          82.6
Acid                 56.2         26.1         27.4
Alkalinity           84.8           -            -
Total Iron           35.9         25.9         33.1
Ferrous Iron         67.1         64.8         62.8
Ferric Iron            -            -           5.1
SO4                  25.4         25.9         22.7
Ca                   20.8         19.5         14.8
Mg                   28.9         31.2         27.4
Mn                   39.0         41.7         36.1
Al                   68.9         43.6         40.1

The same CV order of magnitude is maintained by each variable in each of the three sources.
Log discharge does nothing to reduce the relative variation (CV) as can be seen from the value
for Arnot 004 (82.6%). Discharge is highest in Arnot 001, moderate in 004, and lowest in 003.
The coefficient of variation reflects this order and suggests that this parameter varies in
proportion to its absolute value (heteroscedastic), again reinforcing that the appropriate
transformation is to logarithms.

The majority of the variables in the histogram-like displays of data from Arnot 001 are
symmetrical, such as sulfate shown in Figure 4.3. The most asymmetric is discharge which is
seen in Figure 4.4.  When this variable is transformed to logarithms it becomes symmetrical.
Figure 4.3:   Stem-and-leaf of Sulfate (Arnot 001)
       N = 81
       Leaf Unit = 10
        1        0       6
        5        0       9999
        9        1       0011
       17        1       22223333
       30        1       4444444555555
      (13)       1       6666677777777
       38        1       88888888888889999
       21        2       000001111
       12        2       2233
        8        2       4445555
        1        2       7

Figure 4.4:   Stem-and-leaf of Discharge (Arnot 001)
       N = 81
       Leaf Unit = 0.10
       40        0       0111100000222222222222222223333344444444
       (24)       0       555555555566666777888899
       17        1       0123334
       10        1       69
       8         2       13
       6         2       69
       4         3       014
       1         3
       1         4
       1         4
       1         5       0

The Arnot 003 and 004 data are substantially similar to those of Arnot 001.  The histogram of pH
data for the Arnot 003 discharge is very symmetrical, as shown in Figure 4.5, as is the histogram
of sulfate data for the Arnot 004 discharge shown in Figure 4.6. Flow measurement data of the
Arnot 004 discharge are asymmetric and positively skewed, as shown in Figure 4.7.
Figure 4.5: Stem-and-leaf of pH (Arnot 003)
       N = 82
       Leaf Unit = 0.010
        1       30       4
        2       30       7
        5       31       134
       17       31       555667888999
       36       32       0011111112234444444
      (15)      32       555666777789999
       31       33       001111222234
       19       33       556666778
       10       34       1122
        6       34       679
        3       35       0
        2       35       7
        1       36
        1       36
        1       37       0

Figure 4.6:   Stem-and-leaf of Sulfate (Arnot 004)
       N = 80    N* = 1
       Leaf Unit = 10
        1        0       8
        3        1       00
       16        1       2222333333333
       33        1       44444444444555555
      (18)       1       666666666666667777
       29        1       88888999
       21        2       000000111
       12        2       2222233
        5        2       455
        2        2       66

Figure 4.7:   Stem-and-leaf of Acidity (Arnot 004)
            N = 81
            Leaf Unit = 1.0
            13         6        2444456677799
            32         7        0011223333445677899
            38         8        001238
            (6)         9        256778
            37         10       0000337
            30         11        112245558
            21         12       112445678
            12         13       01123677
            4         14       05
            2         15       2
            1         16       8

Bivariate Analysis

The relationships between log discharge and the other parameters are similar (i.e., inverse and
approximately linear).  That is, as discharge increases in volume, the concentration of each
variable (calcium, magnesium, manganese, and aluminum) decreases; in high flows the
concentrations are diluted.  A good example of this relationship is the plot of manganese versus
flow for the Arnot 001 discharge, shown in Figure 4.8.

Figure 4.8:   Plot of Manganese vs. Log Flow (Arnot 001)
[Scatter plot: manganese (MN), with axis marks at 2.0, 3.0, and 4.0, against log flow, with axis
marks from -2.00 to 0.50.]

Figure 4.9:   Plot of Acidity vs. Flow (Arnot 003)
[Scatter plot (MTB > PLOT C4 VS C3): acidity (ACID), with axis marks at 35 and 140, against
discharge (DISCH, cfs), with axis marks from 0.00 to 0.60.]

Figure 4.10:  Plot of Manganese vs. Flow (Arnot 003)
[Scatter plot (MTB > PLOT C10 VS C3): manganese (MN), with axis marks at 1.5 and 6.0, against
discharge (DISCH, cfs), with axis marks from 0.00 to 0.60. N* = 5.]

For Arnot 003, acidity vs. discharge possesses a clear, curvilinear association (Figure 4.9), which
would become inversely linear if discharge were expressed in logs. Cross-correlation of these
variables had a maximum association of -0.648 at zero lag; that is, about 42% of the variation
(r² = 0.42) is common to both variables.  Sulfate, manganese and aluminum vs. discharge also
showed this same curvilinear association of dilution with increasing flow. The example of
manganese is seen in Figure 4.10.
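
The effect of the log transformation on a curvilinear dilution relationship can be illustrated
with a small correlation calculation. The sketch below is a hedged illustration, not the analysis
used in this report: the function name `pearson_r` and the flow and acidity values are
hypothetical, constructed so that acidity falls linearly with the logarithm of flow. The last
line simply squares the reported cross-correlation to show where the figure of 42% comes from.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: acidity declines linearly with log10(flow).
flow = [0.05, 0.10, 0.20, 0.40, 0.60]
acidity = [60.0 - 40.0 * math.log10(q) for q in flow]

r_raw = pearson_r(flow, acidity)                           # curved relationship
r_log = pearson_r([math.log10(q) for q in flow], acidity)  # straight line

print(f"r with raw flow: {r_raw:.3f}")
print(f"r with log flow: {r_log:.3f}")

# Shared variation implied by the reported correlation of -0.648.
print(f"r squared: {(-0.648) ** 2:.3f}")
```

In this constructed example the correlation with log flow is essentially -1, while the
correlation with raw flow is weaker because a straight line fits the curve less well.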

A plot of sulfate versus acidity from the Arnot 004 discharge data showed the expected positive
association but again the scatter around a straight line is very large.  The expected association of
calcium and magnesium is extremely weak. Any relationship between manganese and total iron
is obscured by an extreme value in iron. It seems somewhat strange that the data from Arnot
004, which is located between Arnot 001 and 003, should present such a confused picture of
these bivariate relationships relative to those of Arnot 001 and 003 data; possibly  Arnot 004
contains more outliers than 001 or 003.

Time Series  Analysis

A qualitative time series analysis was performed by plotting successive variables against (equal
interval) time periods. It is convenient to start with the variable discharge (flow) for Arnot 003
(Figure 4.11a), which may be compared with the same plot on a much larger scale (Figure 4.2).
The four maxima (peaks) are quite striking in both graphs.  Since the date of the first observation
is January 28, 1980, the first peak is in March (1980), marked in the graph by the number 3; the
numbers in Figure 4.11a go from 1 to 10 (=0) and then start at 1 again and so on for each cycle
of 10. The next peak is at 22 (March, 1981), followed closely by another at 26 (May, 1981).
Subsequent peaks occur at 43 (March, 1982), 48 (June, 1982) and then 73 (April, May, 1983).
Suppose there existed an annual cycle (i.e., 26 observations, one every two weeks); then, starting
with March = 3, the next peak should be at 29, then 55, 81, etc.  Missing observations (see Figure
4.2) and peak discharges at varying intervals, not equal annual cycles, make a seasonal pattern
obscure.
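
The comparison between the expected annual peaks and the observed peaks can be written out
explicitly. The sketch below uses only the observation numbers quoted in the paragraph above;
the variable names are assumptions made for the illustration.

```python
OBS_PER_YEAR = 26   # one observation every two weeks
FIRST_PEAK = 3      # March 1980

# Peak positions expected under a strict annual cycle.
expected = [FIRST_PEAK + OBS_PER_YEAR * k for k in range(4)]

# Peak positions read from the discharge plot for Arnot 003.
observed = [3, 22, 26, 43, 48, 73]

# Distance (in observations) from each observed peak to the nearest expected one.
offsets = {p: min(abs(p - e) for e in expected) for p in observed}

print("expected:", expected)
for peak, gap in offsets.items():
    print(f"observed peak at {peak}: {gap} observations from nearest expected peak")
```

Apart from the first peak, which defines the cycle, every observed peak lies several
observations away from the nearest expected position, which is the irregularity the text
describes.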

Using discharge as the base which controls the concentration of acidity, for example, one would
expect pH to also show similar cycles in Figure 4.11b.  Instead, the first peak and the following
double peak are similar to those shown by discharge, but the peak at 40 is not.  There is a peak at
48 in both plots but then the pH declines and stays below its mean throughout the subsequent
series; there is no sign of the discharge peak at 73.  The scatter diagram of pH vs. discharge
showed no relationship.

The relationship between acidity and time (Figure 4.11c) tends to be inversely related to the
relationship between discharge and time, i.e., the peaks of discharge coincide with the minima
(maximum dilution) of acidity. This is supported by the scatter diagram between acidity and
discharge (Figure 4.9).  There is a slight tendency for this to be true of total iron (Figure 4.11d)
but there was no sign of such a relationship in the scatter diagram of iron vs. discharge.

Sulfate, as expected from its scatter plot against discharge, shows inverse relationships in Figure
4.11e, with peaks coinciding with discharge troughs. Calcium, magnesium, manganese, and
aluminum also show this inverse relationship to discharge (see Figure 4.11f of aluminum for
example).

Figure 4.11a: Plot of Discharge vs. Time (Arnot 003)
[Time series plot: discharge, with axis marks from 0.000 to 0.600, against observation number,
10 to 80.]

Figure 4.11b: Plot of pH vs. Time (Arnot 003)
[Time series plot (MTB > TSPLOT C1): pH against observation number.]

Figure 4.11c: Plot of Acidity vs. Time (Arnot 003)
[Time series plot (MTB > TSPLOT C4): acidity (ACID) against observation number.]

Figure 4.11d: Plot of Total Iron vs. Time (Arnot 003)
[Time series plot (MTB > TSPLOT C5): total iron (TFE), with axis marks from 0.00 to 1.50, against
observation number, 10 to 80.]

Figure 4.11e: Plot of Sulfate vs. Time (Arnot 003)
[Time series plot (MTB > TSPLOT C7): sulfate (SO4) against observation number.]

Figure 4.11f: Plot of Aluminum vs. Time (Arnot 003)
[Time series plot (MTB > TSPLOT C11): aluminum (Al) against observation number.]

The time series plots shown in Figures 4.11a to 4.11f can be used as quality control graphs in the
following manner. Confidence limits around the mean are simple to prepare from the descriptive
statistics in Tables 4.1a to 4.1c and these can be inserted in, for example, Figure 4.11c.  Two
kinds of confidence limits are included for comparison. The first is based upon the mean and
standard deviation of the normal frequency distribution.  The second is based upon the median
and other order statistics and is for use in cases where the frequency distribution is not normal
(e.g., skewed) or in other non-parametric applications.  These two kinds of quality control
approaches are discussed in more detail in Chapter 5.

The most typical quality control limit is the conventional range of the mean plus or minus two
standard deviations which, in a normal distribution, includes some 95 percent of the observations
(i.e., in a moderately long series (say > 30) one expects about 2-3 observations outside these
limits on either side of the mean). If we wish to relax the requirement of a normal distribution
we may use the range encompassed by order statistics, for example Md ± 1.58 (H-spr.), where Md
is the median and H-spr. is the spread between the first and third quartiles (Q3 - Q1); this is
approximately equivalent to the conventional measure (Velleman and Hoaglin, 1981, p. 81). The
multiplier (2) in the conventional example may be replaced with 3 for a more stringent test in
which only 3 in 1000 are expected to fall outside the (3σ) limits, strictly in a normal
distribution.

The limits for each of the eleven variables from the Arnot 003 data are displayed in Table 4.3,
including the range around the means and around the medians. The range around the mean exceeds
that around the median in pH, temperature, ferrous iron, and total iron, whereas the range around
the median exceeds that around the mean in the seven other variables.  These seven variables show
associated variation either directly or inversely so this consistency is to be expected.  The reason
for the reversal in relationship for the other four may arise from inconsistent occurrence of
outliers in the data for these variables. pH is usually symmetrical and probably closely normal;
temperature, ferrous iron and total iron have very marked peculiarities.
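
The two kinds of limits can be reproduced from the summary statistics. The following sketch is an
illustration under stated assumptions: the function names `mean_limits`, `median_limits`, and
`outside` are hypothetical, and the input values are those printed for Arnot 003 in Table 4.1b.

```python
def mean_limits(mean, std_dev, k=2.0):
    """Limits of mean plus or minus k standard deviations."""
    return mean - k * std_dev, mean + k * std_dev


def median_limits(median, h_spread, k=1.58):
    """Limits of median plus or minus k times the H-spread (Q3 - Q1)."""
    return median - k * h_spread, median + k * h_spread


def outside(series, lower, upper):
    """Observations that fall below the lower or above the upper limit."""
    return [x for x in series if x < lower or x > upper]


# pH for Arnot 003: mean 3.2782, standard deviation 0.1095.
print("pH, around mean:", mean_limits(3.2782, 0.1095))

# Acidity for Arnot 003: median 84.50, Q1 67.75, Q3 104.00.
print("acidity, around median:", median_limits(84.50, 104.00 - 67.75))

# Hypothetical acidity readings checked against the limits around the mean.
low, high = mean_limits(86.37, 22.55)
print("flagged:", outside([50.0, 86.0, 135.0, 151.0], low, high))
```

Counting the observations returned by `outside` for each variable is one way to tabulate how many
values fall beyond each kind of limit.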
Table 4.3: Comparison of Confidence Belts Around Mean and Median (Arnot 003 Data)

                                                       Around Mean         Around Median           Range
Variable         Mean   Std. Dev.  Median   H-spr.    Lower    Upper     Lower     Upper      Mean    Median
                 (X)      (σ)       (Md)   (Q3-Q1)
pH              3.2782   0.1095    3.265    0.1225    3.059    3.497     3.071     3.459     0.438     0.388
Temperature      8.551    0.916      8.6       0.9    6.719   10.383     7.178    10.022     3.664     2.844
Flow            0.2157   0.1509    0.161    0.2272   -0.086    0.518    -0.198      0.52     0.604     0.718
Acidity          86.37    22.55     84.5     36.25    41.27   131.47    27.225   141.775      90.2    114.55
Total Iron      1.0963   0.2843      1.1       0.3    0.528    1.665     0.626     1.574     1.137     0.948
Ferrous Iron     0.361    0.234      0.3       0.2   -0.107    0.829    -0.016     0.616     0.936     0.632
SO4             168.99    43.79      165     77.25    81.41   256.57    42.945   287.055    175.16    244.11
Ca               59.75    11.69       61        20    36.37    83.13      29.4      92.6     46.76      63.2
Mg                73.6       23       70        34     27.6    119.6     16.28    123.72        92    107.44
Mn               3.203    1.338     2.76      2.31    0.527    5.879     -0.89      6.41     5.352       7.3
Al               5.079    2.213     4.68      3.56    0.653    9.505    -0.945    10.305     8.852     11.25

The mean, median, and their associated ranges are included in Figures 4.11a to 4.11f.  The means
and medians are reasonably close, with the median usually being less than the mean. This
suggests that the outliers are on the large side (i.e., positive skewness) and are pulling the mean
up more than the median. The seven variables which show associated variation should probably
all be log transformed.  The pH is already in log units, but temperature and the iron variables are
not, on the whole, consistent enough to make any general recommendation.  Total iron or any
combination of these should be carefully checked because their variation is open to a variety of
problematic explanations, and until one can be sure that these measures are meaningful, they
should be treated with circumspection.

From the point of view of setting up triggers, either of the ranges around the mean or median
would suffice. If the confidence belts were constructed around the mean, then for the Arnot 003
data, the following observations fall on, near or totally outside them, as shown in Table 4.4.
Apparently the 2 sigma limits are more sensitive to these deviations and the H-spread usually
shows fewer observations outside the limits. Since the 2 sigma limits are about 95% confidence
limits, about 2.5 percent of the observations (roughly 2 of the 82 here) are expected to exceed
the upper limit. Three, therefore, is an expected number and needs no reaction. The iron
observations are again somewhat inconsistent.

Table 4.4:    Observations Falling Beyond Confidence Limits of 2 Standard Deviations
             Around the Mean and Beyond the (1.58*) H-Spread (Arnot 003 Data)

                      Number of Observations
Variable           >2σ        >(1.58) H-Spread
pH                  4                8
Temperature         3                7
Discharge           3                3
Acid                3                1
Total Iron          6                8
Ferrous Iron        2                8
SO4                 1                0
Ca                  2                0
Mg                  3                2
Mn                  5                0
Al                  0                0

The approach to Box-Jenkins Time Series analysis may be simplified to accomplish preliminary
exploration of the data. We may, therefore, examine the autocorrelation function (Acf) and the
partial autocorrelation function (Pacf) of the data and, if necessary, of their first differences.
From this analysis it can be decided whether the data appear to represent the Integrated Moving
Average (IMA) (0,1,1) model described in Chapter 3, or whether a new model should be fitted.

In general, if the autocorrelation function (Acf) looks more or less J-shaped (e.g., Figure 4.12a
for Arnot 001 discharge data), it is close enough to the model already described to need no further
analysis. If it is subsequently decided to pursue the analysis to model fitting, then the full
Box-Jenkins procedures described in Chapter 3 should be undertaken.


For the Arnot 001 data, the Acfs for discharge (Figure 4.12a), calcium (Figure 4.12b), and
aluminum (Figure 4.12c) all conform to the J-shape and are considered to be adequately modeled
by an IMA (0,1,1) model. The total iron (Figure 4.12d) and ferrous iron graphs do not show this
form of Acf, so they would require a more formal analysis. From these Acfs, however, it is
suspected that a simple Moving Average (MA) (0,0,1) model would be adequate to represent these
data. In other words, the data appear to represent essentially random variation about a constant
level, with little dependence beyond the first lag.
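
For readers who wish to reproduce an Acf outside Minitab, the sample autocorrelation at each lag
can be computed directly. The sketch below is a minimal illustration using the usual sample
estimator (lagged cross-products of deviations from the mean, divided by the sum of squared
deviations); the function name `acf` and the two short series are hypothetical, and the estimator
is assumed, not confirmed, to match the one used for the figures in this chapter.

```python
def acf(series, max_lag):
    """Sample autocorrelations r_1 ... r_max_lag of a series."""
    n = len(series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    m = sum(series) / n
    dev = [x - m for x in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]


# Hypothetical slowly varying series: neighboring values are alike.
smooth = [1.0, 1.2, 1.5, 1.9, 2.4, 2.2, 1.8, 1.3, 1.0, 0.9]
# Hypothetical alternating series: neighboring values are unlike.
jagged = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]

print("smooth:", [round(r, 3) for r in acf(smooth, 3)])
print("jagged:", [round(r, 3) for r in acf(jagged, 3)])
```

A series whose autocorrelations start high and decay gradually resembles the discharge, calcium,
and aluminum patterns, while one whose autocorrelations are small at every lag resembles the iron
variables.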
Figure 4.12a: Autocorrelation Function of Discharge (Arnot 001)

  MTB > ACF C1
  (Minitab listing; the character bar chart is not reproduced. A "?" marks a digit that is
  illegible in the source.)

   Lag      ACF
     1     0.69?
     2     0.431
     3     0.211
     4     0.078
     5     0.050
     6    -0.044
     7    -0.107
     8    -0.164
     9    -0.188
    10    -0.182
    11    -0.188
    12    -0.1?1
    13    -0.186
    14    -0.186
    15    -0.177
    16    -0.142
    17    -0.117
    18    -0.10?

Figure 4.12b: Autocorrelation Function of Calcium (Arnot 001)
  ACF of CA
  (Minitab listing; the character bar chart is not reproduced.)

   Lag      ACF
     1     0.678
     2     0.448
     3     0.363
     4     0.178
     5    -0.005
     6    -0.171
     7    -0.312
     8    -0.335
     9    -0.320
    10    -0.343
    11    -0.349
    12    -0.254
    13    -0.182
    14    -0.120
    15    -0.002
    16     0.036
    17     0.059
    18     0.145

Figure 4.12c: Autocorrelation Function of Aluminum (Arnot 001)
  ACF of AL
  (Minitab listing; the character bar chart is not reproduced. A "?" marks a digit that is
  illegible in the source.)

   Lag      ACF
     1     0.5?4
     2     0.478
     3     0.381
     4     0.300
     5     0.164
     6     0.12?
     7    -0.062
     8    -0.213
     9    -0.323
    10    -0.286
    11    -0.360
    12    -0.312
    13    -0.272
    14    -0.304
    15    -0.184
    16    -0.110
    17    -0.115
    18    -0.061

Figure 4.12d: Autocorrelation Function of Total Iron (Arnot 001)
  ACF of TFE
  (Minitab listing; the character bar chart is not reproduced. A "?" marks a digit that is
  illegible in the source.)

   Lag      ACF
     1     0.235
     2     0.020
     3     [illegible]
     4     0.142
     5     0.0?6
     6     0.120
     7     0.071
     8     0.073
     9     0.099
    10     0.004
    11    -0.215
    12    -0.188
    13    -0.024
    14    -0.145
    15    -0.023
    16    -0.023
    17    -0.189
    18    -0.169

To check these conclusions, the discharge parameter was run through the full Box-Jenkins
autocorrelation function analysis and, as in Chapter 3, a first difference was required to reduce the
Acf to that expected from white noise. An autoregressive integrated (ARI) (1,1,0) model was
fitted for diagnostic purposes, and while most criteria were satisfactory, the confidence belts
around the coefficient of the differenced series included zero.  For that reason, this model was
rejected and the IMA (0,1,1) appears most appropriate. This analysis of Arnot 001 data was then
terminated.
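
The first difference referred to above is simply the change from one observation to the next.
The sketch below shows the operation and its inverse; the function names and the log discharge
values are hypothetical and serve only to illustrate the step, not to reproduce the fitted model.

```python
def first_difference(series):
    """Return w_t = z_t - z_(t-1) for t = 2 ... n (one value shorter)."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(first_value, differences):
    """Rebuild the original levels from the first value and the differences."""
    levels = [first_value]
    for d in differences:
        levels.append(levels[-1] + d)
    return levels


# Hypothetical log discharge values, scaled to integers for exact arithmetic.
levels = [-40, -35, -10, 5, -5, -20, -30, -45]
changes = first_difference(levels)

print("changes:", changes)
print("rebuilt:", undifference(levels[0], changes))
```

The Acf of the differenced series (here, `changes`) is what is compared with the pattern expected
from white noise.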

Arnot 003 data yielded similar results and the Acfs of discharge and log discharge were almost
identical.  Acfs for calcium, magnesium, manganese, and aluminum were similar in form; total
iron and ferrous iron are peculiar and probably representative of random variation. A comparison
of the standard deviations of the raw data from Table 4.1b and the residuals after fitting the model
is illustrated in Table 4.5.  There is little improvement from fitting the models, further confirming
that the variation in these data is essentially random.


Table 4.5:    Comparison of Total Iron and Ferrous Iron
Variable
Total Iron
Ferrous
yv.

Figure 4.13a: Autocorrelation Function of pH (Arnot 004)
  MTB > ACF C1
  ACF of PH
  (Minitab listing; the character bar chart is not reproduced. A "?" marks a digit that is
  illegible in the source.)

   Lag      ACF
     1     0.108
     2     0.003
     3     0.05?
     4     0.093
     5    -0.05?
     6    -0.170
     7    -0.080
     8    -0.03?
     9    -0.143
    10    -0.215
    11    -0.143
    12     0.185
    13    -0.165
    14    -0.06?
    15    -0.008
    16    -0.0?8
    17    -0.128
    18    -0.???
    19     0.094

Figure 4.13b: Partial Autocorrelation Function of pH (Arnot 004)
  PACF of PH
  (Minitab listing; the character bar chart is not reproduced. A "?" marks a digit that is
  illegible in the source.)

   Lag     PACF
     1     0.106
     2    -0.00?
     3     0.052
     4     0.0??
     5    -0.078
     6    -0.161
     7    -0.058
     8    -0.020
     9    -0.117
    10    -0.1??
    11    -0.131
    12     [illegible]
    13    -0.209
    14    -0.028
    15    -0.070
    16    -0.175
    17    -0.165
    18    -0.0??
    19     0.008

Figure 4.13c: Autocorrelation Function of Log Discharge (Arnot 004)







  ACF of LGDIS
  (Minitab listing; the character bar chart is not reproduced. A "?" marks a digit that is
  illegible in the source.)

   Lag      ACF
     1     0.889
     2     0.724
     3     0.527
     4     0.323
     5     0.112
     6    -0.075
     7    -0.219
     8    -0.358
     9    -0.442
    10    -0.488
    11    -0.488
    12    -0.478
    13    -0.417
    14    -0.319
    15    -0.222
    16    -0.1?2
    17    -0.013
    18     0.046
    19     0.092

Figure 4.13d: Partial Autocorrelation Function of Log Discharge (Arnot 004)
  MTB > PACF C12
  PACF of LGDIS
  (Minitab listing; the character bar chart is not reproduced. A "?" marks a digit that is
  illegible in the source.)

   Lag     PACF
     1     0.889
     2    -0.314
     3    -0.206
     4    -0.11?
     5    -0.187
     6    -0.051
     7     0.010
     8    -0.262
     9     0.073
    10    -0.065
    11    -0.022
    12    -0.144
    13     0.119
    14     0.014
    15    -0.077
    16     0.040
    17    -0.051
    18    -0.243
    19     0.170

Figure 4.13e: Autocorrelation Function of Ferric Iron (Arnot 004)
           -1.0 -0.8 -0.6  -0.4  -0.2  0.0   0.2   0.4  0.6  0.8   1.0
             +	+	+	+	+	+	+	+	+	+	+
  1    0.556                           XXXXXXXXXXXXXXX
  2    0.324                           XXXXXXXXX
  3    0.310                           XXXXXXXXX
  4    0.249                           XXXXXXX
  5    0.139                           XXXX
  6   -0.026                          XX
  7   -0.075                         XXX
  8   -0.025                          XX
  9    0.057                           XX
 10   -0.001                           X
 11   -0.021                          XX
 12   -0.069                         XXX
 13    0.017                           X
 14    0.001                           X
 15   -0.120                        XXXX
 16   -0.105                        XXXX
 17   -0.105                        XXXX
 18   -0.120                        XXXX
 19   -0.100                         XXX

Figure 4.13f: Partial Autocorrelation Function of Ferric Iron (Arnot 004)

           -1.0  -0.8 -0.6  -0.4  -0.2  0.0   0.2  0.4   0.6  0.8   1.0
  1    0.556                           XXXXXXXXXXXXXXX
  2    0.023                           XX
  3    0.176                           XXXXX
  4    0.012                           X
  5   -0.051                          XX
  6   -0.181                      XXXXXX
  7   -0.040                          XX
  8    0.057                           XX
  9    0.154                           XXXXX
 10   -0.048                          XX
 11    0.007                           X
 12   -0.163                       XXXXX
 13    0.104                           XXXX
 14   -0.069                         XXX
 15   -0.073                         XXX
 16    0.021                           XX
 17   -0.055                          XX
 18   -0.066                         XXX
 19    0.046                           XX

Ferric iron shows similar patterns to log discharge, suggesting an IMA (0, 1, 1) model. This is
similar to some of the measures of iron content in Arnot 001 and Arnot 003.

Summary

One of the most interesting features in the time series analyses of the Arnot site is the lack of
obvious seasonal patterns. Based upon this data set, it appears that this arises for the
following reasons:

•      The peak flow occurs during Spring snow-melt and runoff. This varies over several
       months, from February to April, so that successive maxima may not occur at the same
       time each year.

•      Another peak flow may occur in early summer as the result of intense short duration
       storms. Again this is not strictly confined to exactly the same period from year to year.

•      If the missing values occur during these events, and they often appear to be so related,
       then the extreme values do not occur in a uniform cycle; this confuses any seasonal
       pattern which may be present.

Chapter 5: Analysis of Data from the Clarion Site

The Clarion mine site is located in northern Clarion County, Pennsylvania, in the northwestern
portion of the bituminous coal region. The acid mine discharge (S3CLAR) was from an
abandoned surface mine on the Upper and Lower Clarion coal seams that was the site of
abandoned mine reclamation and a cooperative research project by the Bureau of Abandoned
Mine Reclamation of the Pennsylvania Department of Environmental Resources and the U.S.
Bureau of Mines. During this project, alkaline addition (in the form of crushed limestone) was
incorporated into the reclamation procedures in an attempt to reduce the acid mine drainage
pollution, as described by Lusardi and Erickson (1985). A map of the Clarion site is shown in
Figure 5.1, which is adapted from Lusardi and Erickson (1985).

The data set used for most of the statistical analysis of the Clarion site contains 96 samples for
the time period from December 15, 1981 to August 4, 1986. Of these data, approximately half
(N = 49) are pre-treatment, and the other half (N = 47)  are post-treatment with the crushed
limestone, alkaline-addition reclamation procedure. Missing data presented a problem in  the
statistical analysis of the discharge and water quality characteristics.

Figure 5.1:   Map of Clarion Mine Site
     [Map: Enlarged Portion of the Fryburg, Lucinda, Tionesta & Tylersburg USGS 7.5 Minute Quadrangles]

Univariate Analysis

Initially, there were 104 observations in this data set. However, at least 19 samples were missing
flow measurements, and others were missing one or more water quality parameters. Discharge
and pH were plotted against date to see where the largest gaps occurred.  After careful
examination, the data set was reduced to 96 rows containing seven columns:  pH, discharge,
acidity, total iron, sulfate, ferrous iron, and ferric iron. Ferric iron is determined by subtracting
ferrous iron from total iron. The statistics describing the variables were derived and mean values
were inserted in rows with missing values. The data were then rerun to yield Table 5.1.
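
The two preparation steps described above (ferric iron by difference, and mean values inserted for missing entries) can be sketched as follows. This is a minimal illustration only: the function names and the sample values are hypothetical and are not taken from the S3CLAR data set or from the original Minitab session.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def ferric_iron(total_iron, ferrous_iron):
    """Ferric iron by difference: total iron minus ferrous iron.

    Analytical error in either measurement can make the difference negative.
    """
    return [t - f for t, f in zip(total_iron, ferrous_iron)]


# Hypothetical illustrative values (mg/L), not from the report.
total = [82.0, None, 75.0, 110.0]
ferrous = [48.0, 40.0, None, 66.0]

total_filled = fill_missing_with_mean(total)
ferrous_filled = fill_missing_with_mean(ferrous)
ferric = ferric_iron(total_filled, ferrous_filled)
```

Because ferric iron is computed rather than measured, a slightly negative value (such as the minimum of -4 in Table 5.1) can arise whenever the ferrous iron result exceeds the total iron result for the same sample.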

Table 5.1:     Summary Statistics for S3CLAR (N=96)

                                                Trimmed    Standard    Standard Error
Variable          N    N*      Mean    Median      Mean   Deviation       of the Mean
pH               96     0     3.696     3.195     3.612       0.985             0.101
Discharge        79    17     12.57      6.70      9.09       22.37              2.52
Acidity          96     0     522.4     483.5     505.6       346.4              35.4
Total Iron       96     0     82.40     75.00     79.31       51.01              5.21
Ferrous Iron     96     0     48.38     37.75     45.97       34.46              3.52
SO4              96     0    1528.4    1569.0    1525.9       566.0              57.8
Ferric Iron      96     0     34.02     23.60     31.44       32.05               3.3

                                          First      Third    Coefficient
Variable        Minimum    Maximum     Quartile   Quartile   of Variation
pH                2.670      6.430        3.002      4.455           26.6
Discharge          0.05     172.00         3.60      12.48          178.0
Acidity             1.0     1546.0        232.5      737.7           66.3
Total Iron         8.70     257.00        39.70     110.25           61.9
Ferrous Iron       0.90     143.00        23.87      66.65           71.2
SO4               296.0     3241.0       1181.5     1878.2           37.0
Ferric Iron          -4     152.00         6.00      55.00           94.2

The very high magnitude of variation in discharge is shown by its Coefficient of Variation
(CV% = standard deviation/mean × 100), which is 178.0% in Table 5.1.  The coefficients of variation for
pH and sulfate are reasonable.  However, the CV% values for acidity and all iron parameters are rather
large.  Correction for some exceptional values is proposed when the variables take values which
are either very unlikely or even sometimes impossible.
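
As a check on the arithmetic, the coefficient of variation can be recomputed from the means and standard deviations in Table 5.1. The sketch below is illustrative; the dictionary layout and function name are assumptions, and only the numbers come from the table. Small differences from the tabulated CV% are expected because the tabulated means and standard deviations are themselves rounded.

```python
def coefficient_of_variation(std_dev, mean):
    """CV% = standard deviation / mean * 100."""
    return 100.0 * std_dev / mean


# (mean, standard deviation) pairs as printed in Table 5.1.
table_5_1 = {
    "pH": (3.696, 0.985),
    "Discharge": (12.57, 22.37),
    "Acidity": (522.4, 346.4),
    "Total Iron": (82.40, 51.01),
    "Ferrous Iron": (48.38, 34.46),
    "SO4": (1528.4, 566.0),
    "Ferric Iron": (34.02, 32.05),
}

cv_percent = {
    name: coefficient_of_variation(sd, m) for name, (m, sd) in table_5_1.items()
}

for name, cv in cv_percent.items():
    print(f"{name:13s} {cv:6.1f}")
```

Computed this way, discharge comes out near 178%, far above pH and sulfate, which is the contrast the paragraph above draws.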

The frequency distribution for sulfate appears to be  symmetrical (Figure 5.2). However, some
variables exhibit positive skewness including discharge (Figure 5.3a) and acidity (Figure 5.4a).
Attempts were made to make these frequency distributions more symmetrical by transforming to
logarithms, but this transformation over-corrected and resulted in negative  skewness.
For example, log discharge (Figure 5.3b) is slightly negatively skewed and log acidity is
extremely skewed (Figure 5.4b).  It was decided, therefore, to use the data without
transformation.

Figure 5.2:   Stem-and-leaf of Sulfate
          N = 96
          Leaf Unit = 100
          4         0        2333
          6         0        45
          10        0        677
          17        0        8889999
          26        1        000011111
          38        1        222223333333
          (12)       1        444445555555
          46        1        66666666677777777
          29        1        8888889999
          19        2        000011111
          10        2        22223
          5         2        44
          3         2        6
          2         2        8
          1         3
          1         3        2

Figure 5.3a:  Stem-and-leaf of Discharge
          N = 77    N* = 19
          Leaf Unit = 1.0
          (53)      0        00000000001111222333333344444555555556666667788999999
          24        1        00222244444
          13        2        001288
          7         3        066
          4         4        0
          3         5        0
          2         6
          2         7
          2         8        3
          1         9
          1        10
          1        11
          1        12
          1        13
          1        14
          1        15
          1        16
          1        17        2

Figure 5.3b:  Stem-and-leaf of Log Discharge
          N = 75    N* = 22
          Leaf Unit = 0.10
          4        -1        3000
          7        -0        766
          11       -0        4330
          19        0        01124444
          (35)      0        55555566666777777777788888899999999
          21        1        000111111333344
          6         1        55679
          1         2        2

Figure 5.4a:  Stem-and-leaf of Acid
          N = 96
          Leaf Unit = 10
          8         0        00114568
          21        1        0234466667788
          30        2        033356899
          38        3        01245778
          (14)      4        01114456788899
          44        5        0112778899
          34        6        03344678
          26        7        014499
          20        8        15
          18        9        1455568899
          8        10        38
          6        11        09
          4        12        04
          2        13        8
          1        14
          1        15        4

Figure 5.4b:  Stem-and-leaf of Log Acid
          N = 97
          Leaf Unit = 0.10
          1        -1        0
          1        -0
          2        -0        0
          3         0        3
          3         0
          5         1        22
          9         1        6679
          31        2        0011122222222333444444
          (58)      2        5555555566666666666666777777777777888888888889999999999999
          8         3        00000011

Due to the 19 missing discharge values, a second modified data set was prepared by omitting
each row with a missing value for discharge; this left 79 rows with complete observations. This
step was essential to the study of the association between discharge and other parameters using
plotting routines or cross-correlation. Summary statistics for the modified data set are presented
in Table 5.2.

Table 5.2:     Summary Statistics for S3CLAR Adjusted Data Deck (N=79)

                                          Trimmed    Standard    Standard Error
Variable          N      Mean    Median      Mean   Deviation       of the Mean
pH               79     3.624     3.160     3.533       0.967             0.109
Discharge        79     12.57      6.70      9.09       22.37              2.52
Acidity          79     556.1     499.0     542.1       361.2              40.6
Total Iron       79     86.70     78.50     83.80       53.45              6.01
Ferrous Iron     79     51.23     40.00     49.01       35.92              4.04
SO4              79    1586.3    1619.0    1578.6       559.6              63.0
Ferric Iron      79     35.47     26.00     32.89       33.80              3.80

                                          First      Third    Coefficient
Variable        Minimum    Maximum     Quartile   Quartile   of Variation
pH                2.670      6.430        2.950      4.050           26.7
Discharge          0.05     172.00         3.60      12.48          178.0
Acidity             1.0     1546.0        234.0      819.0           65.0
Total Iron         8.70     257.00        43.00     118.00           61.6
Ferrous Iron       0.90     143.00        25.50      73.50           70.1
SO4               364.0     3241.0       1193.0     1948.0           35.3
Ferric Iron       -4.00     152.00         6.00      55.20           95.3

In this adjusted data set (N=79), the coefficients of variation for pH, discharge, acidity, total
iron, and sulfate remain close to the same values even after adjustment.  CV% for ferrous iron
and ferric iron were greatly reduced.  The frequency distributions showed similar positive
skewness except for sulfate which appeared essentially symmetrical.

Bivariate Analysis

In order to have a measure of the degree of linear association among pairs of variables and to
ensure that any relationship is not obscured by time lags, cross-correlation functions for each
parameter were obtained (Figure 5.5 for N=79 observations). Discharge and pH showed their
maximum degree of association at zero lag (0.357, Figure 5.5a) and it is suspected that the value
of this relationship is inflated by the one exceptional value. In other words, it is doubtful that
these data could be used to substantiate any real degree of association. The maximum degree of
association between acidity and discharge is -0.3 (Figure 5.5b); the direction of association
(increase of acidity with decrease of discharge) is likely to be correct, but the degree (r² = 9%) is
very small.  Similarly, sulfate and discharge show several values of cross-correlation greater than
0.2 at lags of 0, 7, 8, and -15, so that no real association can be claimed (Figure 5.5c).
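
For readers unfamiliar with the statistic, the sample cross-correlation at lag k can be sketched as below. This is a generic textbook form (overall means and sums of squares in the denominator) written for illustration; it is an assumption that Minitab's CCF command uses the same normalization, and the function name is hypothetical.

```python
from math import sqrt


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag].

    Uses the overall means and sums of squares of both series, so the
    value at lag 0 equals the ordinary correlation coefficient.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    scale = sqrt(sum((v - mean_x) ** 2 for v in x)) * sqrt(
        sum((v - mean_y) ** 2 for v in y)
    )
    if lag >= 0:
        pairs = zip(x[: n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[: n + lag])
    return sum((a - mean_x) * (b - mean_y) for a, b in pairs) / scale


# Hypothetical series: y repeats x one step later.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0, 2.0]
y = [2.0] + x[:-1]
ccf = {k: cross_correlation(x, y, k) for k in range(-3, 4)}
```

In this toy example the largest value falls at lag 1, where y lines up with x; a single extreme observation can inflate the value at any one lag in the same way the text suspects for pH and discharge.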

The use of the cross-correlation function in bivariate and time series analyses is discussed in
Chapters 3 through 9 of this report. In these chapters, an r value of 0.2, or a more conservative
value of 0.3, has been arbitrarily selected as the critical value, with the inference that r values
less than the critical value are not significantly different from 0 and can therefore be dropped
from consideration.  These arbitrary critical values were selected by rounding off the values of r
that are significant at the 5% level for the sample sizes contained in this report (e.g., for this data
set of the Clarion discharge where N = 96, the value of r that is significant at the 5% level with
90 degrees of freedom is 0.205 (Table 22 in Arkin and Colton, 1963, p. 155)).
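
The tabulated critical value can be reproduced from the t distribution through the standard relation r = t / sqrt(t² + df). The sketch below assumes a two-sided test and integrates the t density numerically so that it needs only the standard library; the function names are illustrative, and the report itself cites a printed table rather than this calculation.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Upper quantile (p > 0.5) found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def critical_r(df, alpha=0.05):
    """Smallest |r| significant at level alpha (two-sided)."""
    t = t_quantile(1 - alpha / 2, df)
    return t / sqrt(t * t + df)


r_crit_90 = critical_r(90)
```

With 90 degrees of freedom this gives a value that rounds to 0.205, in agreement with the tabulated figure and with the rounded working threshold of 0.2.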

Figure 5.5a:  Cross Correlation Function for pH and Discharge
     MTB > CCF C1 VS C2
     CCF - correlates PH(t) and DISCH(t+k)
     [Character plot of cross-correlation at lags -18 to 18; horizontal axis -1.0 to 1.0]

Figure 5.5b:  Cross Correlation Function for Acidity and Discharge
     MTB > CCF C3 VS C2
     CCF - correlates ACID(t) and DISCH(t+k)
     [Character plot of cross-correlation at lags -18 to 18; horizontal axis -1.0 to 1.0]

Figure 5.5c:  Cross Correlation Function for Sulfate and Discharge
     MTB > CCF C6 VS C2
     CCF - correlates SO4(t) and DISCH(t+k)
     [Character plot of cross-correlation at lags -18 to 18; horizontal axis -1.0 to 1.0]

By omitting discharge, it is possible to use the data deck of N=96 and again examine
relationships among pairs of variables (see Figure 5.6). The plot of pH against acidity (Figure
5.6a) is curvilinear with pH decreasing rapidly as acidity increases. Below a pH of 3, acidity
still increases but the pH stabilizes. Acidity and sulfate (Figure 5.6b) show positive direct linear
association with a wide scatter of data points.

Figure 5.6a:  Plot of pH vs. Acidity
   MTB > PLOT C1 VS C3
   [Character scatter plot: PH (vertical axis) against ACID (horizontal axis, 300 to 1500)]

Figure 5.6b:  Plot of Acidity vs. Sulfate
   MTB > PLOT C3 VS C6
   [Character scatter plot: ACID (vertical axis, 0 to 1500) against SO4 (horizontal axis, 600 to 3000)]

Figure 5.6c:  Plot of Iron vs. Sulfate
   MTB > PLOT C4 VS C6
   [Character scatter plot: TOTLFE (vertical axis, 0 to 240) against SO4 (horizontal axis, 0 to 3000)]

The importance of these plots is that, despite a wide scatter and tendency to show
heteroscedasticity (See Figure 5.6c, total iron vs. sulfate), the association is similar for all pairs
of variables (i.e., direct linear association).  The comparison of the characteristics of
homoscedasticity and heteroscedasticity in bivariate plots of data is described in Chapters 3, 6
and 9 of this report and shown in Figures 6.5a through 6.5c. When the variation of two variables
increases as their values increase, heteroscedasticity is present, and it is generally advisable to
logarithmically transform these variables, which tends to make the plot of the variables
homoscedastic if the standard deviation increases approximately proportionally to concentration
prior to transformation.

The persistence of these linear relationships allows us to use one, or at most two, variables for
detailed analysis. The conclusions from this analysis may be applied to the other variables. In
most cases, there appear to be relationships that are roughly linear. However, the very wide
scatter of the data makes the associations rather weak.

Time Series Analysis

Under time series analysis, the data are analyzed in three steps. First, the data may be displayed
as graphical plots against time, and quality control limits of two standard deviations (using
results from Tables 5.1 and 5.2) may be inserted to show the unusual departures from the mean
or median. Second, the data for each variable may be submitted to autocorrelation function
analysis (Acf). This permits comparison of variability over time for each parameter and serves
to yield a preliminary identification of suitable models for more complex analysis. Third and
finally,  the data for selected variables are subjected to more complete Box-Jenkins analysis to
identify and compare appropriate time series models.
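
The first step (limits of two standard deviations about the mean) can be sketched as follows. The mean and standard deviation are the pH values from Table 5.1; the short series of observations is hypothetical and only illustrates how departures would be flagged.

```python
def control_limits(mean, std_dev, k=2.0):
    """Lower and upper limits at k standard deviations about the mean."""
    return mean - k * std_dev, mean + k * std_dev


def flag_outside(values, lower, upper):
    """1-based positions of observations falling outside the limits."""
    return [
        i for i, v in enumerate(values, start=1) if v < lower or v > upper
    ]


# pH mean and standard deviation from Table 5.1.
lower, upper = control_limits(3.696, 0.985)

# Hypothetical observations, not from the S3CLAR record.
observations = [3.1, 6.2, 5.9, 2.9]
flagged = flag_outside(observations, lower, upper)
```

Because a single mean and standard deviation are used for the whole record, limits of this kind are sensitive to any change in variability partway through the series.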

The plot of pH versus time is shown in Figure 5.7a. The most striking feature is the large change
in the magnitude of variation after the 50th observation.  It appears as if an entirely different
environment occurred after the 51st observation (June 30, 1984). Because the two parts of the
curve are so different, the two standard deviation limits for the mean underestimate the variation
in the later part of the curve. This leads to six values exceeding the two standard deviation
limits. It would require data representing a much longer period  to determine if this change is a
unique circumstance or a regular occurrence.

Figure 5.7b displays  a plot of acidity versus time and shows a somewhat different pattern, hence
the low degree of association earlier described. Total iron versus time (Figure 5.7c) has a pattern
similar to that of acidity versus time, although there is an extreme peak for acidity at time period
73 and iron has smaller peaks at 77 and 79.  Sulfate versus time (Figure 5.7d) varies in the same
manner as acidity and total iron.

It seems evident that for pH, acidity, total iron, and sulfate there is a break after the 40th
observation (Figures 5.7a through 5.7d) reflecting the effect of lime treatment at that time. In
this series of graphs, it can be observed that the effect of lime treatment was not persistent, and
instead, disappeared  with time.

Figure 5.7a:  Plot of pH vs. Time
     MTB > TSPLOT C1
     [Character time series plot of PH against observation number, 10 to 100]

Figure 5.7b:  Plot of Acidity vs. Time
     MTB > TSPLOT C3
     [Character time series plot of ACID against observation number, 10 to 100]

Figure 5.7c:  Plot of Iron vs. Time
     MTB > TSPLOT C4
     [Character time series plot of TOTLFE against observation number, 10 to 100]

Figure 5.7d:  Plot of Sulfate vs. Time
     MTB > TSPLOT C6
     [Character time series plot of SO4 against observation number, 10 to 100]

Autocorrelation Functions

It should be noted that the autocorrelation function (Acf) may be used to identify the kind of
model which best represents the data for more detailed analysis and curve-fitting. The Acf of pH
(Figure 5.8a) shows a sharp decline with increasing lag, and would probably require a first
difference to remove this "trend." Acfs of acidity (Figure 5.8b) and total iron (Figure 5.8c) are
similar and possess similar implications. Sulfate (Figure 5.8d) shows a much weaker degree of
autocorrelation but is still of the same general form.
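
For reference, the sample autocorrelation at lag k can be computed as in the sketch below. This is the usual textbook estimator (lagged products of deviations from the overall mean, divided by the total sum of squares); it is assumed, not confirmed by the report, that Minitab's ACF command uses the same form. The function name and the example series are illustrative.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]


# Hypothetical trending series: a steady rise gives a slowly decaying Acf.
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
acf_trend = autocorrelation(trend, 3)
```

A series dominated by trend gives large positive values at low lags that fall off gradually, which is the kind of pattern that usually prompts a first difference.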

Figure 5.8a:  Autocorrelation Function of pH
     MTB > ACF C1
     ACF of PH
     [Character plot of autocorrelation at lags 1 to 19; horizontal axis -1.0 to 1.0. Lag 1: 0.815; lag 2: 0.706]

Figure 5.8b:  Autocorrelation Function of Acid

     MTi > ACF  €3

     ACF of ACID

                -1.0 -0.8 -0.6 -0.4 -0.2  0,0  0,2  0.4  0.6   0.8   1,0

       1   0 - 607                           XXXXXXXXXXXXXXXX
       2   0.4S3                           XXXXXXXXXXXXX
       3   0.31?
       4   0,344                           XXXXXXXXXX
       5   0.219'                          XXXXXXXX
       6   0.220
       7   0.204
       8   0.224                           XXXXXXX
       S
      10   0,233                           XXXXXXX
      11   0.1S7                           XXXXXX
      12   0.173
      13   0,20?
      14   0.21T
      15   0.124
      IS                                   XXX
      17   0.115                           XXXX
      18   0.239                           XXXXXXX
      li   0.114


Figure 5.8c:  Autocorrelation Function of Total Iron


     ACF Of TOTLFi

                -1,0 -0.8  -0.6 -0.4 -0.2  0.0  0.2   0.4   0.6  O.i  1.0
                  + -»«-4. _»._»+_»__4,,»_--,^,-,___^.	1	— — •f-	^.___»^,__-»_^
       1
       2   0.487                             XXXXXXXXXXXXX
       3   0.341
       4   0.292
       5   0,304
       S   0.278
       7   0.255                             XXXXXXX
       fl   0.295                             XXXXXXKX
       i   §.333
      10   0,372                             XXXXXXXXXX
      11   0.24?
      12   0,207
      T3   0.187
      14   0.167
      tS   O.OS3                          .   XXX
      16   0.009                             X
    '17                                    XM
      IS  -0.02Q                            XX
      19  -0.018                             X

Figure 5.8d:  Autocorrelation Function of Sulfate


    ACF  of SQ4

                -1.0  -0.8 -0.6 -0,4 -0,2   0,0   0.2   0.4   0.6   0.6   1.0

       1    0,344                              XXXXXXXXXX
       2    0,122
       3    0,150                              XXXXX
       4    0.08S                              XXX
       5    0,093                              XXX
       6    0.033                              XX
       7   -G.Q6S                            XXX
       8   -0.013                              X
       0   -0.02?                             XX
      10    0,108
      11    0.1 IS
      12    0.037                              XX
      13    0.039                              XX
      14    0.055                              XX
      15   -0,062                            XXX
      16   -0.013                              X
      17    0,036                              XX
      18    0.018                              X
      It   -O.OS8                             XX
The Acf of ferrous iron showed no evident pattern and, initially at least, could be considered to
show random variation.  Ferric iron effectively showed no variation. One cannot but suspect that
these variables need careful examination with regard to field measurement and laboratory testing
procedure.

Modeling Selected Variables by Box-Jenkins Time Series Analysis

Three of the variables were chosen for more detailed analysis: pH, sulfate, and ferrous iron. pH
shows, essentially, variation that is similar to sulfate.  Presumably, they should both possess
somewhat similar models. Ferrous iron was included to see if variation was random.

The Acf of first difference for pH gave a chi-square (goodness-of-fit test for the given model) of
40.17 with 25 degrees of freedom yielding a probability of less than 0.05 and greater than 0.02.
The original data gave a chi-square of 285.3 with 25 degrees of freedom (P < 0.001). Taking a
second difference led to an increase in the chi-square value to 70.26 which suggests over-
differencing.
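
Chi-square statistics of this kind are portmanteau tests computed from a set of autocorrelations. The report does not state which variant Minitab applied, so the sketch below simply shows the two common forms, Box-Pierce and Ljung-Box, under that assumption; the function names and the example values are illustrative.

```python
def box_pierce(acf, n):
    """Box-Pierce statistic: Q = n * sum of squared autocorrelations."""
    return n * sum(r * r for r in acf)


def ljung_box(acf, n):
    """Ljung-Box statistic: Q = n (n + 2) * sum of r_k^2 / (n - k)."""
    return n * (n + 2) * sum(
        r * r / (n - k) for k, r in enumerate(acf, start=1)
    )


# Hypothetical autocorrelations at lags 1 and 2 from a series of length 10.
example_acf = [0.5, 0.25]
q_bp = box_pierce(example_acf, 10)
q_lb = ljung_box(example_acf, 10)
```

Either statistic is referred to a chi-square distribution; a large value relative to its degrees of freedom indicates that the autocorrelations, taken together, are larger than a random series would be expected to produce.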

The chi-square of 32.56 with 22 degrees of freedom (df) for Acf of residuals after fitting a one
step autoregressive AR (1,0,0) model gives a 0.10 > P > 0.05 (Table 5.3). This effectively
reduced the Acf, and the accompanying partial autocorrelation functions (Pacf) possessed what
appear to be significant spikes at lags 10 and 19.  These spikes were ignored because, to
conclude that they were reflections of real seasonal effects would require the existence of a
significant spike at low lags (say at lag 5) and this did not occur. Up to period 42, the residuals
are very small.  At period 42, there is a serious departure and residuals show larger fluctuations
from period 50 onwards.

Table 5.3:    Summary of Time Series Models for pH, Clarion Mine

                          Residuals                        Standard Deviation
No.   Model          Chi-sq.    df    P                    Residual    Original
1.    AR(1,0,0)      32.56      22    0.10 > P > 0.05      0.572       0.985
2.    MA(0,1,1)      36.26      23    0.05 > P > 0.02      0.579       0.985

The standard deviations, after fitting either model, are almost the same (Table 5.3) each
representing about a 60% reduction. The relevant equations for the models of the pH variable
are:
       1.      AR:    z_t = 0.821 z_{t-1} + 3.676 + a_t
       2.      MA:    z_t = a_t - 0.247 a_{t-1}

As may be seen in Table 5.4, the two models used for sulfate variation are the AR (1,0,0) and the
moving average, MA (0,0,1) models. The chi-square statistics are similar but the AR (1) yields
an Acf of residuals without any significant spikes.  The MA (1) does not achieve as clean an Acf
of residuals.

The standard deviations of the residuals (Table 5.4) from both models offer only minor reduction
in the original standard deviation of the raw data (< 10%). A comparison of both models as
predictors of future observations is displayed in Figure 5.9.  The projections and the 75%
confidence limits are similar in both models. It is quite clear that both models show the one step
memory and then approximate the overall mean value for the next nine periods.  It is fairly
evident in Figure 5.9 that the expected values from the AR (1) model fluctuate around the
overall mean and fail to duplicate closely the wide swings present in the raw data.  This is
because the model is based on the entire record of 96 observations and the fluctuations are very
large during the first 35 and the last 30 periods (see Figure 5.7d).
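
The "one step memory" can be illustrated with the forecast function of a first-order autoregressive model. The sketch assumes the mean-centered form of the model, in which the h-step forecast is mean + phi^h × (last value − mean). The coefficient 0.348 is the sulfate AR (1) estimate given in this chapter and 1528.4 is the sulfate mean from Table 5.1; the last observed value of 2500 is hypothetical, and the function name is illustrative.

```python
def ar1_forecast(last_value, phi, mean, steps):
    """Forecasts 1..steps ahead for a mean-centered AR(1) model."""
    return [
        mean + phi ** h * (last_value - mean) for h in range(1, steps + 1)
    ]


# phi from the sulfate AR (1) fit; mean from Table 5.1; last value hypothetical.
forecast = ar1_forecast(last_value=2500.0, phi=0.348, mean=1528.4, steps=10)

for h, value in enumerate(forecast, start=1):
    print(f"step {h:2d}: {value:8.1f}")
```

With a coefficient this small, only the first forecast differs noticeably from the mean; by the second or third step the projection is essentially the overall mean, which matches the behavior described above.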

Table 5.4:    Summary Statistics for Time Series Models of SO4 from Clarion Site

                          Residuals                Standard Deviation
No.   Model          Chi-sq.    df    P            Residual    Original
1.    AR(1,0,0)      9.184      22    >0.99        522.3       566.0
2.    MA(0,0,1)      9.204      22    >0.99        535.6       566.0

Figure 5.9:   Projections of Sulfate Data
     [Graph: projections of sulfate from the time series models]

The relevant equations for the models of the sulfate variable are:
       1.      AR(1):    z_t = 0.348 z_{t-1} + 1550.9
       2.      MA(1):    z_t = a_t + 0.345 a_{t-1} + 1526.1

Four models were used in an attempt to find a "best" fit for the ferrous iron variable. The usual
one-step models, AR(1) and MA(1), led to satisfactory results which were very similar (Table
5.5). The autocorrelation functions of the residuals from the two models led to chi-squares of
20.50 and 26.10 respectively. The degrees of freedom were 23 in both cases, and the probability
statements are similar.  Hence, in effect, either of these models is an adequate representation of
the raw data.  The standard deviations of the residuals were close (29.81 and 31.00 respectively).
However, these standard deviations represent very little improvement over the standard deviation
of the raw data (see Table 5.5).
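
The probability statements attached to the residual chi-squares can be checked with a short calculation. The sketch below assumes that the tabulated P is the upper-tail probability of a chi-square variable with the stated degrees of freedom; the function name `chi2_upper_tail` is illustrative and is not taken from the report or from any statistical package.

```python
import math

def chi2_upper_tail(chi_sq, df):
    """Upper-tail probability P(X >= chi_sq) for a chi-square variable with df degrees of freedom."""
    a = df / 2.0
    y = chi_sq / 2.0
    if y <= 0.0:
        return 1.0
    # Series expansion of the regularized lower incomplete gamma function P(a, y).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total) and n < 10000:
        term *= y / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-y + a * math.log(y) - math.lgamma(a))
    return 1.0 - lower

# Chi-square values and degrees of freedom quoted in Tables 5.4 and 5.5.
p_ar1_sulfate = chi2_upper_tail(9.184, 22)   # AR(1), sulfate
p_ar1_ferrous = chi2_upper_tail(20.50, 23)   # AR(1), ferrous iron
print(round(p_ar1_sulfate, 3), round(p_ar1_ferrous, 2))
```

Under this assumption a large P means that residual autocorrelations of this size are entirely consistent with a white noise series, which is the sense in which the models are called adequate.
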

Since there were some irregular spikes at lag 2 in the autocorrelation functions of
the residuals from the first two models, two more complex models, MA(2) and
ARMA(1,1), were applied.  From Table 5.5, it can be seen that chi-square values as
large as these are very likely to arise from a white noise (i.e., random) series.
The standard deviations are close to those of the simpler models, so nothing was gained
by attempting to fit these more elaborate models.

Table 5.5:     Summary of Time Series Models for Ferrous Iron, Clarion Site

                        Residuals
No.   Model             Chi-sq.    df    P
1.    AR(1,0,0)         20.50      23    0.6
2.    MA(0,0,1)         25.10      23
3.    MA(0,0,2)         17.60      22
4.    ARMA(1,0,1)       20.60      22

Quality Control Limits

There is a very large number of methods for defining quality control limits and there are
arguments for and against all of them.  This section of the chapter is an attempt to compare
different limits for the Clarion site data. Unfortunately, the standard deviations and spreads for
the variables in this  data set are very large, and may be atypical.  Also, the probability statements
refer to comparisons of single samples; multiple comparisons using several samples may require
inflation of the control limits or a reduction in the probability statements.

Table 5.6 contains the statistics from which the quality control limits may be derived.  The
original summary statistics for this data set (N=79) are shown in Table 5.2, and Appendix C
contains a table of various spreads for this data set.  The column in Table 5.6 labeled H-spr/1.349
is included because  it is supposed to be an approximate estimate of the standard deviation
(Velleman and Hoaglin, 1981, p.54). These values may be compared with the corresponding
standard deviations  in the adjacent column. The H-spread estimate for the standard deviation of
pH is smaller than the observed value.  The estimate for the standard deviation of discharge is
much smaller than (one-third of) the observed value, probably reflecting the marked skewness of
these data, which arises from a few extremely large values.  The H-spread estimate for acidity is
larger than the observed value, and is suspected to be a reflection of the skewed data.  The
estimates for sulfate, total iron, ferrous iron and ferric iron are all similar to their observed
values.

Table 5.6: Comparison of Statistics used to calculate the QC limits (N' = 79)

                                                             Standard
Variable        Mean      Median    H-spread    C-spread     Deviation    H-spread/1.349
pH              3.624     3.16      1.06        3.38         0.967        0.786
Discharge       12.57     6.7       8.745       50.44        22.37        6.48
Acid            556.1     499       562.5       1381         361.2        416.98
Total Iron      86.7      78.5      73          181.5        53.45        54.11
Ferrous Iron    51.23     40        47.45       128.2        35.92        35.17
SO4             1586.3    1619      716         2125.99      558.6        530.76
Ferric Iron     35.47     26        49.15       99.1         33.8         36.43

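
The last column of Table 5.6 can be reproduced directly from the H-spread column. The sketch below is a minimal illustration using the tabulated values; the divisor 1.349 is the interquartile range of a normal distribution in standard deviation units, and the names `TABLE_5_6` and `sd_from_h_spread` are illustrative only.

```python
# (H-spread, observed standard deviation) pairs copied from Table 5.6.
TABLE_5_6 = {
    "pH": (1.06, 0.967),
    "Discharge": (8.745, 22.37),
    "Acid": (562.5, 361.2),
    "Total Iron": (73.0, 53.45),
    "Ferrous Iron": (47.45, 35.92),
    "SO4": (716.0, 558.6),
    "Ferric Iron": (49.15, 33.8),
}

def sd_from_h_spread(h_spread):
    """Approximate standard deviation implied by the H-spread for normal data."""
    return h_spread / 1.349

for name, (h_spread, sd) in TABLE_5_6.items():
    estimate = sd_from_h_spread(h_spread)
    # A ratio well below 1 means the observed SD is inflated relative to the H-spread.
    print(f"{name:13s} estimate = {estimate:8.3f}  observed = {sd:8.3f}  ratio = {estimate / sd:5.2f}")
```

A ratio far from one, as for discharge, is a quick flag that a few extreme values are inflating the standard deviation relative to the middle half of the data.
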
A number of possible spreads which could be used to set up quality control limits are listed in
Table 5.7. The first example is the conventional spread of the mean plus and minus twice the
standard deviation.  In a normal frequency distribution this would include about 95 percent of the
distribution or, alternatively, about 5 observations in every 100 would be expected to fall
outside these limits.  The constraint of strict normality may be relaxed considerably, so this is
a reasonably general confidence interval.  This spread would be used for comparison with individual
results (i.e., N' = 1).

Table 5.7: Comparison of QC Limits (Spreads) around Mean and Median

                                         Mean ± 2 Standard        Median ± 1.58 *
                Mean ± 2 Standard        Deviation / √N'          H-spread / √N'
                Deviation                (N' = 79)                (N' = 79)
Variable        LL          UL           LL          UL           LL          UL
pH              1.69        5.56         3.41        3.84         2.97        3.35
Discharge       -32.2       57.3         7.5         17.6         5.1         8.3
Acid            -166.3      1278.5       474.8       637.4        399.0       599.0
Total Iron      -20.2       193.6        74.7        98.7         65.5        91.5
Ferrous Iron    -20.61      123.07       43.15       59.31        31.57       48.43
SO4             469.1       2703.5       1460.6      1712.0       1491.7      1746.3
Ferric Iron     -32.1       103.1        27.9        43.1         17.3        34.7

                Median ± 1.58 *          Mean ± 2 Standard
                H-spread / √N'           Deviation / √N'
                (N' = 18)                (N' = 18)
Variable        LL          UL           LL          UL
pH              2.77        3.55         3.17        4.08
Discharge       3.4         10.0         2.0         23.1
Acid            289.5       708.5        385.8       726.4
Total Iron      51.3        105.7        61.5        111.9
Ferrous Iron    22.33       57.67        34.30       68.16
SO4             1352.4      1885.6       1323.0      1849.6
Ferric Iron     7.7         44.3         19.5        51.4

If the number of samples is taken into account, it must be emphasized that the calculated interval
refers to means of sets of samples of size N'; for example, if the number of observations is
chosen as base, then 1/√N', in this case, = 1/√79 = 0.113, or for 2σ, 2σ(1/√N') = 0.226σ.
These limits are much too restricted. Relaxing this requirement, to say an N' = 18, gives
0.471σ, and this again refers to means based on sample sizes of 18.  This would appear to be
too restrictive, because too many observations would fall beyond these limits. Similar features
apply to each estimate containing N' where √N' > 1.
Since the sample size is usually one (N' = 1), the multiplier will be 2 times σ or some
equivalent in non-parametric form.  The intervals (quality control limits) listed in Table 5.7 show
a comparison of the conventional parametric limits (based upon 2σ), together with
non-parametric limits (1.58 (H-spread) / √N') where the sample sizes are N' = 79 and N' = 18.
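
The limits in Table 5.7 follow from the summary statistics in Table 5.6 by simple arithmetic. The following sketch reproduces the pH row under the formulas stated above; the function names `mean_limits` and `median_limits` are illustrative and do not come from the report.

```python
import math

def mean_limits(mean, sd, n_prime=1, k=2.0):
    """Lower and upper limits: mean +/- k * sd / sqrt(N')."""
    half_width = k * sd / math.sqrt(n_prime)
    return mean - half_width, mean + half_width

def median_limits(median, h_spread, n_prime, k=1.58):
    """Lower and upper limits: median +/- k * H-spread / sqrt(N')."""
    half_width = k * h_spread / math.sqrt(n_prime)
    return median - half_width, median + half_width

# pH statistics copied from Table 5.6.
PH_MEAN, PH_SD, PH_MEDIAN, PH_H_SPREAD = 3.624, 0.967, 3.16, 1.06

single = mean_limits(PH_MEAN, PH_SD)                      # N' = 1
mean_79 = mean_limits(PH_MEAN, PH_SD, n_prime=79)
median_79 = median_limits(PH_MEDIAN, PH_H_SPREAD, n_prime=79)
median_18 = median_limits(PH_MEDIAN, PH_H_SPREAD, n_prime=18)
mean_18 = mean_limits(PH_MEAN, PH_SD, n_prime=18)

for label, (lower, upper) in [
    ("mean +/- 2 SD", single),
    ("mean +/- 2 SD / sqrt(79)", mean_79),
    ("median +/- 1.58 H / sqrt(79)", median_79),
    ("median +/- 1.58 H / sqrt(18)", median_18),
    ("mean +/- 2 SD / sqrt(18)", mean_18),
]:
    print(f"{label:30s} LL = {lower:5.2f}  UL = {upper:5.2f}")
```

Dividing by √N' narrows the interval quickly, which is why limits built for means of 79 or 18 samples are too tight to judge single observations.
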

Three different estimates of quality control limits around the median are given in Table 5.8: the
median plus or minus the C-spread, the median plus or minus 1.58 times the C-spread over
root N', and the median plus or minus 3 times the H-spread. The conventional limits (mean ± 2σ)
are given for comparison in the last column. The spreads are obtained from the table in
Appendix C.

Table 5.8: Comparison of QC Limits around the Median using Various Forms of Spread

                                        Median ± 1.58 *                                  Mean ± 2 * Standard
                Median ± C-spread       C-spread / √N'          Median ± 3 * H-spread    Deviation
Variable        LL          UL          LL          UL          LL          UL           LL          UL
pH              -0.22       6.54        1.38        4.94        -0.02       6.34         1.69        5.56
Discharge       -43.7       57.1        -19.9       33.3        -19.5       32.9         -32.2       57.3
Acid            -882.0      1880.0      -228.3      1226.3      -1188.5     2186.5       -166.3      1278.5
Total Iron      -103.0      260.0       -17.1       174.1       -140.5      297.5        -20.2       193.6
Ferrous Iron    -88.20      168.20      -27.52      107.52      -102.35     182.35       -20.61      123.07
SO4             -507.0      3745.0      499.3       2738.7      -529.0      3767.0       469.1       2703.5
Ferric Iron     -73.1       125.1       -26.2       78.2        -121.5      173.5        -32.1       103.1

The median ± C-spread compares reasonably well with the median ± 3 * (H-spread) in Table
5.8 except for discharge, where the latter is much smaller than the former. The quality control
limits around the mean yield smaller spreads than either the 3 * (H-spread) or the C-spread,
except for discharge. The mean ± 2σ limits compare well with the median ± C-spread limits
for discharge.

These comparisons of quality control limits shown in Tables 5.7 and 5.8 are more easily
understood if the description is illustrated in graphs. Figure 5.7a is a graph of the pH of the
discharge from the Clarion site with the mean and median inserted. It is obvious that the
extremely high values after the 50th observation (i.e., after treatment begins), affect the mean
much more strongly than the median. The difference between the mean and the median is
largely due to the pronounced skewness induced by these few large values.

The C-spread and the 3 * (H-spread) quality control limits in Table 5.8 compare very closely.  If
these limits are used, only one value falls on or near them.  About 7 values fall beyond the 2
sigma limits in Figure 5.7a. The spread of 1.58 * (H-spread / √N') is more constraining than
these 2 sigma limits; if it were plotted on Figure 5.7a, 14 observations would fall on or beyond
this value of spread.

If adjustment is made for a sample size of N' = 6, the limits are much more restrictive and the
majority of values after the treatment was initiated fall beyond this limit. The lower quality
control limits for pH in Table 5.8 fall outside the limits of this graph, but are of little interest in
the present circumstances.  For pH, these arguments are mostly illustrative because prior to
treatment, the values vary around the median of 3.16, indicating the acidic nature of the
discharge.  After treatment (after the 50th observation in Figure 5.7a), the pH frequently exceeds
5, but only one value exceeds 6.  From the 85th observation onwards, the pH has returned to or
fallen below the median value.

The second graphical example of quality control is the plot of sulfate as shown in Figure 5.7d.
In both Figures 5.7a and 5.7d it is assumed that the observations are evenly spaced in time,
which is not always true (see Figure 4.2 in Chapter 4). In Figure 5.7d, the mean and median
coincide fairly closely, suggesting a symmetrical frequency distribution.  It seems obvious that
the spread of median ± 3 * (H-spread) is much too wide (i.e., an upper limit of 3767.0); if it
were plotted on Figure 5.7d, no observation would come close to it.  The spreads of mean plus or
minus two sigma and the median ± 1.58 * (H-spread / √N') are very similar and either would be
equally effective, although the 2σ limits plotted on Figure 5.7d appear to be more sensitive.

In conclusion, no simple recommendation on quality control limits can be made on the basis of
these observations of the Clarion site, which may or may not be representative. It seems likely
that establishing the form of the frequency distribution, particularly its symmetry, is
most desirable when setting up the quality control limits.  It seems clear in this analysis that the
limits are not consistent from variable to variable. If this is also true in the analyses of data from
different sites, then perhaps different forms of spread with different limits may be necessary for
each variable.

Summary

It seems clear from the data that there was a marked change in the environment after the 50th
observation (May 8, 1984). This is confirmed by the knowledge that "Limestone application
was performed in May and June 1984" (Lusardi and Erickson, 1985, p. 318).  These authors also
conclude that "one year after the limestone application, the water quality in the seeps reflected
no substantial inhibition or neutralization. Improvements in water quality noted in late 1984
have not persisted." pH shows marked improvement (less acidic water) from June 30, 1984
onwards. However, by April 5, 1986 the pH has returned to pre-treatment levels; these changes
are in accord with the conclusions given above. Nevertheless, discharge also shows a change
and fluctuates over a much larger range after the treatment date. It is doubtful whether this  can
be attributed to the treatment.

The time series plot for acidity (Figure 5.7b) fluctuates above the mean up to the 50th observation
and then becomes much less variable until just beyond the 70th observation.  There is a large
spike of increased acidity at 75 and then acidity declines back to the mean value from the 90th
observation onwards.

In the case of total iron, the first 19 observations vary closely around the mean of the entire
series; then, from observation 20 to 45, total iron shows much larger fluctuations, well above the
mean. From the 45th to 55th observation, the variations in concentration are suppressed and, from
the 55th observation onwards, the variability increases but remains around the mean value.  Sulfate
varies roughly in parallel with acidity and no special effect can be attributed to treatment.

Chapter 6:  Analysis of Data from Ernest Refuse Pile Site,
              Indiana County, PA

The Ernest mine site is located in the Crooked Creek watershed in Indiana County, PA near the
town of Ernest (see Figure 6.0). The U.S. Army Corps of Engineers (USACE) completed
construction of Crooked Creek dam in 1940 and has managed the lake since then for flood
control and recreational purposes. The Commonwealth of Pennsylvania constructed and
operated Crooked Creek State Park at the lake prior to 1981 when the USACE acquired the
facility. Some portions of the Crooked Creek watershed were impacted by acid mine drainage
from extensive bituminous coal mining, particularly the McKee Run tributary, from the town of
Ernest downstream to the town of Creekside at the confluence with the main stem of Crooked
Creek. The Ernest mine complex, including a large underground mine and associated coal refuse
pile, was operated from the early 1900's to 1965 when the mine was abandoned.

An acid mine drainage treatment plant was constructed by the Pennsylvania Department of
Environmental Protection and operated from June 1978 until May 1980 when problems with iron
sludge recycling operations led to the closure of the plant.  The water quality samples and flow
measurements from the Ernest refuse pile discharge that are discussed in this chapter were
collected between March 1981 and December 1985 as part of studies to evaluate water quality
and aquatic biology in the Crooked Creek watershed following closure of the treatment plant.
The raw data are listed in Appendix D.  There are 198 observations (N = 198), consisting of
values for 10 parameters:  1) Days (developed from the date that the sample was taken); 2) pH;
3) Flow; 4) Acidity;  5) Acid load; 6) Total Iron (Fe); 7) Total Iron load; 8) Ferrous Iron
(FFe);  9) Sulfate (SO4); 10) Sulfate load.

There is a rather large time gap  (four months) between the first three observations and the
remainder of the samples that were collected at approximately weekly intervals. There were also
at least 15 samples without pH and/or ferrous iron data.  After these samples were omitted and
other adjustments were made (see Figure 3.1), a revised data set of 174 observations was
compiled and used for most of the statistical analyses presented in this chapter. Time gaps in the
data should be considered in examining the time series analyses, because elements of the time
series analysis assume that there are equal intervals between observations.

Figure 6.0:  Map of Ernest Mine Site
[Map: enlarged portion of the Ernest USGS 7.5 Minute Quadrangle, with a scale bar in feet.]

Univariate Analysis

Summary statistics for the adjusted data (N = 174) are presented in Table 6.1.

Table 6.1:     Summary Statistics of Data (N=174)

                                            Trimmed   Standard    Standard Error
Variable          N     Mean      Median    Mean      Deviation   of the Mean
Days              174   985.6     1027.0    995.3     457.2       34.7
pH                174   2.5061    2.5000    2.5018    0.1524      0.0116
Flow              174   127.2     51.0      85.9      337.0       25.5
Acidity           174   3621      3539      3585      1357        103
Acid Load         174   3367      1843      3031      3639        276
Total Iron        174   527.2     515.5     520.5     210.0       15.9
Iron Load         174   626.6     275.0     563.1     722.3       54.8
Ferrous Iron      174   364.8     360.5     351.5     251.9       19.1
SO4               174   3887.4    3804.0    3915.5    1105.2      83.8
SO4 Load          174   3837      2108      3431      4198        318

                                        First       Third       Coefficient of
Variable          Minimum   Maximum     Quartile    Quartile    Variation
Days              0.0       1735.0      703.0       1343.5      46.4
pH                2.1       3.1         2.4         2.6         6.1
Flow              2.0       3188.0      8.0         163.2       265.0
Acidity           778       16401       3016        4301        37.5
Acid Load         111       17663       412         5641        108
Total Iron        20.0      1929.0      395.0       653.0       39.8
Iron Load         10.0      2758.0      50.5        1147.7      115
Ferrous Iron      8.0       1760.0      161.5       512.0       69.1
SO4               142.0     6115.0      3155.0      4759.5      28.4
SO4 Load          117       17746       513         6394        109

The coefficient of variation (CV%) remains within fairly reasonable limits for pH, acidity, and
iron. However, variability in ferrous iron (69%) is large.  Sulfate is in reasonable control (CV =
28%).  Flow, acid load, iron load, and sulfate load show very large variability (all greater than
CV = 100%), which suggests that the large variability of the load-type variables is largely due to
the high degree of variability shown by flow. These parameters require log transformation to
control this variability.  The frequency distribution of pH is symmetrical (Figure 6.1a) while
flow is skewed, although the major part of the skewness arises from two extremely high values
(Rows 142-3 in Appendix Table D, flow = 3003.0 gpm and 3188.0 gpm respectively).  All other
values of this parameter range from less than 10 to hundreds. Similarly, acidity has an extremely
high value (Appendix D Table, Row 143 = 16,401 mg/L); acid load (Figure 6.1b) is skewed.
Total iron and total iron load follow the same pattern.  Ferrous iron has one exceptional value.
Sulfate (Figure 6.1f) is negatively skewed, whereas sulfate load is positively skewed. This
behavior is an indication of the effect that flow can have on a parameter.
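
The coefficient of variation and the standard error of the mean in Table 6.1 are both simple functions of the mean, the standard deviation and N. The sketch below recomputes them for a few rows, assuming the usual definitions (CV% = 100 x standard deviation / mean, and standard error = standard deviation / √N); the function names are illustrative.

```python
import math

def coefficient_of_variation(mean, sd):
    """Coefficient of variation expressed as a percentage of the mean."""
    return 100.0 * sd / mean

def standard_error(sd, n):
    """Standard error of the mean for n observations."""
    return sd / math.sqrt(n)

N = 174
# (mean, standard deviation) pairs copied from Table 6.1.
TABLE_6_1 = {
    "pH": (2.5061, 0.1524),
    "Flow": (127.2, 337.0),
    "Acidity": (3621.0, 1357.0),
    "Acid Load": (3367.0, 3639.0),
    "SO4": (3887.4, 1105.2),
}

for name, (mean, sd) in TABLE_6_1.items():
    cv = coefficient_of_variation(mean, sd)
    se = standard_error(sd, N)
    print(f"{name:10s} CV = {cv:6.1f}%   SE = {se:9.4f}")
```

Small differences from the tabulated values are to be expected, because the inputs used here are already rounded.
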

Table 6.2:   Summary Statistics of Data (N=174)

                                              Trimmed   Standard    Standard Error
Variable            N     Mean      Median    Mean      Deviation   of the Mean
Days                174   985.6     1027.0    995.3     457.2       34.7
pH                  174   2.5061    2.5000    2.5018    0.1524      0.0116
Log Flow            174   1.6062    1.7076    1.6081    0.6970      0.0528
Log Acidity         174   3.5349    3.5489    3.5440    0.1466      0.0111
Log Acid Load       174   3.1854    3.2654    3.1930    0.6138      0.0465
Log Total Iron      174   2.6836    2.7122    2.6997    0.2066      0.0157
Log Iron Load       174   2.3564    2.4393    2.3703    0.7244      0.0549
Log Ferrous Iron    174   2.3989    2.5569    2.4442    0.4696      0.0356
Log SO4             174   3.5646    3.5802    3.5815    0.1748      0.0133
Log SO4 Load        174   3.2403    3.3240    3.2480    0.6143      0.0466

                                          First       Third       Coefficient of
Variable            Minimum   Maximum     Quartile    Quartile    Variation
Days                0.0       1735.0      703.0       1343.5      46.4
pH                  2.1000    3.1000      2.4000      2.6000      6.1
Log Flow            0.3010    3.5035      0.9031      2.2127      43.4
Log Acidity         2.8910    4.2149      3.4795      3.6336      4.1
Log Acid Load       2.0453    4.2471      2.6149      3.7514      19.3
Log Total Iron      1.3010    3.2853      2.5966      2.8149      7.7
Log Iron Load       1.0000    3.4406      1.7032      3.0598      30.7
Log Ferrous Iron    0.9031    3.2455      2.2082      2.7093      19.6
Log SO4             2.1523    3.7864      3.4990      3.6776      4.9
Log SO4 Load        2.0682    4.2491      2.7098      3.8058      18.9

Summary statistics for log (base 10) transformed data are listed in Table 6.2 (N = 174). The
variables are now either well-behaved (CV < 20%) or not too extreme (CV < 50%). Load
variables show the largest CV%.  This is most likely due to flow variability.
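
Because the logarithm is a monotone function, the minimum, quartiles, median and maximum in Table 6.2 are simply the logarithms of the corresponding entries in Table 6.1, whereas the mean of the logs is not the log of the mean. The sketch below illustrates this for flow. The quartile-based skewness measure (often called Bowley skewness) is an illustrative choice made here and is not a statistic used in the report.

```python
import math

def bowley_skewness(q1, median, q3):
    """Quartile skewness: positive for a longer upper tail, negative for a longer lower tail."""
    return (q3 + q1 - 2.0 * median) / (q3 - q1)

# Flow order statistics and mean copied from Table 6.1.
FLOW = {"min": 2.0, "q1": 8.0, "median": 51.0, "q3": 163.2, "max": 3188.0}
FLOW_MEAN = 127.2
LOG_FLOW_MEAN = 1.6062  # mean of log flow copied from Table 6.2

log_flow = {name: math.log10(value) for name, value in FLOW.items()}
for name, value in log_flow.items():
    print(f"log10 of flow {name:6s} = {value:.4f}")

# The log of the mean is not the mean of the logs.
print(f"log10(mean flow) = {math.log10(FLOW_MEAN):.4f}, mean of log flow = {LOG_FLOW_MEAN:.4f}")

raw_skew = bowley_skewness(FLOW["q1"], FLOW["median"], FLOW["q3"])
log_skew = bowley_skewness(log_flow["q1"], log_flow["median"], log_flow["q3"])
print(f"quartile skewness: raw = {raw_skew:.3f}, log = {log_skew:.3f}")
```

For flow the quartile skewness changes sign after transformation, a small-scale example of a log transformation correcting positive skewness and then overshooting.
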

Histograms of the log transformed data are displayed in Figures 6.1c, 6.1e, and 6.1g.  When the
histograms of the original data are placed alongside those of the transformed data, the effect of
the transformation is clear. Because pH is already expressed in logarithms, no transformation was
applied.  In all other parameters, log transformation expanded low magnitude values and reduced
asymmetry (for acid load in Figures 6.1b and 6.1c), sometimes perhaps too much (Figures 6.1d
and 6.1e, iron and log iron respectively).  Similarly, because the histogram of sulfate is already
negatively skewed, log transformation accentuated the negative skewness (Figures 6.1f and 6.1g),
so log transformation is unnecessary for this variable. All load variables are strongly positively
skewed when untransformed and the log transformation helps to improve their symmetry.
Figure 6.1a:  Histogram of pH (N = 174)
                           Histogram of PH    N = 174
                           Each * represents  2 obs.
           Midpoint   Count
                2.1        1   *
                2.2        3   **
                2.3       28   **************
                2.4       23   ************
                2.5       48   ************************
                2.6       51   **************************
                2.7       14   *******
                2.8        3   **
                2.9        1   *
                3.0        0
                3.1        2   *
Figure 6.1b:  Histogram of Acid Load (N = 174)
        Histogram of C5   N = 174
        Each * represents 2 obs.
        Midpoint   Count
               0      70   ***********************************
            2000      35   ******************
            4000      18   *********
            6000      18   *********
            8000      14   *******
           10000      11   ******
           12000       6   ***
           14000       1   *
           16000       0
           18000       1   *
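Figure 6.1c:  Histogram of Log Acid Load (N=174)
          Histogram of ACIDLO   N = 174
          Midpoint   Count
               2.0       3   ***
               2.2      12   ************
               2.4      19   *******************
               2.6      15   ***************
               2.8      16   ****************
               3.0      12   ************
               3.2      13   *************
               3.4      16   ****************
               3.6      17   *****************
               3.8      24   ************************
               4.0      25   *************************
               4.2       2   **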
Figure 6.1d: Histogram of Total Iron (N=174)
        Histogram of FE   N = 174
        Each * represents 2 obs.
        [Character histogram; midpoints from 0 to 2000 in steps of 200. The counts and bars are not recoverable.]
Figure 6.1e:  Histogram of Log Total Iron (N=174)
          Histogram of FE   N = 174
          Each * represents 2 obs.
          Midpoint   Count
               1.4       1   *
               1.6       0
               1.8       0
               2.0       4   **
               2.2       0
               2.4      15   ********
               2.6      65   *********************************
               2.8      73   *************************************
               3.0      15   ********
               3.2       1   *
Figure 6.1f:  Histogram of SO4 (N=174)
        Histogram of SO4   N = 174
        Midpoint   Count
               0       1   *
             500       0
            1000       2   **
            1500       2   **
            2000       6   ******
            2500      10   **********
            3000      27   ***************************
            3500      38   **************************************
            4000      19   *******************
            4500      26   **************************
            5000      22   **********************
            5500      15   ***************
            6000       6   ******

Figure 6.1g:  Histogram of Log SO4 (N=174)
     Histogram of SO4   N = 174
     Each * represents 2 obs.
     Midpoint   Count
          2.2       1   *
          2.4       0
          2.6       0
          2.8       1   *
          3.0       1   *
          3.2       7   ****
          3.4      55   ****************************
          3.6      79   ****************************************
          3.8      30   ***************

Bivariate Analysis

The bivariate statistical analysis of the Ernest data includes bivariate plots (routinely used in
regression and correlation analyses), the use of a correlation matrix to compare and evaluate
correlation coefficients, and the use of cross correlation functions to determine if lags in the data
for certain parameters tend to obscure correlations that may be present.  The correlation matrix is
an element of some multivariate statistical analyses, such as principal components analysis  and
factor analysis (in the r mode). The cross-correlation function is an element of time series
analysis because it computes and graphs correlations between two time series. Both of these
statistical tools are included in this discussion of bivariate analysis because they are useful in
examining the relationship between pairs of variables.

The correlation coefficients for all pairs of variables are shown in Table 6.3.  The value of the
correlation coefficient (r) required for significance at the five percent probability level is given
above the table, and all correlation coefficients larger in absolute value than this number are
significantly different from zero. For example, iron vs. pH (r = 0.124) is not significantly
different from zero. Similarly, ferrous iron vs. acidity (r = 0.045) and sulfate vs. ferrous iron
(r = 0.083) are also not significantly different from zero.  The coefficients that exceed the
critical value reflect a real (statistically significant) association; however, in many cases, the
degree of association (r² x 100%) is small. For example, the correlation of acidity and pH (r =
-0.365) indicates an inverse linear association between the two variables, as would be expected,
but the degree of association is small (r² = 13%).
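
The distinction between statistical significance and degree of association can be made concrete with a few of the coefficients quoted above. This is a minimal sketch using the five percent critical value given with Table 6.3; the function names are illustrative.

```python
R_CRITICAL = 0.159  # five percent critical value quoted with Table 6.3 (N = 174)

def degree_of_association(r):
    """Percentage of variation the two variables have in common (r squared x 100)."""
    return 100.0 * r * r

def is_significant(r, r_critical=R_CRITICAL):
    """True when |r| exceeds the critical value."""
    return abs(r) > r_critical

# Correlation coefficients quoted in the text.
EXAMPLES = {
    "iron vs. pH": 0.124,
    "ferrous iron vs. acidity": 0.045,
    "sulfate vs. ferrous iron": 0.083,
    "acidity vs. pH": -0.365,
    "acidity vs. sulfate": 0.600,
    "acid load vs. iron load": 0.913,
}

for pair, r in EXAMPLES.items():
    flag = "significant" if is_significant(r) else "not significant"
    print(f"{pair:26s} r = {r:6.3f}  r^2 = {degree_of_association(r):5.1f}%  {flag}")
```

The sign of r is ignored when testing against the critical value, but it is retained when describing the direction of the association.
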

Table 6.3:    Correlation Coefficients for 9 Parameters (N=174, r(0.05) = 0.159)

                                               Acid      Total     Iron      Ferrous
                pH        Flow      Acid       Load      Iron      Load      Iron      SO4
Flow            0.191
Acidity         -0.365    0.308
Acid Load       0.483     0.206     -0.224
Total Iron      0.124     0.337     0.526      0.263
Iron Load       0.498     0.229     -0.262     0.913     0.375
Ferrous Iron    0.248     -0.020    0.045      0.337     0.480     0.388
SO4             -0.547    -0.184    0.600      -0.307    0.174     -0.339    0.083
SO4 Load        0.472     0.438     -0.030     0.906     0.386     0.890     0.285     -0.293

There are three large correlation coefficients: acid load vs. iron load, acid load vs. sulfate
load, and iron load vs. sulfate load. These correlation coefficients are all around r = 0.9 (i.e.,
about 80 percent in common), probably because of the domination of flow in the measurement of
load variables.  In contrast, the individual concentration variables acidity vs. iron (r = 0.526),
acidity vs. sulfate (r = 0.6), and iron vs. sulfate (r = 0.174) show much lower association (the
largest r² is 36%).  In addition, any load variable vs. concentration of the same variable shows
no appreciable relationship. Thus, the relatively high correlation coefficients among the load
variables are an artifact of the inclusion of flow in the calculation for load (concentration x
0.01212 x flow).
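
The load calculation quoted above can be written as a one-line function. In this sketch the factor 0.01212 is taken directly from the text; the input values are an illustrative pairing of the median acidity and median flow from Table 6.1 and do not correspond to an observed sample.

```python
LOAD_FACTOR = 0.01212  # factor quoted in the text

def load(concentration, flow):
    """Load = concentration x 0.01212 x flow."""
    return concentration * LOAD_FACTOR * flow

# Illustrative inputs only: median acidity (mg/L) and median flow (gpm) from Table 6.1.
median_acidity = 3539.0
median_flow = 51.0

base = load(median_acidity, median_flow)
doubled_flow = load(median_acidity, 2.0 * median_flow)
print(f"load at median flow    = {base:.1f}")
print(f"load at twice the flow = {doubled_flow:.1f}")
```

Because every load variable is multiplied by the same flow value, a change in flow moves all of the loads together, which is the source of the shared variation discussed above.
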

When one examines the cross-correlation functions (Figures 6.2a to 6.2d), it can be seen that the
largest correlation occurs at lag zero in Figure 6.2a (pH vs. log flow) and at lag one in Figure
6.2c (pH vs. log acid load) and that the correlations are of the same order of magnitude.  Because
pH vs. log acidity (Figure 6.2b) yields its strongest correlation (r = -0.466) at lag zero, which
is much weaker than the value yielded by pH vs. log acid load (Figure 6.2c), it is suspected that
the effect of flow on load is responsible for the higher correlation. The highest correlation in
Figure 6.2d (pH vs. log iron) occurs at lag 19 (r = -0.336), but values of |r| > 0.25 occur
haphazardly at many lags and any association is likely to be very weak.
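
The cross-correlation function correlates one series at time t with a second series at time t+k, for a range of lags k. The sketch below uses a common textbook estimator (lagged cross-covariance divided by the product of the two standard deviations, each computed with divisor n). The exact estimator used by the package that produced Figures 6.2a to 6.2d is not stated in the text, so the normalization here is an assumption, and the two short series are made-up illustrations rather than data from the Ernest site.

```python
import math

def cross_correlation(x, y, k):
    """Sample cross-correlation between x(t) and y(t+k) for equal-length series."""
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have equal length")
    if abs(k) >= n:
        raise ValueError("lag must be smaller than the series length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [value - mean_x for value in x]
    dy = [value - mean_y for value in y]
    if k >= 0:
        pairs = zip(dx[: n - k], dy[k:])      # x(t) with y(t+k), k >= 0
    else:
        pairs = zip(dx[-k:], dy[: n + k])     # x(t) with y(t+k), k < 0
    covariance = sum(a * b for a, b in pairs) / n
    sd_x = math.sqrt(sum(a * a for a in dx) / n)
    sd_y = math.sqrt(sum(b * b for b in dy) / n)
    return covariance / (sd_x * sd_y)

# Illustrative series: y repeats the pulse in x two steps later.
x = [0, 0, 1, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 1, 0, 0, 0]

ccf = {k: cross_correlation(x, y, k) for k in range(-3, 4)}
best_lag = max(ccf, key=ccf.get)
for k in sorted(ccf):
    print(f"lag {k:3d}  r = {ccf[k]:7.3f}")
print("largest correlation at lag", best_lag)
```

A peak away from lag zero, as in this toy example, indicates that one series tends to follow the other after a delay; a series with no variation would make the denominator zero and is not handled here.
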
Figures 6.2a and 6.2b:  Cross Correlation Functions of pH vs. Log Flow, and pH vs. Log Acid (respectively)
[Character plots of the cross-correlation functions. Panel 6.2a: CCF correlates PH(t) and FLOW(t+k). Panel 6.2b: CCF correlates PH(t) and ACID(t+k). Vertical axis: lag k from -23 to 23. Horizontal axis: correlation from -1.0 to 1.0.]
Figures 6.2c and 6.2d:  Cross Correlation Functions of pH vs. Log Acid Load, and pH vs. Log Iron (respectively)
[Character plots of the cross-correlation functions. Panel 6.2c: pH at time t against log acid load at time t+k. Panel 6.2d: pH at time t against log iron at time t+k. Vertical axis: lag k from -23 to 23. Horizontal axis: correlation from -1.0 to 1.0.]
When either pH (which is a logarithmic measure) or logarithms of the other parameters are
plotted against days, they appear to show periodic variation with a very large degree of scatter
(see, for example, pH vs. days (Figure 6.3a) and log flow vs. days (Figure 6.3b)). The periodicity
of log acidity vs. days was not as evident, but log acid load vs. days (Figure 6.3c) is clearly
periodic. Here again, the effect of flow on load is likely to be responsible for the cyclical
appearance.

Figure 6.3a:  Plot of pH vs. Time (days)
[Character scatter plot. Vertical axis: pH, with ticks from 2.10 to 3.00. Horizontal axis: days, from 0 to 1750.]

Figure 6.3b: Plot of Log Flow vs. Time (days)
[Character scatter plot. Vertical axis: log flow, with ticks from 0.0 to 3.0. Horizontal axis: days, from 0 to 1750.]
                                                                                      6-11

-------
Figure 6.3c: Plot of Log Acid Load vs. Time (days)
    [Minitab character plot, PLOT C5 VS C1. Vertical axis: log acid load, about 2.10 to 4.20. Horizontal axis: days, 0 to 1750. Plotted points are not recoverable from the scan.]
Bivariate plots of untransformed data were made, and in most cases there was little
relationship between concentration and load (e.g., Figure 6.4, acidity vs. acid load). The only
discrepancies are extreme values, which occur as outliers (e.g., observation 158).


Figure 6.4: Plot of Acid vs. Acid Load
    [Minitab character plot, PLOT C4 VS C5 (the second column number is partly illegible). Vertical axis: ACID, 0 to about 15000. Horizontal axis: acid load, 0 to 14000. One point is labelled as observation 158; the plot footer reads N* = 1. Plotted points are not recoverable from the scan.]

-------
Bivariate plots of untransformed load variables are included in Figures 6.5a through 6.5c. In
Figures 6.5a and 6.5c, the spread of the variables increases with magnitude (i.e., the data are
heteroscedastic and so should be expressed in logarithms). Figure 6.5b (acid load and sulfate
load) is reasonably homoscedastic, indicating that sulfate load and acid load are not skewed in
their frequency distribution. There are obvious extreme outliers in each of the three figures (e.g.,
observation 133 in Figures 6.5a and 6.5b, and observation 158 in Figures 6.5b and 6.5c).

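Figure 6.5a:   Plot of Iron Load vs. Acid Loading
    [Minitab character plot. Vertical axis: FELD (iron load), up to about 2400. Horizontal axis: acid load (label illegible). Plotted points are not recoverable from the scan.]
Figure 6.5b:  Plot of Acid Load vs. Sulfate Loading
    [Minitab character plot, PLOT C5 VS C10. Vertical axis: ACDLD (acid load), up to 15000. Horizontal axis: SO4LD (sulfate load), up to 17500. Plotted points are not recoverable from the scan.]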

-------
Figure 6.5c:   Plot of Iron vs. Sulfate Load
    [Minitab character plot, PLOT C7 VS C10. Vertical axis: FELD (iron load), 0 to 2400. Horizontal axis: SO4LD (sulfate load), 0 to 17500. Plotted points are not recoverable from the scan.]
Bivariate plots of logarithmically transformed data are shown in Figures 6.6a to 6.6d. Log
acidity vs. log flow (Figure 6.6a) shows no relationship.  The exceptional values of two
observations of flow occur as outliers.  Log acid load, iron load, and sulfate load vs. log flow
showed strong linear associations (Figure 6.6b), with various outliers for the extreme values of
flow.  There appears to be no simple relationship between log acidity and log acid load (Figure
6.6c). The only real association appears to be positive linear between log sulfate and log acid
(Figure 6.6d) which, as would be expected, tend to increase together. The presence of two
extreme outliers probably would diminish the value of the correlation coefficient between them.

Figure 6.6a: Bivariate Plot of Log Acidity vs. Log Flow
    [Minitab character plot, PLOT C4 VS C3. Vertical axis: log acidity, about 2.80 to 4.00. Horizontal axis: log flow, up to 3.60. Plotted points are not recoverable from the scan.]

-------
Figure 6.6b:  Bivariate Plot of Log Sulfate vs. Log Flow
    [Minitab character plot, PLOT C10 VS C3. Vertical axis: about 2.10 to 3.50. Horizontal axis: log flow, up to 3.60. Plotted points are not recoverable from the scan.]
Figure 6.6c:  Bivariate Plot of Log Acidity vs. Log Acid Load
    [Minitab character plot, PLOT C4 VS C5. Vertical axis: log acidity, up to 4.00. The plot is truncated in the source.]
-------
Figure 6.6d:  Bivariate Plot of Log Sulfate vs. Log Acid
    [Minitab character plot, PLOT C9 VS C4. Vertical axis: log SO4, about 2.00 to 3.50. Horizontal axis: log acidity, 3.00 to 4.25. Plotted points are not recoverable from the scan.]
Time Series Analysis

Time series plots of six selected variables are displayed in Figures 6.7a through 6.7f. pH (Figure
6.7a) illustrates the gap of missing data (September through December, 1982) and possesses two
extreme positive values during July 1983 (pH = 3.1) and December 1984 (pH = 3.1). The July
1983 maximum is followed by an extreme minimum (pH = 2.1). Time series plots of flow
(Figure 6.7b) and acidity (Figure 6.7c) are dominated by extreme values (March 19 and 26 for
the former, and March 26 for the latter).

Time series plots of the load variables (iron, acid, and sulfate; Figures 6.7d, 6.7e, and 6.7f,
respectively) are similar and appear to possess a seasonal component in May of each year. This
apparent cyclicity is confounded by maxima in March and September 1981, August 1984, and
April 1985. The most striking feature is the remarkable similarity of all three graphs, a feature
not evident in graphs of the variables expressed as concentrations.

-------
Figure 6.7a: Time Series Plot of pH
    [Minitab character plot, TSPLOT C2: pH against observation number. Not recoverable from the scan.]
Figure 6.7b: Time Series Plot of Flow
    [Minitab character plot, TSPLOT C3: flow against observation number. Not recoverable from the scan.]
-------
Figure 6.7c: Time Series Plot of Acidity
    [Minitab character plot, TSPLOT C4: acidity against observation number. Not recoverable from the scan.]
Figure 6.7d: Time Series Plot of Iron Load
    [Minitab character plot, TSPLOT C7: iron load (FELD) against observation number. Not recoverable from the scan.]
-------
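Figure 6.7e: Time Series Plot of Acid Load
    [Minitab character plot, TSPLOT C5: acid load (ACDLD) against observation number. Not recoverable from the scan.]
Figure 6.7f: Time Series Plot of Sulfate Load
    [Minitab character plot: sulfate load (SO4LD) against observation number. Not recoverable from the scan.]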

-------
Quality Control Limits for the Variables

Two measures of quality control were used to illustrate this aspect of the analysis. The first is
conventional: the mean ± 2 times the standard deviation. The second is non-parametric: the
median ± 1.96 times a function of the H-spread, namely (1.25 × H-spread) / (1.35 × √N'), which
reduces to (1.25 × H-spread) / 1.35 because the sample size here is N' = 1. Both measures are
based on the analysis of the Clarion data ("Quality Control Limits," Chapter 5). Summary statistics
for these measures are listed in Tables 6.4 and 6.5. At the base of each table are statistics for the
three load variables expressed in logarithms.
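
The two sets of limits can be reproduced from the summary statistics alone. The following is a minimal sketch in Python, not the procedure actually run for the report; the function names and the `n_prime` parameter are illustrative, and the example inputs are the pH row of Table 6.4.

```python
import math


def mean_limits(mean, sd, k=2.0):
    """Conventional limits: mean +/- k standard deviations."""
    return (mean - k * sd, mean + k * sd)


def median_limits(median, h_spread, n_prime=1, z=1.96):
    """Non-parametric limits: median +/- z * 1.25 * H-spread / (1.35 * sqrt(N'))."""
    half_width = z * 1.25 * h_spread / (1.35 * math.sqrt(n_prime))
    return (median - half_width, median + half_width)


# pH row of Table 6.4: mean 2.506, standard deviation 0.1524,
# median 2.50, H-spread 0.2.
ph_mean_limits = mean_limits(2.506, 0.1524)
ph_median_limits = median_limits(2.50, 0.2)
print(ph_mean_limits, ph_median_limits)
```

With the pH inputs this gives roughly 2.201 to 2.811 and 2.137 to 2.863, in agreement with the pH row of Table 6.5. Because the half-width is divided by √N', a call such as `median_limits(2.50, 0.2, n_prime=4)` halves the band, which is the gain in sensitivity that a larger N' would bring.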

Table 6.4:     Base Data for Calculation of Quality Control Limits of Ernest Data

No.   Variable          Mean (X̄)    Median    R = H-spread    C-spread    σ̂          H-spread/1.345
 2.   pH                   2.506      2.50          0.2          0.530      0.1524         0.148
 3.   Flow               127.2       51.0         153.0        342.0      337.0          113.4
 4.   Acid              3621.      3539.         1283.        3493.      1357.           951.1
 5.   Acid Load         3367.      1843.         5210.       11307.      3639.          3862.1
 6.   Total Iron         527.2      515.5         358.         647.       210.           265.4
 7.   Iron Load          626.6      275.0        1096.        2193.       722.3          812.5
 8.   Ferrous Iron       364.8      360.5         348.         807.       251.9          258.
 9.   SO4               3887.4     3804.         1583.        3933.      1105.2         1173.5
10.   SO4 Load          3837.      2108.         5857.       13711.      4198.          4341.7

Log. Data
 5.   Log Acid Load        3.120      3.265         1.127        1.869      0.631          0.835
 7.   Log Iron Load        2.277      2.439         1.352        2.168      0.747          1.002
10.   Log SO4 Load         3.175      3.324         1.089        1.917      0.632          0.807
Table 6.5:     Two Measures of Quality Control: (1) X̄ ± 2σ̂; (2) Md ± 1.96 [(1.25 H-spread) / (1.35 √N')]

No.   Variable        Mean (X̄)    X̄ ± 2σ̂                  Md ± 1.96(..)            1.96 [(1.25 H-spread) / (1.35 √N')]    Median
 2.   pH                 2.506    2.201 to 2.811          2.137 to 2.863                0.363                          2.500
 3.   Flow             127.2      -546.8 to 801.2         -226.7 to 328.7             277.7                           51.0
 4.   Acid            3621        907 to 6335             1211 to 5867               2328                           3539
 5.   Acid Load       3367        -3911 to 10645          -7612 to 11298             9455                           1843
 6.   Total Iron       527.2      107.2 to 947.2          -134.2 to 1165.2            649.7                          515.5
 7.   Iron Load        626.6      -818.0 to 2071.2        -1714.0 to 2264.0          1989.0                          275.0
 8.   Ferrous Iron     365        -139 to 868.6           -271 to 992                 632                            361
 9.   SO4             3887.4      1677.0 to 6097.8        931.1 to 6676.9            2872.9                         3804.0
10.   SO4 Load        3837        -4559.0 to 12233.0      -8521.4 to 12737.4        10629.4                         2108.0
[...] 1, could increase the
sensitivity too much, and many values of these widely fluctuating parameters would fall outside
the limits, thereby calling for action. If fluctuations arise from "natural causes" and not from
mining activity, this would be undesirable. The entire range of pH, for example, is
small (2.1-3.1), and the discharge is consistently acidic.

Model Identification

Autocorrelation functions form the basis for model identification in applying full-scale Box-
Jenkins time series  analysis.  Hence, the autocorrelation and partial autocorrelation functions
were run on the data for each variable. The graphs are presented in Figures 6.8a, 6.8b, 6.8c,
6.8e, and 6.8g, for the autocorrelation functions (Acf) and Figures 6.8d, 6.8f, and 6.8h for the
partial autocorrelation functions (Pacf).
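
For readers who want to reproduce this identification step, the sample Acf and Pacf can be computed directly. A minimal sketch follows, assuming the usual sample autocorrelation estimator and the Durbin-Levinson recursion for the partial autocorrelations; the report's software may differ in detail, and the toy series is hypothetical, not data from this study.

```python
def acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]


def pacf(r):
    """Partial autocorrelations from autocorrelations r (Durbin-Levinson)."""
    out = []
    phi_prev = []
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]
            phi = [phi_kk]
        else:
            num = r[k - 1] - sum(phi_prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        out.append(phi_kk)
        phi_prev = phi
    return out


toy_series = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values for illustration
toy_acf = acf(toy_series, 2)
toy_pacf = pacf(toy_acf)
print(toy_acf, toy_pacf)
```

By construction the lag-1 partial autocorrelation equals the lag-1 autocorrelation. Correlations larger in magnitude than about 2/√n are commonly treated as significant spikes; that rule of thumb is a general convention rather than something stated in this chapter.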

-------
Figure 6.8a:  Autocorrelation Function of pH
    [Minitab character correlogram, ACF C2: lags 1 to 23 on a correlation scale of -1.0 to 1.0. Values and bars are not reliably legible in the scan.]
Figure 6.8b: Autocorrelation Function of Iron
    [Minitab character correlogram, ACF C6: lags 1 to 23 on a correlation scale of -1.0 to 1.0. Values and bars are not reliably legible in the scan.]

-------
Figure 6.8c: Autocorrelation Function of Flow
    [Minitab character correlogram, ACF C3: lags 1 to 23 on a correlation scale of -1.0 to 1.0. Values and bars are not reliably legible in the scan.]
Figure 6.8d: Partial Autocorrelation Function of Flow
    [Minitab character correlogram, PACF C3: lags 1 to 23 on a correlation scale of -1.0 to 1.0. Values and bars are not reliably legible in the scan.]

-------
Figure 6.8e:  Autocorrelation Function of Acidity
    [Minitab character correlogram, ACF C4: lags 1 to 23 on a correlation scale of -1.0 to 1.0. Values and bars are not reliably legible in the scan.]
Figure 6.8f:  Partial Autocorrelation Function of Acidity
    [Minitab character correlogram, PACF C4: lags 1 to 23 on a correlation scale of -1.0 to 1.0. Values and bars are not reliably legible in the scan.]

-------
Figure 6.8g:  Autocorrelation Function of Acid Load
    [Minitab character correlogram, ACF C5: lags 1 to 23 on a correlation scale of -1.0 to 1.0. Values and bars are not reliably legible in the scan.]
Figure 6.8h: Partial Autocorrelation Function of Acid Load
    [Minitab character correlogram, PACF C5: lags 1 to 23 on a correlation scale of -1.0 to 1.0. Values and bars are not reliably legible in the scan.]

-------

pH (Figure 6.8a), flow (Figure 6.8c), and the three load parameters (e.g., see Figure 6.8g for acid
load) yield similar autocorrelation functions (Acfs). The concentration variables acidity
(Figure 6.8e), iron (Figure 6.8b), and sulfate (figure not available) also show similar Acfs, but
the former set (which includes load) differs from the latter. The former set shows a strong
decline throughout the function. This decline is confirmed by the single large spike at lag 1 in
the corresponding partial autocorrelation functions (Pacfs, Figures 6.8d and 6.8h). This behavior
implies that all these variables require a first difference to remove the trend. The Acf and Pacf
for each of the concentration variables (e.g., Figures 6.8e and 6.8f) suggest moving average
(MA) models with at most two terms (or one term and a first difference). It is perhaps advisable
to try an autoregressive moving average (ARMA) model in which the AR term could proxy for
the first difference and the MA term would take care of the remainder.

Model Fitting:  pH

It was decided to attempt to fit an autoregressive integrated moving average (ARIMA) model
(1,1,1) to variation in pH. The correlation coefficient between the AR and MA coefficients was
r = 0.81, which implies that they are closely associated (i.e., both are unlikely to be necessary).
Testing the Acf of the residuals yielded a chi-square = 27.16 with 28 degrees of freedom (i.e.,
the Acf is not significantly different from that of white noise). There is only one significant
spike, at lag 20, in this Acf; thus, it is effectively clean. Any further differencing results in
overdifferencing (i.e., the chi-square of the Acf becomes significant again). The model has
improved the variation (the Pacf of the residuals has no significant spikes) but contains an
unnecessary coefficient, $\hat{\phi}_1$. Clearly, the AR (1) is adequately taken care of by the first
difference.
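
The chi-square figures quoted for residual Acfs in this chapter are of the kind produced by a portmanteau test. The sketch below assumes the Box-Pierce form, Q = n Σ r_k², with degrees of freedom equal to the number of lags minus the number of fitted parameters; the report does not say which variant its software used, nor how many lags were summed. The degrees of freedom quoted in this chapter (28 with two fitted parameters, 29 with one, 27 with three) are consistent with 30 lags, but that reading is an assumption. The function name and toy residuals are illustrative.

```python
def box_pierce(residuals, max_lag, n_params):
    """Return (Q, degrees of freedom) for the Box-Pierce portmanteau statistic."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        # Lag-k autocorrelation of the residuals.
        r_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        q += r_k * r_k
    return n * q, max_lag - n_params


toy_residuals = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical, for illustration only
q_stat, dof = box_pierce(toy_residuals, max_lag=2, n_params=1)
print(q_stat, dof)
```

Q is then compared with a chi-square distribution on the returned degrees of freedom; a value near the degrees of freedom, as with 27.16 on 28, gives no evidence against white-noise residuals.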

If we now fit a moving average model with a first difference (i.e., an MA (0,1,1) model), the Acf
of the residuals yields a chi-square of 26.87 with 29 degrees of freedom (thus, not significantly
different from an Acf of white noise). Any further differencing overcompensates. The only
significant spike is at lag 20, as in the previous model. Because this is an isolated significant
autocorrelation far from zero lag, it is considered a random discrepancy. The Pacf of the
residuals is also clean. The 95% confidence limits around the MA coefficient ($\hat{\theta}_1$) do not
contain zero. Hence, the MA coefficient is significantly different from zero (real) and,
incidentally, about the same size as in the ARIMA model ($\hat{\theta}_1 = 0.594$). The residual standard
deviation is $\sigma_e = 0.126$, a reduction from the standard deviation of the original pH data (0.152).
The relationship may be expressed as:

                                   $z_t = z_{t-1} + a_t - 0.594\,a_{t-1}$
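
To make the fitted pH relationship concrete, the sketch below steps through one-step-ahead forecasts and residuals for a model of the form z_t = z_{t-1} + a_t - θ a_{t-1}. It is an illustration only: the starting shock is assumed to be zero, the function name is made up, and the short pH-like series is hypothetical rather than taken from the Ernest data.

```python
def ima_one_step(series, theta):
    """One-step forecasts and residuals for z_t = z_{t-1} + a_t - theta * a_{t-1}."""
    forecasts = []
    residuals = []
    prev_resid = 0.0  # assumption: the shock before the first observation is zero
    for t in range(1, len(series)):
        forecast = series[t - 1] - theta * prev_resid
        resid = series[t] - forecast
        forecasts.append(forecast)
        residuals.append(resid)
        prev_resid = resid
    return forecasts, residuals


toy_ph = [2.5, 2.6, 2.4, 2.5]  # hypothetical pH-like values
forecasts, residuals = ima_one_step(toy_ph, theta=0.594)
print(forecasts, residuals)
```

Substituting the residual into the forecast shows that each forecast is (1 - θ) times the latest observation plus θ times the previous forecast, so a model of this form behaves as an exponentially weighted moving average with weight 1 - 0.594 = 0.406 on the newest value.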

-------
Model Fitting: Flow (Log)

An AR (1,0,0) model was fitted to the variation in logarithms of the flow variable; it was
considered that the AR(1) coefficient would "take care of" the first difference. The chi-square of
the residual Acf = 30.06 with 28 degrees of freedom (0.50 > P > 0.30; i.e., not significantly different
from that expected from white noise). There are no significant spikes in the Acf or Pacf values.
This model yields the following equation, with standard deviation of the residuals $\sigma_e = 0.347$
(reduced from 0.697 for the original standard deviation of the logarithms of flow in Table 6.2):

                               $z_t = 0.873\,z_{t-1} + 1.636 + a_t$
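
Two quick consistency checks on the fitted AR(1) equation can be sketched as follows. They rely only on standard properties of a stationary AR(1) process and on the numbers quoted above. The second check rests on an assumption: read literally as an additive constant, 1.636 implies a long-run mean far above the log-flow range of roughly 0 to 3 shown in Figure 6.3b, so 1.636 may instead be the series mean. The report does not say which is intended.

```python
import math

phi = 0.873          # AR(1) coefficient from the fitted equation
constant = 1.636     # constant as printed in the equation
sd_log_flow = 0.697  # standard deviation of log flow quoted from Table 6.2

# For a stationary AR(1), Var(z) = Var(a) / (1 - phi**2), so the residual
# standard deviation implied by the series standard deviation is:
implied_resid_sd = sd_log_flow * math.sqrt(1.0 - phi ** 2)

# If 1.636 is an additive constant, the long-run mean is constant / (1 - phi).
mean_if_constant = constant / (1.0 - phi)

# Alternative reading (an assumption): 1.636 is the series mean, in which case
# the additive constant would be mean * (1 - phi).
constant_if_mean = constant * (1.0 - phi)

print(round(implied_resid_sd, 3), round(mean_if_constant, 2), round(constant_if_mean, 3))
```

The implied residual standard deviation is about 0.340, close to the 0.347 reported. The literal reading gives a long-run mean near 12.9 in log units, whereas the alternative reading would correspond to an additive constant of about 0.208.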

Model Fitting: Acidity (Log)

From the Acf, developed during the identification step of the Box-Jenkins analysis, it was decided
to try an MA (0,1,2) model, which would presumably clear out the large spikes at the first three
lags in the Acf. Upon fitting, it turned out that the correlation coefficient between the two
moving average coefficients ($\hat{\theta}_1$ and $\hat{\theta}_2$) was -0.612 (i.e., as one increased the other
decreased). A chi-squared test of the residual Acf yielded 29.86 with 28 degrees of freedom
(0.50 > P > 0.30). The Acf spike at lag 6 is significantly larger than its error.

Upon testing the coefficients of this model, $\hat{\theta}_1 = 0.642$ and is real, but the second,
$\hat{\theta}_2 = -0.640$, has a confidence belt that includes zero. The standard deviation of the residuals is 0.139.

An MA (0,0,2) model showed no correlation between the two coefficients or between either
coefficient and the mean. The residual chi-square = 32.77, with 27 degrees of freedom (0.30 > P
> 0.20), is not significantly different from that expected from white noise (random error). The
relevant equation is:

                         $z_t = 3.536 + a_t + 0.205\,a_{t-1} + 0.274\,a_{t-2}$

with standard deviation of the residuals $\sigma_e = 0.136$, a small improvement over the MA (0,1,2)
model and some slight improvement over the original standard deviation (0.147) of the variable
logarithms given in Table 6.2.
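
The autocorrelations and series standard deviation implied by the fitted MA (0,0,2) equation can be worked out from its coefficients. The sketch below uses the textbook result for a moving average process written with plus signs, as the equation is printed here; the function name is illustrative, and the comparison with the reported 0.147 is a plausibility check, not part of the original analysis.

```python
import math


def ma_acf(psi):
    """Theoretical autocorrelations of z_t = mu + a_t + psi[0]*a_{t-1} + psi[1]*a_{t-2} + ..."""
    weights = [1.0] + list(psi)
    gamma0 = sum(w * w for w in weights)
    q = len(psi)
    return [
        sum(weights[j] * weights[j + k] for j in range(q + 1 - k)) / gamma0
        for k in range(1, q + 1)
    ]


psi = [0.205, 0.274]  # coefficients on a_{t-1} and a_{t-2} in the fitted equation
rho = ma_acf(psi)

# Var(z) = Var(a) * (1 + psi_1**2 + psi_2**2), so with a residual standard
# deviation of 0.136 the implied standard deviation of the series is:
implied_series_sd = 0.136 * math.sqrt(1.0 + sum(p * p for p in psi))
print(rho, implied_series_sd)
```

The implied autocorrelations are about 0.23 at lag 1 and 0.25 at lag 2, and zero beyond lag 2 by construction. The implied series standard deviation of about 0.144 is close to the 0.147 quoted from Table 6.2.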

Model Fitting: Acid Load (Log)

As a first approximation, an MA (0,1,1) was fitted to these data and a trend term was included to
determine if it gave rise to any improvement. The Acf of the residuals yielded a chi-square =
22.41 with 28 degrees of freedom (0.80 > P > 0.70), not significantly different from an Acf of
white noise. A barely significant spike occurred at lag 16 in the Acf and Pacf. It was not
supported by any other diagnostic characteristic and so was ignored. The correlation coefficient
between the trend constant and the MA coefficient ($\hat{\theta}_1$) was -0.01. Therefore, they are
effectively independent. However, on testing the trend term, its 95% confidence limits include
zero, and therefore the trend constant does not make any real contribution to explaining the
variation of log acid load. The MA coefficient $\hat{\theta}_1 = 0.247$ and is real. The equation may be
expressed as follows (the trend term is omitted for the reasons given above, and $z_t$ here must be
read as the first difference of log acid load for the equation to agree with the (0,1,1) form):

                                    $z_t = a_t - 0.247\,a_{t-1}$

The standard deviation of the residuals is 0.355, which is a little over half the original
standard deviation of 0.614.

Two other models were fitted to these data (an ARI (1,1,0) and an ARMA (1,0,1)), again
assuming that the AR coefficient would proxy for the first difference in the ARMA model. A
chi-square of the residuals from the ARI model yielded 22.11 with 29 degrees of freedom (0.90
> P > 0.80). Clearly, the first differences and the autoregressive coefficient ($\hat{\phi}_1$) reduced any
unusual occurrences in the data. There were no significant spikes in the Acf, but there is a
possible one at lag 16 in the Pacf (as in the MA (0,1,1) model). The AR coefficient was
significantly different from zero ($\hat{\phi}_1 = -0.203$) and the standard deviation of the residuals is
$\hat{\sigma}_e = 0.353$, a considerable reduction from the original value of 0.614 for the standard deviation
of the logarithms (see Table 6.2). The equation is:

                               $z_t = 0.797\,z_{t-1} + 0.203\,z_{t-2} + a_t$
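
The level form of an ARI (1,1,0) model follows from multiplying the autoregressive operator (1 - φB) by the differencing operator (1 - B). The sketch below carries out that multiplication for φ = -0.203, the coefficient quoted above; the helper name is illustrative. It shows that the coefficients on z_{t-1} and z_{t-2} are 1 + φ and -φ, which must sum to one whenever a first difference is present.

```python
def poly_mul(a, b):
    """Multiply two polynomials in the backshift operator B (lowest power first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


phi = -0.203               # AR coefficient quoted for the ARI (1,1,0) model
ar_poly = [1.0, -phi]      # 1 - phi * B
diff_poly = [1.0, -1.0]    # 1 - B
combined = poly_mul(ar_poly, diff_poly)  # 1 - (1 + phi) B + phi B^2

# Moving the lagged terms to the right-hand side flips their signs, giving
# the coefficients on z_{t-1} and z_{t-2}.
level_coeffs = [-c for c in combined[1:]]
print(level_coeffs)
```

For φ = -0.203 the coefficients come out as 0.797 on z_{t-1} and 0.203 on z_{t-2}.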

The ARMA (1,0,1) model possessed two coefficients and a mean. Their respective correlations
were $r_{12}$ ($\hat{\phi}_1$ vs. $\bar{X}$) = 0.03, $r_{13}$ = 0.55, and $r_{23}$ ($\bar{X}$ vs. $\hat{\theta}_1$) = 0.01: effectively independent for
the first and third and not very large for the second. The Acf of the residuals yielded a chi-square of
24.32 with 27 degrees of freedom (0.70 > P > 0.50), indicating no significant difference from an
Acf for white noise. The autoregressive coefficient ($\hat{\phi}_1 = 0.881$) and the mean ($\bar{X} = 3.196$)
were real, whereas the 95% confidence limits around the moving average coefficient
($\hat{\theta}_1 = 0.171$) contain zero. The standard deviation of the residuals is 0.347, of the same order of
magnitude as for the previous models fitted to log acid load.
Summary

It is somewhat surprising that there appears to be no seasonal component in the time series
models, particularly in the load variables. The only satisfactory explanation appears to be the
existence of too many maxima at too many different times with very little repetition during the
same time period.

Most of the variables show the presence of a trend over time (pH, flow, acidity, acid load, iron
load, ferrous iron). These variables need a first difference to remove the effects of the trend. It
seems evident from the studies to date that a moving average model applied to the first
differences is almost universally the best choice. In some cases, the autoregressive model,
possibly with a first difference, is also appropriate. In both cases, there is an indication that the
variation in whichever parameter is being analyzed, when first differenced, leads to a random
walk.

The quality control analysis, in both cases, suggests that the mean (plus or minus two
standard deviations) and the non-parametric median (plus or minus a function of the H-spread) are
equally appropriate. For the present, it is recommended that both be used until one or the
other shows superior performance.

Chapter 7: Analysis of Data from the Fisher Deep Mine Site, Lycoming
             County, PA

The Fisher site is located in Lycoming County, Pennsylvania, near the village of English Center
(Figure 7.0). Prior to remining on this site, the land surface was extensively disturbed by
abandoned mine pits and spoil piles, and the Fisher deep mine, a large abandoned underground
mine, occupied much of the subsurface.  Fisher deep mine discharge (monitoring point M-1)
characteristics have been discussed in numerous other reports, including Section 5 of EPA's Coal
Remining Statistical Support Document (EPA-821-B-00-001).

The Fisher deep mine discharge and its impact on the receiving streams are discussed in an
Operation Scarlift Report of 1977 on the Little Pine Creek Watershed. The Buckeye Run and
Otter Run tributaries of the Little Pine Creek were impacted by AMD from the Fisher deep mine.
Otter Run was a prolific native brook trout stream prior to being impacted by the Fisher deep
mine discharge, and it has returned to a trout fishery as the result of remining operations.
Descriptions of the remining operation, geologic characteristics of the area and water quality
improvements are included in Plowman (1989) and Smith and Dodge (1995).

The data set that was analyzed statistically in this chapter (see report by Dr. J.C. Griffiths,
December 1987) includes all baseline pollution load data (i.e., prior to issuance of the first
remining permit) and data from the first year and a half of remining. Baseline pollution load
data collection took place from May/June 1982 through 1985. The primary remining permit was
issued on November 5, 1985, and remining operations commenced by February 1986.  Final coal
removal occurred in June 1995, and backfilling was essentially completed within that permit area
by February 1996. The primary remining permit for this site is contiguous to a previous permit
that did not involve daylighting and to a subsequent remining permit that was issued in 1994 and
completed in July 1999 (that also drained to the M-1 discharge). The total acreage of these three
permits is 542, of which approximately 200 acres were mined under the initial permit (issued
prior to 1985). The data set included in Section 5 of the EPA Coal Remining Statistical Support
Document includes monitoring data for the M-1 discharge from 1981 to 1998.  Time plots and
box plots of net acidity, acid load, iron load and net alkalinity show changes in water quality and
pollution load over the four year baseline period, ten years of remining, and two years following
the completion of backfilling of the remining site.

The data analysis presented in this chapter follows the usual flow diagram (Figure 3.1). The data
consist of 79 observations of seven parameters. Flow measurements began on June 9, 1982 and
remining of the site began on February 4, 1986.  There were three observations prior to June 9,
1982 (see Appendix E Table).  After  excluding these observations and inserting mean values for
samples with a missing parameter, 57 observations remained prior to remining and 19
observations remained after remining commenced. From the histograms  showing skewness of
varying degrees, it was decided to log-transform (base ten) the data.

[Figure 7.0: map showing the location of the Fisher site; image not reproduced]

After preliminary analysis of the data, the bivariate and time series plots appeared to be
somewhat irregular, and it was decided to measure the intervals between the observations by
creating a new variable (the first differences between the number of days).

The intervals between observations (days) vary from extremes of one to 104, with a mean of
26.7 days. This mean is nearly equal to the median (26.5), indicating that the frequency
distribution is symmetrical.  The central part of the distribution (Q1 to Q3) lies between 12.75 and
33 days.  The most serious discrepancies are, however, that there are five observations between
70 and 104 days, and four of these are 90 days or more. These large gaps in the data preclude
rigorous time series analysis, which requires approximately equal intervals between observations.
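
The interval variable described above (first differences between sampling dates) can be sketched as follows. The dates are hypothetical and chosen only to show one long gap, and the 70-day cutoff simply mirrors the range quoted in the text.

```python
from datetime import date
from statistics import mean, median


def sampling_intervals(dates):
    """Days elapsed between consecutive sampling dates (sorted ascending)."""
    ordered = sorted(dates)
    return [(b - a).days for a, b in zip(ordered, ordered[1:])]


# Hypothetical sampling dates, for illustration only.
dates = [date(1982, 6, 9), date(1982, 7, 7), date(1982, 8, 4),
         date(1982, 11, 10), date(1982, 12, 8)]
gaps = sampling_intervals(dates)
long_gaps = [g for g in gaps if g >= 70]  # gaps that break near-equal spacing
print(gaps, mean(gaps), median(gaps), long_gaps)
```

A single long gap pulls the mean well above the median, which is one quick way to spot irregular sampling before attempting time series analysis.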

Univariate Analysis

The coefficients of variation (CV%) for flow, acidity, sulfate and manganese (Table 7.1) are all
about 21% or less. This is surprisingly low when compared to previous data analyses.
Iron, however, possesses a coefficient of variation of 929%, and aluminum also has a large CV
(71%).

Table 7.1:    Summary Statistics for 79 Log Transformed Observations

                                              Trimmed    Standard    Standard Error
               N    N*     Mean     Median     Mean     Deviation     of the Mean
Flow          79     0    1.7882    1.8062    1.7754     0.3734         0.0420
Acidity       79     0    1.8700    1.8274    1.8660     0.2183         0.0246
SO4           79     0    2.5316    2.5105    2.5342     0.2124         0.0239
Total Iron    79     0    0.0442    0.0825    0.0575     0.4106         0.0462
Mn            79     0    0.9396    0.9513    0.9335     0.1716         0.0193
Al            79     0    0.4959    0.5539    0.5029     0.3539         0.0398
Interval      78     1   26.72     26.50     24.20      19.95           2.26

                                     First      Third     Coefficient of
             Minimum    Maximum    Quartile   Quartile    Variation (%)
Flow          0.9542     2.7882     1.4771     2.0000         20.9
Acidity       1.4409     2.3747     1.7076     2.0453         11.7
SO4           1.6902     3.0792     2.4346     2.6335          8.4
Total Iron   -1.301      0.8450    -0.1024     0.2032        929.3
Mn            0.5775     1.5185     0.8500     1.0253         18.3
Al           -0.4948     1.4698     0.3598     0.6628         71.4
Interval      1.00     104.00      12.75      33.00

There is little doubt that the coefficient of variation for iron is misleading and serves to illustrate
one of the dangers of using the CV%. When the mean is very small, as in this case, the CV tends
to become very large, particularly in ratio-type data (i.e., percent or concentration, Griffiths,
1967, Chapter 15, page 316).  It should be used on log data with great care, if at all.
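
The instability of the CV near a zero mean can be checked directly from the rounded means and standard deviations in Table 7.1. The sketch below is illustrative; the function name is an assumption, and small differences from the tabulated CVs are expected because the table was computed from unrounded values.

```python
def coefficient_of_variation(mean, std_dev):
    """CV in percent: 100 * standard deviation / |mean|."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * std_dev / abs(mean)


# (mean, standard deviation) of the log-transformed data, from Table 7.1.
table_7_1 = {
    "Flow": (1.7882, 0.3734),
    "Total Iron": (0.0442, 0.4106),
    "Al": (0.4959, 0.3539),
}
for name, (m, s) in table_7_1.items():
    print(f"{name}: CV = {coefficient_of_variation(m, s):.1f}%")
```

The standard deviations of log iron (0.41) and log aluminum (0.35) are of similar size; the very large CV for iron comes almost entirely from its mean lying close to zero on the log scale.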

When the data are subdivided into 57 observations (from the beginning of flow measurement to
immediately prior to remining, Table 7.2a), and into 19 observations (after commencement of
remining, Table 7.2b), the CVs of flow, acidity, sulfate, and manganese remain substantially
similar.  Iron, however, shows a marked drop from a CV equal to 109.2 % to a CV equal to
50.2%, implying that there was a major change in variability after the start of remining.  The CV
of aluminum, on the other hand, shows no change from the original data set.

Table 7.2a:    Summary Statistics for 57 Log Transformed Observations (Pre-remining)

                                         Trimmed    Standard    Standard Error
               N      Mean     Median     Mean     Deviation     of the Mean
Flow          57     1.7885    1.8062    1.7751     0.3793         0.0502
Acidity       57     1.9176    1.9222    1.9170     0.1964         0.0260
SO4           57     2.4654    2.4771    2.4744     0.1881         0.0249
Total Iron    57     0.2027    0.1461    0.1961     0.2214         0.0293
Mn            57     0.9661    0.9713    0.9652     0.1302         0.0172
Al            57     0.4874    0.5250    0.4928     0.3520         0.0466

                                     First      Third     Coefficient of
             Minimum    Maximum    Quartile   Quartile    Variation (%)
Flow          0.9542     2.7882     1.4771     2.000          21.2
Acidity       1.4564     2.3747     1.7489     2.0737         10.2
SO4           1.6902     3.0792     2.4013     2.5682          7.6
Total Iron   -0.1427     0.7243     0.0453     0.3444        109.2
Mn            0.5775     1.4048     0.8836     1.0528         13.5
Al           -0.4948     1.4698     0.3874     0.6389         72.2

Table 7.2b:   Summary Statistics for 19 Log Transformed Observations (During remining)

                                         Trimmed    Standard    Standard Error
               N      Mean     Median     Mean     Deviation     of the Mean
Flow          19     1.7702    1.8062    1.7551     0.3867         0.0887
Acidity       19     1.6673    1.6928    1.6733     0.1018         0.0234
SO4           19     2.6844    2.6335    2.6830     0.1695         0.0389
Total Iron    19    -0.5345   -0.5376   -0.5170     0.2684         0.0616
Mn            19     0.7988    0.7672    0.7953     0.1430         0.0328
Al            19     0.4865    0.6542    0.4954     0.3818         0.0876

                                     First      Third     Coefficient of
             Minimum    Maximum    Quartile   Quartile    Variation (%)
Flow          1.1461     2.6513     1.4771     2.0000         21.8
Acidity       1.4409     1.7924     1.5694     1.7543          6.1
SO4           2.4150     2.9777     2.5658     2.8751          6.3
Total Iron   -1.3010    -0.0655    -0.6021    -0.3565         50.2
Mn            0.6010     1.0569     0.6656     0.9101         17.9
Al           -0.3010     1.1239     0.2504     0.7627         78.5

The means also show interesting changes. Acidity possesses an overall mean of 1.87.  In
comparison, the mean of acidity prior to remining (1.92) is larger than during remining (1.67; e.g., see
Figure 2.5, Chapter 2).  Sulfate is lower than the overall mean prior to remining and much
higher than the overall mean during remining.  Log iron shows the most substantial change, from
0.20 before remining (approximately 1.6 in untransformed data units) to -0.53 (0.295) after
remining operations began.  This represents a very large and favorable change because the
pollution load has been reduced. Manganese also shows a quite large change from before to
during remining.
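
The conversions from log units back to untransformed units quoted above can be reproduced with a one-line back-transformation. The sketch below uses the iron means from Tables 7.2a and 7.2b; the function name is an assumption. Note that the back-transformed mean of base-ten logs is the geometric mean of the original data, not the arithmetic mean.

```python
def back_transform(log10_value):
    """Convert a base-10 log value back to original units."""
    return 10.0 ** log10_value


# Mean log iron before and during remining (Tables 7.2a and 7.2b).
iron_pre = back_transform(0.2027)
iron_post = back_transform(-0.5345)
print(round(iron_pre, 2), round(iron_post, 2))
```

The second value differs slightly from the 0.295 quoted in the text because the text back-transforms the rounded mean of -0.53.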

The histograms of log transformed flow (Figure 7.1a), acidity (Figure 7.1b), iron (Figure 7.1c),
manganese (Figure 7.1d), and aluminum (Figure 7.1e) are essentially symmetrical; thus the
transformation has sufficed to reduce the asymmetry in the original data.  Because of the gaps in
the data and their peculiar pattern of variation, it was decided to graph some of the parameters to
show the distribution of gaps and to examine the pattern for cycles.

Figure 7.1a:  Histogram of Log Flow

   N = 79

   Midpoint   Count
        1.0       1   *
        1.2       8   ********
        1.4      12   ************
        1.6      11   ***********
        1.8      17   *****************
        2.0      17   *****************
        2.2       5   *****
        2.4       5   *****
        2.6       2   **
        2.8       1   *

Figure 7.1b:  Histogram of Log Acid

   Histogram of ACID   N = 79

   Midpoint   Count
        1.4       1   *
        1.5       4   ****
        1.6       5   *****
        1.7      18   ******************
        1.8      14   **************
        1.9       9   *********
        2.0      10   **********
        2.1       9   *********
        2.2       5   *****
        2.3       3   ***
        2.4       1   *

Figure 7.1c:  Histogram of Log Iron

   Histogram of FE   N = 79

   Midpoint   Count
       -1.2       1   *
       -1.0       1   *
       -0.8       1   *
       -0.6       8   ********
       -0.4       6   ******
       -0.2       3   ***
        0.0      22   **********************
        0.2      17   *****************
        0.4       9   *********
        0.6       9   *********
        0.8       2   **

Figure 7.1d:  Histogram of Log Manganese

   Histogram of MN   N = 79

   [Character histogram with midpoints from 0.6 to 1.5; the bar counts are not
   recoverable from the extracted text.]

Figure 7.1e:  Histogram of Log Aluminum

   Histogram of AL   N = 79

   Midpoint   Count
       -0.4       7   *******
       -0.2       0
        0.0       2   **
        0.2       8   ********
        0.4      16   ****************
        0.6      29   *****************************
        0.8      11   ***********
        1.0       3   ***
        1.2       1   *
        1.4       2   **

Log flow versus days is shown in Figure 7.2a, and there does not appear to be much change
coincident with the start of remining.  Furthermore, peak flows occur in various months
throughout the record; there are two in June (1982, 1985) and two in April (1984, 1987) for
example, but they do not appear to recur each year. No persistent cyclic pattern is evident for
flow.

Log acidity (Figure 7.2b) shows a large change, remaining well above both the mean and median
from November 27, 1981 to September 6,  1984, then falling to the mean around December 18,
1984 and falling consistently below both mean and median following October 26, 1985. This
change took place prior to activation of the remining permit. However, mining was occurring on
an adjacent surface mine prior to 1985.  The mean prior to remining (N = 57, Table 7.2a) is
1.9176 (log transformed), or 82.7 in untransformed units. After remining, the mean is 1.6673
in log-transformed units and 46.5 in untransformed units. When the quality control limits around
the median are inserted using a sample size of 18, the post-remining median is significantly
below the pre-remining limits (see lines in Figure 7.2b).

Graphs of log iron and log manganese are included in Figures 7.2c and 7.2d respectively. Log
iron shows a marked decrease over time during the pre-remining period with a sharp decline
immediately following commencement of remining.  The confidence limits around the median of
pre-remining are inserted in the graph. The median and the confidence limits, after remining
began, are much lower and the median lies outside the confidence limits of pre-remining.

Log manganese also shows a decrease after remining began but is not nearly so marked as is log
iron. However, as in the case of log iron, the median log manganese after remining remains
outside (below) the pre-remining confidence limits. Log aluminum is plotted against days
(Figure 7.2e), and the pre- and post-remining statistics are fairly similar. The post-remining
median lies within the confidence limits (for N' = 18) of the pre-remining performance.  There is
no substantial change in the central tendency.

Figure 7.2a:    Log Flow vs. Time (Days)  [plot not reproduced]

Figure 7.2b:    Log Acidity vs. Time (Days)  [plot not reproduced]

Figure 7.2c:    Log Iron vs. Time (Days)  [plot not reproduced]

Figure 7.2d:    Log Manganese vs. Time (Days)  [plot not reproduced; limits are marked for the pre-remining and remining periods]

Figure 7.2e:    Log Aluminum vs. Time (Days)  [plot not reproduced]

Another comparison of interest between pre- and post-remining water quality concerns
variances, and parameter pairs may be compared by a variance ratio or F-test. It is customary to
divide the smaller variance into the larger so that the outcome always equals or exceeds one.
This is in accord with the F-table of values, which tests only that half of the F distribution that
exceeds unity. In this case, it begins with the flow variables as follows: the calculated ratio, Fcalc
= 0.1495/0.1439 = 1.039.  The expected F-value (F(18,55) = 1.79, at the five percent level) is much
larger; thus, there is no significant difference in the variances before and after remining.

In the variance ratio for acidity, Fcalc equals 0.0386/0.0104 = 3.722. The five percent level of
expected F for these degrees of freedom (df) is F(50,18) = 2.02, so that the variance for pre-remining
is significantly larger than that for post-remining (a desirable outcome).  The same test
performed for SO4 yields Fcalc equal to 0.0354/0.0287 = 1.232. Thus, the pre-remining variance
is not significantly different from that after remining.

For iron, Fcalc equals 0.0720/0.0490 = 1.469 (not significant at the 5% probability level,
for which F(18,55) = 1.79).  The variance of iron concentration after remining is not significantly different
from the variance of iron concentration before remining. Manganese has a slightly larger
variance post- than pre-remining, yielding an Fcalc = 0.020/0.017 = 1.206, but the difference in
variability is not statistically significant. Aluminum also possesses a larger variance after
remining than before; the corresponding Fcalc = 0.1458/0.1239 = 1.176. This difference is not
significant.
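
The variance ratios above can be recomputed from the standard deviations in Tables 7.2a and 7.2b. The sketch below is a rough check under stated assumptions: the helper name is illustrative, and the 5% critical values are simply the two quoted in the text (2.02 when the pre-remining variance is the numerator, 1.79 when the post-remining variance is the numerator) rather than values recomputed from an F distribution.

```python
def variance_ratio(sd_pre, sd_post):
    """Return (F, numerator) with the larger variance on top, so F >= 1."""
    var_pre, var_post = sd_pre ** 2, sd_post ** 2
    if var_pre >= var_post:
        return var_pre / var_post, "pre"
    return var_post / var_pre, "post"


# Standard deviations of the log data from Tables 7.2a (pre) and 7.2b (post).
std_devs = {
    "Flow": (0.3793, 0.3867),
    "Acidity": (0.1964, 0.1018),
    "SO4": (0.1881, 0.1695),
    "Total Iron": (0.2214, 0.2684),
    "Mn": (0.1302, 0.1430),
    "Al": (0.3520, 0.3818),
}
# 5% critical values as quoted in the text (assumed, not recomputed here).
critical = {"pre": 2.02, "post": 1.79}

for name, (sd_pre, sd_post) in std_devs.items():
    f_calc, numerator = variance_ratio(sd_pre, sd_post)
    verdict = "significant" if f_calc > critical[numerator] else "not significant"
    print(f"{name}: F = {f_calc:.3f} ({numerator} variance larger), {verdict}")
```

Only acidity exceeds its critical value, in agreement with the discussion above.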

The behavior of the variance before and after the beginning of remining is as important as the
differences in mean or median. This is because if the variance becomes significantly smaller
after remining begins, the observed value is much less likely to exceed the confidence limits at
some future time, assuming that the behavior remains consistent.

Bivariate Analysis

Examination of the relationship among pairs of variables begins with the correlations of zero
order among all pairs of the seven variables (Table 7.3).  The critical value of the correlation
coefficient at the 5% level for a pair of variables with 79 observations (= 78 df), each taken from a
population in which there is no correlation, is approximately r = 0.217 (using 80 df from Table 21,
Arkin and Colton, 1950, p. 140, Table of r for the 1% and 5% points of the r distribution).  This
means that any |r| < 0.217 is not significantly different from zero. This value is found at the top
of Table 7.3.

Table 7.3:    Correlations of Zero Order Among the Seven Variables

              Correlation of Seven Variables
              r (5% level, 80 df) = 0.217

                Days      Flow    Acidity     SO4    Total Iron     Mn
Flow          -0.044
Acidity       -0.783     0.082
SO4            0.199    -0.006    -0.093
Total Iron    -0.852    -0.013     0.724    -0.211
Mn            -0.410    -0.042     0.457     0.396      0.510
Al            -0.121    -0.307     0.058     0.250      0.055     0.361

Pairing each variable in turn against days shows that a linear association between flow, sulfate,
or aluminum and days is unlikely. The relationship between acidity and days is negative (i.e.,
inverse): acidity decreases as days increase.  This is also the case with the relationship between
days and iron. In both cases, the proportion of common association (r²) among the pairs of
variables is large, 61% for acidity and 73% for iron.  Manganese also shows an inverse
relationship with days, but the degree of association is much less (r² = 17%).

The relationship between flow and the other variables, as measured by the correlation coefficient, is
effectively zero. The exception is aluminum, where the relationship is negative (inverse) and the
degree of association is not very strong (r² = 9%).

Acidity appears to have no simple linear relationship with either sulfate or aluminum; however,
it is positively associated with iron, possessing an r² = 52% in common.  Acidity has r² = 21%
common association with manganese, and again the relationship is positive (i.e., they increase or
decrease together).  Sulfate and manganese are positively associated, but the degree of common
association is weak. Variation in manganese is related to variation in iron in the same way but to
a slightly greater degree.  Manganese is also weakly positively associated with aluminum (the
degree of common association r² = 13%).  The strongest correlation coefficient values are
between the pairs of acidity and days, and iron and days. The decline of acidity and iron with
time is obvious in Figures 7.2b and 7.2c. It is no surprise, therefore, that the third strongest
association is the positive one between iron and acidity.
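
The critical value of r and the percentages of common association quoted above can be checked with a short calculation. This is a sketch under stated assumptions: the critical r is derived from the two-tailed 5% point of Student's t for 80 df (about 1.990, a standard table value that does not appear in the report), and the function names are illustrative.

```python
import math


def critical_r(t_critical, df):
    """Smallest |r| significantly different from zero for a given t critical value."""
    return t_critical / math.sqrt(t_critical ** 2 + df)


def shared_variance_percent(r):
    """Proportion of common association, r squared, expressed in percent."""
    return 100.0 * r * r


# Two-tailed 5% point of Student's t for 80 df, from standard tables (assumed).
R_CRIT = critical_r(1.990, 80)

# Selected zero-order correlations copied from Table 7.3.
correlations = {
    "Acidity vs. Days": -0.783,
    "Total Iron vs. Days": -0.852,
    "Total Iron vs. Acidity": 0.724,
    "Mn vs. Days": -0.410,
    "Al vs. Flow": -0.307,
    "Flow vs. Days": -0.044,
}

print(f"critical r = {R_CRIT:.3f}")
for pair, r in correlations.items():
    flag = "significant" if abs(r) > R_CRIT else "not significant"
    print(f"{pair}: r = {r:+.3f}, r^2 = {shared_variance_percent(r):.0f}%, {flag}")
```

The relation used for the critical value follows from the usual t statistic for a correlation coefficient, t = r * sqrt(df) / sqrt(1 - r^2), solved for r.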

As a check that perhaps the maximum degree of association was not of zero order, the
cross-correlation functions (CCF) were run and the principal outcomes are listed in Table 7.4.  To
evaluate these functions, it is reasonable to take a conservative value (for example, r = 0.3) as the
limit below which the relationship is not significantly different from zero. In the case of flow
versus the other variables, there appears to be no linear association except for aluminum, which
has its highest value as inverse (-0.389) at lag -4. It is likely that the zero order value of -0.302
is not really significantly different from the r value at lag -4.

Table 7.4:     Summary of Important Cross-correlation Functions (CCF) Among Seven
               Variables

       Variables             r max     lag @ r max    |r| > 0.3
  1.   Flow vs. Acid          0.248       -16         none
  2.   Flow vs. SO4          -0.254        -4         none
  3.   Flow vs. Fe            0.214        10         none
  4.   Flow vs. Mn           -0.276        -4         none
  5.   Flow vs. Al           -0.389        -4         -4, -3, 0
  6.   Acid vs. SO4          -0.367        14         2 to 4, 14, 15
  7.   Acid vs. Fe            0.724         0         -14 to 14, 16 to 18
  8.   Acid vs. Mn            0.457         0         0, 1
  9.   Acid vs. Al            0.252       -18         none
 10.   SO4 vs. Fe            -0.361        -7         -9 to -6, -4, 2
 11.   SO4 vs. Mn             0.396         0         0, 3 to 5
 12.   SO4 vs. Al             0.299         1         none
 13.   Fe vs. Mn              0.511         0         -2 to 3
 14.   Fe vs. Al             -0.188       -12         none
 15.   Mn vs. Al              0.441         1         -13, 0, 1
 16.   Acidity vs. days      -0.783         0         -18 to 17
 17.   Iron vs. days         -0.852         0         -17, -15 to 13

The maximum cross-correlations of acidity with sulfate, iron, and manganese all exceed the
critical value of 0.3 in absolute value. The cross-correlation function for sulfate has three values
exceeding 0.3 (at lags of +4, +2 and +14). However, these values are all indicative of a low degree
of association (<13%) between the two variables. Iron and manganese achieve their maximum r at
zero lag.

Sulfate versus iron, manganese, and aluminum show correlations between 0.3 (0.299) and 0.396
in absolute value. These values are all significantly different from zero. Correlations are positive
between sulfate, manganese, and aluminum, but negative with iron.  Sulfate behaves independently
in all associations with other variables.  The maximum correlation between iron and manganese
(0.510) occurred at zero lag; the degree of association (r² = 26%) is modest. The relationship
between iron and aluminum never exceeded the critical value of r = 0.217 (see Table 7.3).

The relationship of manganese and aluminum exceeds the critical value of 0.217 at lag -13 to
-10, lag 0 to 1, and lag 17.  The maximum r (0.441) occurs at lag 1. Again, if this is a real
association, it is weak (r² = 19%).
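
The search for the lag of maximum cross-correlation summarized in Table 7.4 can be sketched as follows. This is an illustration only and not the program used for the report: the function names, the sign convention for the lag (x at time t paired with y at time t + lag), and the sample series are all assumptions.

```python
import math


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag].

    Deviations are taken from the full-series means and scaled by the
    full-series sums of squares. The sign convention for lag is an assumption.
    """
    n = len(x)
    if len(y) != n or abs(lag) >= n:
        raise ValueError("series must have equal length and |lag| < n")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dev_x = [v - mean_x for v in x]
    dev_y = [v - mean_y for v in y]
    scale = math.sqrt(sum(d * d for d in dev_x) * sum(d * d for d in dev_y))
    if lag >= 0:
        pairs = zip(dev_x[:n - lag], dev_y[lag:])
    else:
        pairs = zip(dev_x[-lag:], dev_y[:n + lag])
    return sum(a * b for a, b in pairs) / scale


def lag_of_max_abs_r(x, y, max_lag):
    """Return (lag, r) where |r| is largest over lags -max_lag..max_lag."""
    results = [(lag, cross_correlation(x, y, lag))
               for lag in range(-max_lag, max_lag + 1)]
    return max(results, key=lambda item: abs(item[1]))


# Hypothetical series: y repeats x one step later.
x = [1, 2, 3, 4, 5, 4, 3, 2]
y = [2, 1, 2, 3, 4, 5, 4, 3]
best_lag, best_r = lag_of_max_abs_r(x, y, 3)
print(best_lag, round(best_r, 3))
```

A lag is a count of observations, so the calculation treats the observations as equally spaced. With the irregular sampling intervals noted earlier in this chapter, lags do not correspond to fixed spans of days.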

Bivariate relationships between pairs of water quality parameters and flow vs. water quality
parameters were plotted.  The results yielded very little that was meaningful with the exception
of iron vs. acidity (Figure 7.3).  This relationship had the highest positive correlation coefficient
(r = 0.724).
Figure 7.3:    Plot of Log Iron vs. Acidity

MTB >  PLOT C5 VS C3

                                  r=0.724

    0.70 +
FE     -                            *     * **
                             *******
       _              ***o****3*      *  * * 2
    0.00+      *     * *  *23*  2    * *   **
                        * *   *
       _          * * *    *   2 ^
   -0.70+                 *  *
   -1.40+
         __ +	+	+	+	+	+	ACID
        1.40      1.60      1.80      2.00      2.20      2.40


Time Series Analysis

This analysis is subdivided into three parts.  First, there is a plot of each variable versus the date
collected.  The dates are forced into 79 equal intervals, distorting the graph in terms of horizontal
scale. (The correct spacing may be seen in Figures 7.2a to 7.2d.)  The second subdivision details
the diagnosis phase of the Box-Jenkins time series analysis using the autocorrelation and partial
autocorrelation functions (Acf and Pacf, respectively). The third stage comprises modeling
using Box-Jenkins estimation and forecasting programs.

The time series graphs begin with a plot using the variable of the first differences between
collection dates against the observation number(s) (Figure 7.4a). The trend increases
consistently through time. Figure 7.4a is included as an example of what happens when a
variable of known structure is analyzed,  where any variable with a constant function (increasing
or decreasing) over time will yield a typical Acf and Pacf (Figures 7.5a and 7.5b).

Log flow is plotted against equal intervals in Figure 7.4b. The variation around the mean
appears to remain reasonably constant throughout the period  of observation.  By contrast, in a
plot of log acidity versus date (Figure 7.4c) the variation in acidity is consistently high and above
the mean until approximately the 29th observation (September 6, 1984), when it decreases to the
mean from observation 30 to 36 (September 21, 1984 to November 23, 1984). The pattern of
variation then falls well below the mean from observation 37 to 49 (December 18, 1984 to May
22, 1985); from observation 50 to 54 (May 28, 1985 to August 19, 1985) it remains close to the
mean, and from 55 to 79 (September 21, 1985 to August 12, 1987) the range of acidity values
remains consistently below the mean. Remining began at observation 61 (February 17, 1986).

Sulfate versus observation number (Figure 7.4d) shows no substantive change in variation
around the mean throughout all 79 observations. The plot of iron versus observation number
(Figure 7.4e) shows a slight decreasing trend prior to remining at observation 61. From
observation 62 (March 22, 1986) onwards, the variation is well below the mean with two
observations (65, June 10, 1986 and 79, August 12, 1987) below the lower two standard
deviation limit.

Fluctuations in the concentrations of manganese (Figure 7.4f) are quite large, particularly in the
beginning (observations 1 through 3, November 27, 1981 to May 19,  1982).  From observation 4
through observation 66 (June 9, 1982 through July 15,  1986), the fluctuations are around the
mean (= the median), and from observation 67 to 79 (August 12, 1986 to August 12, 1987), the
observations tend to fall below the mean, varying widely, from observation 73 (February 14,
1987) slightly above the mean to observation 70 and observations 75 to 79 (November 15, 1986
and April 11, 1987 to August 12, 1987) near the lower  confidence limits.

The time series plot of aluminum (Figure 7.4g) begins well  above the mean (= the median) in
observations 3 to 9 (May 19, 1982 to September 18, 1982),  then falls well below the mean for
observations 11 to 22 (November 13, 1982 to March 2, 1984).  Observations 11, 15, and 19 to 21
(November 13, 1982, May 18, 1983 and December 15, 1983 to January 28, 1984) are all below
the lower confidence limits. For observations 23 to 79 (March 31, 1984 to August 12,  1987), the
concentration falls around the mean with two strong deviations to the lower confidence limits at
observations 71 and 72 (December 13,  1986 and January 17, 1987). Remining does not seem to
have had any consistent effect.


Figure 7.4a:   Collection Dates vs. Observation Number (First Differences)
      MTB > TSPLOT C1
      [Minitab character plot; the plotted points are not recoverable from the source text.]
Figure 7.4b:  Plot of Log Flow vs. Time
      MTB > TSPLOT
      [Minitab character plot of FLOW against observation number; the plotted points are not recoverable from the source text.]



Figure 7.4c:  Plot of Log Acidity vs. Time
     MTB > TSPLOT
     [Minitab character plot of ACID against observation number; the plotted points are not recoverable from the source text.]
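Figure 7.4d:  Plot of Log Sulfate vs. Time
     [Minitab character plot against observation number; the plotted points are not recoverable from the source text.]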
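Figure 7.4e:  Plot of Log Iron vs. Time
     [Minitab character plot of FE (axis ticks at 0.75, 0.00, -0.75, and -1.50) against observation number (10 to 80); the plotted points are not recoverable from the source text.]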
Figure 7.4f:  Plot of Log Manganese vs. Time
     MTB > TSPLOT C6
     [Minitab character plot of MN against observation number; the plotted points are not recoverable from the source text.]
Figure 7.4g:   Plot of Log Aluminum vs. Time
     MTB > TSPLOT C7
     [Minitab character plot (axis ticks at 0.75, 0.00, and -0.75) against observation number; the plotted points are not recoverable from the source text.]
Diagnosis of Time Series Models Using Autocorrelation and Partial Autocorrelation
Functions

The Acf of days or dates of observation numbers 1 through 79 (Figure 7.5a), shows a steep but
uniform decline as would be expected from the consistent increase (i.e., a strong trend) present in
Figure 7.4a.  The corresponding Pacf (Figure 7.5b) has one large spike at lag 1.  The first
difference is likely to be a random walk.

The Acf of flow has no distinct patterns, with a single small spike at lag 1 (Figure 7.5c). The
Pacf is similar to the Acf (Figure 7.5d), and a simple autoregressive (AR(1)) or moving average
(MA(1)) model would do equally well (or poorly) in describing the behavior.  Acidity, on the
other hand, shows a marked decline in the Acf (Figure 7.5e) similar to that in Figure 7.5a, but
somewhat less uniform.  The Pacf (Figure 7.5f) has one large spike at lag 1.  Here, a first
difference is necessary to reduce the variation to a stationary series.  Then a  simple AR or MA
would probably suffice.  The Acf and Pacf of sulfate are very similar (Figures 7.5g and 7.5h,
respectively), and resemble the corresponding graphs of flow (Figures 7.5c and 7.5d). A small
spike at lag 1 and a few subdued features are not likely to be significant.

Iron shows an exponential decline in Acf (Figure 7.5i). The Pacf (Figure 7.5j) has a large spike
at lag 1, indicating there is a trend over time which should be removed before the series becomes
approximately stationary.  The other features appear to be overwhelmed by the trend.

The Acf and Pacf of variation in concentration of log manganese (Figures 7.5k and 7.5l) show
similar, if slightly less distinct, characteristics as log Fe concentration. Their variations, in
overall terms, are somewhat similar. Again, a large spike at lag 1 requires a first difference, but
the remainder of the variation is probably not significant. The variation in Acf and Pacf of
aluminum (Figures 7.5m and 7.5n) are very  similar to each other and to sulfate. Modeling should
begin with a simple AR(1) or MA(1) and the complexity should be increased if there are any
spikes which are significantly above background.

It appears evident that there are two types of variables in terms of their variation patterns. The
first type is very like the first differences (e.g., Figure 7.5a) in possessing a strong trend, not
always uniformly decreasing;  but acidity, iron and possibly manganese all decrease over time.
The second type (e.g., flow, sulfate, and aluminum) shows no well-marked trend but is much
more irregular in behavior. When the trend  is removed,  residual variation will possibly be
similar in all six variables.
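The diagnostic quantities discussed above can be reproduced in outline with the sketch below. It assumes the usual sample autocorrelation estimator and the Durbin-Levinson recursion for the partial autocorrelations; Minitab's ACF and PACF commands may differ in detail. The function names and the synthetic series are illustrative only and do not use the Fisher data.

```python
import numpy as np


def acf(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag about the series mean."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.sum(d * d)
    return np.array([np.sum(d[k:] * d[:-k]) / denom for k in range(1, max_lag + 1)])


def pacf(x, max_lag):
    """Partial autocorrelations from the Durbin-Levinson recursion."""
    r = acf(x, max_lag)
    phi = np.array([r[0]])          # coefficients of the order-1 fit
    out = [r[0]]                    # the lag-1 Pacf equals the lag-1 Acf
    for k in range(2, max_lag + 1):
        num = r[k - 1] - np.sum(phi * r[k - 2 :: -1])
        den = 1.0 - np.sum(phi * r[: k - 1])
        phi_kk = num / den
        phi = np.append(phi - phi_kk * phi[::-1], phi_kk)
        out.append(phi_kk)
    return np.array(out)


# A steadily increasing variable over 79 observations (compare Figure 7.4a).
trend = np.arange(79.0)
print(np.round(acf(trend, 5), 3))   # slow, nearly uniform decline

# A trending series with noise, before and after taking first differences.
rng = np.random.default_rng(1)
series = 0.2 * trend + rng.normal(size=79)
print(np.round(pacf(series, 3), 3))
print(np.round(acf(np.diff(series), 3), 3))
```

For a trending series the Acf declines slowly and the Pacf is dominated by lag 1, while the differenced series no longer carries the large positive lag-1 autocorrelation. This is the pattern the text uses to decide when a first difference is needed.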


Figure 7.5a:  Autocorrelation Function of Days

 Lag     ACF
   1   0.950
   2   0.904
   3   0.860
   4   0.818
   5   0.772
   6   0.728
   7   [illegible]
   8   0.641
   9   0.597
  10   0.553
  11   [illegible]
  12   0.469
  13   0.428
  14   [illegible]
  15   0.351
  16   0.318
  17   0.286
  18   0.257
Figure 7.5b:  Partial Autocorrelation Function of Days
   MTB > PACF C1
   PACF of DAYS

 Lag    PACF
   1   0.950
   2   0.012
   3  -0.000
   4  -0.023
   5  -0.028
   6  -0.017
   7  -0.024
   8  -0.033
   9  -0.012
  10  -0.027
  11  -0.016
  12  -0.022
  13  -0.014
  14  -0.028
  15 to 18   [illegible]

Figure 7.5c:  Autocorrelation Function of Flow
   MTB > ACF C2
   ACF of FLOW

 Lag     ACF
   1   0.335
   2   0.048
   3   0.009
   4   0.073
   5   0.083
   6  -0.129
   7  -0.198
   8  -0.120
   9  -0.051
  10  -0.026
  11  -0.039
  12   [illegible]
  13  -0.012
  14  -0.008
  15   [illegible]
  16   [illegible]
  17   [illegible]
  18   0.022

Figure 7.5d:  Partial Autocorrelation Function of Flow
   MTB > PACF C2
   PACF of FLOW

 Lag    PACF
   1   0.335
   2  -0.075
   3   0.019
   4   0.076
   5   0.003
   6  -0.168
   7  -0.111
   8  -0.025
   9  -0.019
  10   0.007
  11  -0.000
  12  -0.115
  13  -0.047
  14  -0.001
  15  -0.133
  16  -0.019
  17  -0.044
  18   0.038
Figure 7.5e:   Autocorrelation Function of Acid
   MTB > ACF C3
   ACF of ACID

 Lag     ACF
   1   0.796
   2   0.707
   3   0.665
   4   0.603
   5   0.591
   6   0.525
   7   0.508
   8   0.512
   9   0.447
  10   0.436
  11   0.338
  12   0.316
  13   0.309
  14   0.272
  15   0.302
  16   0.281
  17   0.292
  18   0.252
Figure 7.5f:   Partial Autocorrelation Function of Acid
   MTB > PACF C3
   PACF of ACID

 Lag    PACF
   1   0.796
   2   0.202
   3   0.156
   4   0.013
   5   0.134
   6  -0.079
   7   0.090
   8   0.088
   9  -0.099
  10   0.049
  11  -0.222
  12   0.075
  13   0.007
  14   0.028
  15   0.113
  16  -0.009
  17   0.095
  18  -0.168

Figure 7.5g:   Autocorrelation Function of Sulfate
    MTB  > ACF C4

    ACF  of  SO4

             -1.0 -0.8 -0.6 -0.4 -0.2  0.0  0.2  0.4  0.6  0.8  1.0
               +----+----+----+----+----+----+----+----+----+----+
      1   0.418                         XXXXXXXXXXX
      2   0.201                         XXXXXX
      3   0.096                         XXX
      4  -0.023                        XX
      5   0.057                         XX
      6  -0.027                        XX
      7   0.064                         XXX
      8   0.138                         XXXX
      9   0.172                         XXXXX
     10   0.250                         XXXXXXX
     11   0.223                         XXXXXXX
     12   0.120                         XXXX
     13   0.094                         XXX
     14   0.027                         XX
     15  -0.020                         X
     16  -0.021                        XX
     17  -0.105                      XXXX
     18  -0.092                       XXX
Figure 7.5h:   Partial Autocorrelation Function of Sulfate

MTB > PACF C4

PACF of SO4

 Lag    PACF
   1   0.418
   2   0.032
   3   0.002
   4  -0.083
   5   0.110
   6  -0.091
   7   0.119
   8   0.087
   9   0.107
  10   0.126
  11   0.091
  12  -0.044
  13   0.048
  14  -0.021
  15  -0.042
  16  -0.022
  17  -0.123
  18  -0.086
Figure 7.5i:   Autocorrelation Function of Iron

         -1.0 -0.8 -0.6 -0.4 -0.2   0.0   0.2  0.4  0.6  0.8  1.0
           +	+	+	+	+	+	+	+	+	+	+
 1   0.761                          xxxxxxxxxxxxxxxxxxxx
 2   0.709                          xxxxxxxxxxxxxxxxxxx
 3   0.677                          xxxxxxxxxxxxxxxxxx
 4   0.633                          xxxxxxxxxxxxxxxxx
 5   0.576                          xxxxxxxxxxxxxxx
 6   0.562                          xxxxxxxxxxxxxxx
 7   0.521                          xxxxxxxxxxxxxx
 8   0.496                          xxxxxxxxxxxxx
 9   0.439                          xxxxxxxxxxxx
10   0.433                          xxxxxxxxxxxx
11   0.351                          xxxxxxxxxx
12   0.305                          xxxxxxxxx
13   0.286                          xxxxxxxx
14   0.261                          xxxxxxxx
15   0.207                          xxxxxx
16   0.147                          xxxxx
17   0.123                          xxxx
18   0.110                          xxxx

Figure 7.5j:  Partial Autocorrelation Function of Iron

 Lag    PACF
   1   0.761
   2   0.310
   3   0.181
   4   0.069
   5  -0.017
   6   0.071
   7  -0.003
   8   0.024
   9  -0.072
  10   0.049
  11  -0.140
  12  -0.067
  13   0.022
  14   0.016
  15  -0.051
  16  -0.127
  17   0.000
  18   0.045
Figure 7.5k:  Autocorrelation Function of Manganese

 MTB > ACF C6

 ACF of MN

 Lag     ACF
   1   0.561
   2   0.254
   3   0.167
   4   0.063
   5   0.097
   6   0.167
   7   0.115
   8   [illegible]
   9   0.092
  10   0.031
  11  -0.024
  12  -0.049
  13  -0.086
  14  -0.139
  15  -0.124
  16  -0.173
  17  -0.209
  18  -0.046
Figure 7.5l:  Partial Autocorrelation Function of Manganese
           PACF of MN

 Lag    PACF
   1   0.561
   2  -0.089
   3   0.091
   4  -0.081
   5   0.138
   6   0.083
   7  -0.040
   8   0.073
   9  -0.023
  10  -0.016
  11  -0.075
  12  -0.022
  13  -0.066
  14  -0.107
  15   0.000
  16  -0.141
  17  -0.052
  18   0.179

Figure 7.5m:  Autocorrelation Function of Aluminum
      ACF of AL

               -1.0 -0.8 -0.6 -0.4 -0.2  0.0  0.2  0.4  0.6  0.8  1.0
                 +----+----+----+----+----+----+----+----+----+----+
        1   0.386                         XXXXXXXXXXX
        2   0.177                         XXXXX
        3   0.147                         XXXXX
        4   0.288                         XXXXXXXX
        5   0.051                         XX
        6   0.092                         XXX
        7  -0.043                        XX
        8  -0.007                         X
        9  -0.049                        XX
       10  -0.043                        XX
       11  -0.164                     XXXXX
       12  -0.197                    XXXXXX
       13  -0.210                    XXXXXX
       14  -0.151                     XXXXX
       15  -0.204                    XXXXXX
       16  -0.215                    XXXXXX
       17  -0.182                    XXXXXX
       18  -0.118                      XXXX
Figure 7.5n:  Partial Autocorrelation Function of Aluminum


     MTB > PACF C7

     PACF of AL

              -1.0 -0.8 -0.6 -0.4 -0.2  0.0  0.2  0.4  0.6  0.8  1.0
                +----+----+----+----+----+----+----+----+----+----+
       1   0.388                         XXXXXXXXXXX
       2   0.032                         XX
       3   0.081                         XXX
       4   0.239                         XXXXXXX
       5  -0.179                     XXXXX
       6   0.115                         XXXX
       7  -0.167                     XXXXX
       8  -0.005                         X
       9   0.003                         X
      10  -0.088                       XXX
      11  -0.064                       XXX
      12  -0.157                     XXXXX
      13  -0.060                       XXX
      14  -0.027                        XX
      15  -0.093                       XXX
      16  -0.046                        XX
      17  -0.032                        XX
      18  -0.015                         X

Box-Jenkins Modeling of Variation in the Seven Variables

On the basis of the above diagnostics, variation in flow was modeled using both the AR (1,0,0)
and MA (0,0,1) models.  Tests of the AR (1) model outcome showed no correlation between the
mean and the AR coefficient (φ1). The residual possessed a chi-square of 9.16 with 22 degrees
of freedom (df), yielding a probability of greater than 0.99 that the residual variation is white
noise.  Both the Acf and Pacf of the residuals were free of unusual spikes.  The relationship is
shown in Table 7.5. The residual standard deviation is σe = 0.356 compared with an original
standard deviation of σ = 0.373, a small improvement.

Table 7.5:  Equations of Models Fitted to Variables from the Fisher Deep Mine

No.   Model         Equation                                  σe       σ
1a.   Flow AR(1)    z_t = 0.336 z_{t-1} + 1.791 + a_t         0.356    0.373
1b.   Flow MA(1)    z_t = 1.788 + a_t + 0.340 a_{t-1}         0.354
2.    Acid MA(1)    z_t = a_t - 0.533 a_{t-1}                 0.119    0.218
3.    SO4 MA(1)     z_t = 2.532 + 0.375 a_{t-1} + a_t         0.197    0.212
4a.   Fe AR(1)      z_t = 0.850 z_{t-1} + 0.044 + a_t         0.252    0.411
4b.   Fe MA(1)      z_t = z_{t-1} + a_t - 0.612 a_{t-1}       0.219
5.    Mn MA(1)      z_t = z_{t-1} + a_t - 0.551 a_{t-1}       0.151    0.172
6.    Al MA(1)      z_t = 0.495 + a_t + 0.325 a_{t-1}         0.333    0.354

σe = residual standard deviation; σ = standard deviation of the original variable.
In the MA (0,0,1) model, there is no correlation between the mean and the moving average
coefficient.  The chi-square of the residuals is 8.879 with 22 df, a probability of P > 0.99 that
the residual variation is white noise. (The resulting equation is given in Table 7.5, No. 1b.) The
residual standard deviation is 0.354, which is very close to the AR value of 0.356. The models
have similar equations and similar residual errors.
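The fitting and residual checks described for the AR(1) flow model can be sketched as follows. This is an illustration on simulated data, not the Minitab Box-Jenkins program used in the report. It assumes a least-squares AR(1) fit and a Ljung-Box form of the portmanteau chi-square. The choice of 24 residual lags is also an assumption: it is the count that gives 22 df for a model with two estimated parameters and 23 df for a model with one, as quoted in the text. The simulation parameters (a mean of 1.79, a coefficient of 0.336, and an innovation standard deviation of 0.356) are borrowed loosely from the flow results.

```python
import numpy as np


def fit_ar1(z):
    """Least-squares fit of z[t] = phi * z[t-1] + const + a[t]."""
    z = np.asarray(z, dtype=float)
    design = np.column_stack([z[:-1], np.ones(len(z) - 1)])
    coef, *_ = np.linalg.lstsq(design, z[1:], rcond=None)
    phi, const = float(coef[0]), float(coef[1])
    resid = z[1:] - (phi * z[:-1] + const)
    return phi, const, resid


def portmanteau(resid, n_lags, n_params):
    """Ljung-Box chi-square of the residual autocorrelations and its df."""
    r = np.asarray(resid, dtype=float)
    r = r - r.mean()
    n = len(r)
    denom = np.sum(r * r)
    total = 0.0
    for k in range(1, n_lags + 1):
        rk = np.sum(r[k:] * r[:-k]) / denom
        total += rk * rk / (n - k)
    return n * (n + 2) * total, n_lags - n_params


# Simulated AR(1) series (hypothetical parameters, long series for stability).
rng = np.random.default_rng(42)
mu, phi_true, sigma_a = 1.79, 0.336, 0.356
z = np.empty(2000)
z[0] = mu
for t in range(1, len(z)):
    z[t] = mu + phi_true * (z[t - 1] - mu) + rng.normal(scale=sigma_a)

phi_hat, const_hat, resid = fit_ar1(z)
q, df = portmanteau(resid, n_lags=24, n_params=2)
print(round(phi_hat, 3), round(float(resid.std(ddof=2)), 3), round(q, 2), df)
```

A chi-square that is small relative to its degrees of freedom, as in the flow models above, indicates residuals that are indistinguishable from white noise.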

Variation in acidity requires a first difference.  When an MA (0,1,1) model is fitted, the chi-
square of the residuals equals 22.43 with 23  df. The probability that this would arise from a
white noise series is 0.50 > P > 0.30. The equation is presented in Table 7.5, number 2, and
yields a residual standard deviation of 0.119 compared with an original standard deviation of
0.218, an almost 50% improvement.

The MA (0,0,1) model was fitted to the variation in sulfate concentration. The mean is linearly
independent of the MA coefficient. Chi-square = 14.87 with 22 df, clearly showing (0.90 > P >
0.80) that the residual variation is not different from that of white noise. The equation is shown
as No. 3 in Table 7.5 and the residual standard deviation is σe = 0.197 compared with an
original standard deviation of 0.212, showing little improvement.

Two models were fitted to evaluate variation in iron concentration, an AR (1,0,0) and an MA
(0,1,1). An AR(1) coefficient may be a fair approximation of the first difference in the MA
model. The mean is relatively small, thus the AR(1) model is not as suitable as the MA (0,1,1)
fitted to first differences with the mean set equal to zero.  The standard deviations are 0.245
for the AR model and 0.219 for the MA model compared with σ = 0.411 for the original variable,
an improvement of nearly 50 percent.

Since manganese varies in a manner similar to iron, the MA model was fitted to the first
differences MA (0,1,1). The chi-square equals 23.82 with 23 df (or 0.50 > P > 0.30, i.e., the
residual variation is likely  to be white noise).  One significant spike at lag 4 remained in the Acf
of the residuals.  The equation of the MA (0,1,1) model is in Table 7.5. The standard deviation
is 0.151 compared with an original standard deviation of 0.172, an improvement of only 10
percent.

Aluminum variation did not require a first difference, thus the MA (0,0,1) model was fitted.
Chi-square of the residuals equals 25.71 with 22 df, 0.30 > P > 0.20.  There is no correlation
between the mean and the moving average coefficient. There is a significant spike at lag 4 as in
the manganese model.  The equation is given as No. 6 in Table 7.5.  The standard deviation equals
0.333 compared with σ = 0.354 for the original series (a marginal improvement).

These variables appear to show two patterns of variation. The first pattern is simple MA(1)
performance. The second  pattern is  a consistent trend,  usually a decline, with time.  This second
pattern is best matched by  the MA(1) model of the first differences. The effects of the trend are
removed by taking first differences.  In several cases, there is a significant spike at lag 4 in the
Acf of the residuals. However, this single spike is not repeated and there is no seasonal effect.
No further analysis was performed because the large gaps in the  time between observations
prevented any more rigorous analysis.
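The size of each "improvement" quoted above follows directly from the residual and original standard deviations. The short calculation below uses the values given in the text and Table 7.5. The helper name `percent_reduction` is illustrative, and the figures are computed here rather than taken from the report.

```python
# (residual standard deviation, original standard deviation) per fitted model
models = {
    "Flow AR(1)": (0.356, 0.373),
    "Flow MA(1)": (0.354, 0.373),
    "Acid MA(1)": (0.119, 0.218),
    "SO4 MA(1)": (0.197, 0.212),
    "Fe MA(1)": (0.219, 0.411),
    "Mn MA(1)": (0.151, 0.172),
    "Al MA(1)": (0.333, 0.354),
}


def percent_reduction(sigma_e, sigma):
    """Percent reduction of the residual SD relative to the original SD."""
    return 100.0 * (sigma - sigma_e) / sigma


for name, (sigma_e, sigma) in models.items():
    print(f"{name:<11} {percent_reduction(sigma_e, sigma):5.1f}%")
```

The two variables that required a first difference and showed a strong trend (acidity and iron) give reductions of roughly 45 to 47 percent, while the others stay near or below about 12 percent.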

Quality Control

The appropriate use of quality control (particularly in the form of confidence limits around the
mean or median) is illustrated in Figures 7.2c and 7.2d.  This enables comparison between pre-
remining and post-remining water quality conditions and allows  for differences in sample size.
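As a rough illustration of how such limits can allow for sample size, the sketch below assumes the common form mean ± 2·s/√n, which reduces to mean ± 2·s for a single observation. That form is an assumption made here, not a formula quoted from the report, and the numbers in the usage lines are hypothetical.

```python
import math


def two_sd_limits(mean, sd, n=1):
    """Lower and upper limits: mean +/- 2 * sd / sqrt(n)."""
    half_width = 2.0 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def outside_limits(values, mean, sd):
    """1-based observation numbers falling outside the n = 1 limits."""
    low, high = two_sd_limits(mean, sd)
    return [i for i, v in enumerate(values, start=1) if v < low or v > high]


# Hypothetical baseline mean and standard deviation.
print(two_sd_limits(10.0, 2.0))        # limits for single observations
print(two_sd_limits(10.0, 2.0, n=4))   # narrower limits for a mean of 4 samples
print(outside_limits([10.0, 5.0, 13.9, 15.0], 10.0, 2.0))
```

Limits of this kind narrow as the number of samples behind each plotted value grows, which is what lets periods with different sample sizes be compared on the same chart.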

The two standard deviation limits around the mean are also inserted in the time series plots
(Figures 7.4b to  7.4g). These are confidence limits based on a sample size of one (i.e.,
X ± 2

Summary

The most important outcome of this analysis is to show that the pattern of variation in these six
variables falls into two groups. The first group (flow, sulfate, and aluminum) appears to be
unaffected by remining.  The second group (acidity, iron, and manganese) shows a marked
improvement after remining begins. This improvement is shown in both means (medians) and
variances.  The means are lower and the variances less after remining began than prior to
remining.

Chapter 8: Analysis of Data from the Markson Deep Mine Site, Schuylkill
             County, PA

The abandoned Markson Colliery workings are located within the Donaldson Syncline of the
Southern Anthracite Coal Field.  The Markson discharge is located approximately 1.2 miles
upstream from the Rausch Creek Treatment Plant operated by PA DEP in Schuylkill County
(Figure 8.0). This discharge emanates from an airway of the abandoned Markson Colliery, and
is a principal contributor to the total acid load treated at the plant. The flow and water quality
characteristics of the Markson discharge were previously described in  Smith (1988), Hornberger
et al. (1990), and Brady et al. (1998). The data set used in most of those studies and in this
chapter was collected by the Pennsylvania Department of Environmental Protection, Bureau of
Abandoned Mine Reclamation (BAMR) which operates the Rausch Creek Treatment Plant.
BAMR routinely samples and monitors the Markson discharge and another large abandoned
deep mine discharge (Valley View Tunnel discharge) for purposes related to treatment plant
operations. Additional data and  discussion of the flow and water quality characteristics of the
Markson discharge from 1992 to 1999 are contained in Section 5 of the EPA Coal Remining
Statistical Support Document.

The Markson discharge exhibits  water quality characteristics that differ greatly from those of
principal discharges from adjacent mines (e.g., the Orchard Airway discharge from the Good
Spring No. 1 Colliery and the Tracy Airway discharge from the Good  Spring No. 3 Colliery).
The pH of the Markson discharge ranges from 3.2 to 3.7, while the pH of the Tracy  discharge
ranges from 5.7 to 6.5.  The distinct chemical  differences in two discharges from similar
abandoned underground mines in the same coal seams and the same geologic structure are
attributable to stratification of large anthracite deep minepools.  The Tracy discharge is a "top-
water" discharge from a relatively shallow ground water flow system (at elevation 1153 feet),
while the Markson discharge emanates from "bottom water" at a much lower elevation (865 feet)
in the minepool system.  Additional information on the chemical characteristics of stratified
anthracite minepools is found in  Barnes et al. (1964), Ladwig et al. (1984) and Brady et al.
(1998).

The raw data for the Markson discharge are listed in Appendix F. There are 253 observations,
and the assembled set is comprised of data on nine parameters as follows: days; flow; pH;
acidity; iron; manganese; aluminum; sulfate; ferrous iron. Days were calculated as the number
of days between the day an observation was collected and the day the first observation was
collected.

The first step was to adjust the data set for missing observations. The first 146 dates had no
observations for flow.  The original data were, therefore,  subdivided into two sets: the first
consisting of eight variables (253 observations); and the second consisting of nine variables,
including flow (107 observations).  A tenth variable (interval) was added by taking the first
differences among days to determine the regularity of the intervals between observations.
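The derivation of the days and interval variables can be sketched in a few lines. The dates below are hypothetical and are not taken from Appendix F; the function names are illustrative.

```python
from collections import Counter
from datetime import date


def days_since_first(dates):
    """Days elapsed between each observation and the first observation."""
    first = dates[0]
    return [(d - first).days for d in dates]


def intervals(days):
    """First differences of the days variable (one fewer value than days)."""
    return [later - earlier for earlier, later in zip(days, days[1:])]


# Hypothetical sampling dates: mostly weekly, with one long gap.
sample_dates = [
    date(1990, 1, 1),
    date(1990, 1, 8),
    date(1990, 1, 15),
    date(1990, 1, 23),
    date(1990, 2, 20),
]
days = days_since_first(sample_dates)
gaps = intervals(days)
print(days)            # [0, 7, 14, 22, 50]
print(Counter(gaps))   # tally of interval lengths
```

Because the interval variable is a first difference, it has one fewer value than the number of observations. Tallying the interval lengths shows at a glance how regular the sampling was.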


Figure 8.0:   Map of Markson Site
     [Location map; the image is not recoverable from the source text.]

Univariate Analysis

The two data sets (N = 253 and N = 107) were explored initially to determine the shape of the
frequency distributions.  The variables flow, acidity, aluminum, and ferrous iron were considered
to be asymmetric and were transformed to base 10 logarithms. The summary statistics for these
data sets following log transformation are shown in Table 8.1 (N = 107) and Table 8.2 (N =
253).
Table 8.1:    Summary Statistics (N=107)

                      N    N*      Mean    Median   Trimmed   Standard   Standard Error
                                                       Mean  Deviation      of the Mean
Intervals           106     1    7.0660    7.0000    7.0208     0.7840           0.0761
Log Flow            107     0    3.1250    3.0892    3.1089     0.1950           0.0188
pH                  107     0    3.2458    3.2000    3.2443     0.0954           0.0092
Log Acidity         107     0    1.9960    2.0000    1.9986     0.0743           0.0072
Total Iron          107     0    27.770    27.006    27.680      9.255            0.895
Mn                  107     0    4.8971    4.9700    4.8920     0.9498           0.0918
Log Al              107     0    0.2961    0.3191    0.2988     0.1625           0.0157
SO4                 107     0    272.81    271.00    272.41      27.13             2.62
Log Ferrous Iron    105     2    1.3275    1.3979    1.3489     0.2583           0.0250

                    Minimum   Maximum      First     Third   Coefficient of
                                        Quartile  Quartile        Variation
Intervals            5.0000   14.0000     7.0000    7.0000                —
Log Flow             2.8791    3.8151     2.9930    3.1978              6.2
pH                   3.1000    3.5000     3.2000    3.3000              2.9
Log Acidity          1.5051    2.1818     1.9638    2.0414              3.7
Total Iron            6.900    49.871     21.500    33.762             33.3
Mn                   1.2000    8.1600     4.4240    5.3100             19.4
Log Al              -0.1244    0.7686     0.1793    0.3979             54.9
SO4                  210.00    348.00     253.00    292.00              9.9
Log Ferrous Iron     0.5185    1.6758     1.2087    1.5197             75.3

                  N* = Number of missing data points in data set

Table 8.2:     Summary Statistics (N = 253)

                      N    N*      Mean    Median   Trimmed   Standard   Standard Error
                                                       Mean  Deviation      of the Mean
Intervals           252     1     7.333     7.000     7.044      1.910            0.120
Log Flow            107   146    3.1250    3.0892    3.1089     0.1950           0.0188
pH                  253     0    3.2362    3.2000    3.2209     0.1508           0.0095
Log Acidity         252     1    2.0515    2.0414    2.0483     0.1123           0.0071
Total Iron          253     0    30.703    28.997    30.410     13.026            0.819
Mn                  249     4    5.0439    5.1000    5.0470     0.8351           0.0529
Log Al              246     7   0.34588   0.35622   0.35099    0.14314          0.00913
SO4                 253     0    297.87    293.00    296.84      44.85             2.82
Log Ferrous Iron    241    12    1.3656    1.4393    1.3865     0.3056           0.0197

                    Minimum   Maximum      First     Third   Coefficient of
                                        Quartile  Quartile        Variation
Intervals             5.000    28.000      7.000     7.000                —
Log Flow             2.8791    3.8151     2.9930    3.1978              6.2
pH                   3.0000    4.4000     3.1000    3.3000              4.7
Log Acidity          1.4771    2.5340     2.0000    2.0197              5.5
Total Iron            3.810    63.500     21.045    39.656             42.4
Mn                   1.2000    8.1600     4.6300    5.4200             19.8
Log Al             -0.12436   0.76864    0.28358   0.43497             41.4
SO4                  155.00    510.00     265.00    325.50             15.1
Log Ferrous Iron    -0.1805    1.8037     1.2015    1.5798             22.4

                      N* = Number of missing data points in data set

The next step was to examine the regularity of the intervals between observations. In Table 8.2,
the mean interval is 7.33 days and the median is 7.0 days. As expected, the majority of the
observations were taken at seven-day intervals. The intervals nevertheless ranged from 5 to 28
days, and the histogram (Figure 8.1a) shows seven intervals of eight days, five intervals of six
days, and one interval of five days. The regularity of these observations is desirable yet
surprising. It is recommended that determining the first differences of days become a routine
procedure for long series of observations in order to examine the regularity of sampling
intervals.
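
The first-difference check recommended above can be sketched in a few lines. This is a minimal illustration only: the sampling dates below are hypothetical and are not taken from the data sets analyzed in this chapter, and the function name is an assumption introduced for the example.

```python
from collections import Counter
from datetime import date


def interval_counts(sample_dates):
    """Tally the first differences (in days) between consecutive sampling dates."""
    ordered = sorted(sample_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return Counter(gaps)


# Hypothetical sampling dates, for illustration only.
dates = [
    date(1985, 1, 3), date(1985, 1, 10), date(1985, 1, 17),
    date(1985, 1, 25), date(1985, 1, 31), date(1985, 2, 7),
]
counts = interval_counts(dates)
for gap in sorted(counts):
    print(f"{gap:2d} days: {counts[gap]} interval(s)")
```

A tally of this kind corresponds to the histogram of intervals and makes departures from the nominal seven-day schedule easy to spot.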

The next step concerns the magnitude of variation as measured by the coefficient of variation in
percent. It should be noted (see Tables  8.1 and 8.2) that when the CV% is calculated [CV% =
(
-------
                      Statistical Analysis of Abandoned Mine Drainage in the Assessment of Pollution Load

iron in both data sets. Most of the variation is very low for flow, pH, and acidity (less than
10%). Only iron, ferrous iron, and aluminum have high CV% values. This is largely due to low
mean values rather than high standard deviations.
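
The effect of a low mean on CV% can be checked directly from the summary statistics. The sketch below assumes the conventional definition, CV% = 100 x (standard deviation / mean); the standard deviations and means are copied from Table 8.1, and the function name is an assumption introduced for the example.

```python
def cv_percent(std_dev, mean):
    """Coefficient of variation in percent, assuming CV% = 100 * s / mean."""
    if mean == 0:
        raise ValueError("CV% is undefined for a zero mean")
    return 100.0 * std_dev / abs(mean)


# (standard deviation, mean) pairs copied from Table 8.1 (N = 107).
table_8_1 = {
    "pH": (0.0954, 3.2458),
    "Total Iron": (9.255, 27.770),
    "Mn": (0.9498, 4.8971),
    "Log Al": (0.1625, 0.2961),
    "SO4": (27.13, 272.81),
}

cv = {name: round(cv_percent(s, m), 1) for name, (s, m) in table_8_1.items()}
for name, value in cv.items():
    print(f"{name:<12s} CV% = {value}")
```

Log aluminum illustrates the point made in the text: its standard deviation (0.1625) is small in absolute terms, but the mean (0.2961) is close to zero, so the ratio is large.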

The frequency distributions of selected variables from these data sets are shown in Figures 8.1a
through 8.1i. The number of samples represented in each histogram ranges from N = 106
(interval) to N = 253 (sulfate), depending on the number of sample results reported for the
corresponding parameter. The histogram for log flow, with N = 107 (Figure 8.1b), is skewed
right even after taking logs. The histograms of log acidity for the two data sets (N = 252 and
N = 107) appear fairly regular, with some negative skewness after taking logs (Figures 8.1c and
8.1d). Both iron and manganese yield symmetrical histograms (Figures 8.1e and 8.1f,
respectively). Aluminum is symmetrical after taking logarithms (Figure 8.1g), whereas sulfate is
essentially symmetrical without transformation (Figure 8.1h). Ferrous iron is strongly
negatively skewed after taking logarithms (Figure 8.1i), making further analysis of this variable
suspect.

Figure 8.1a:  Histogram of Interval (N=106)
      Histogram of C10    N = 106    N* = 1
      Each * represents 2 obs.

      Midpoint   Count
             5       1   *
             6       5   ***
             7      92   **********************************************
             8       7   ****
             9       0
            10       0
            11       0
            12       0
            13       0
            14       1   *


Figure 8.1b:  Histogram of Log Flow (N=107)
      Histogram of FLOW    N = 107

      Midpoint   Count
           2.9      22
           3.0      18
           3.1      [illegible]
           3.2      15
           3.3      [illegible]
           3.4       5
           3.5       4
           3.6       2
           3.7       2
           3.8       1

Figure 8.1c:  Histogram of Log Acidity (data set N=252)
      [Character histogram of ACID not reproduced; the midpoints and counts are illegible in the source. Each * represents 5 obs.]

Figure 8.1d:  Histogram of Log Acidity (data set N=107)
      Histogram of ACID    N = 107

      Midpoint   Count
          1.50       1
          1.55       0
          1.60       0
          1.65       0
          1.70       0
          1.75       0
          1.80       0
          1.85       0
          1.90       1
          1.95      30
          2.00      31
          2.05      26
          2.10      [illegible]
          2.15       1
          2.20       1


Figure 8.1e:   Histogram of Iron (N=253)
      Histogram of IRON    N = 253

      Midpoint   Count
             5       5   *****
            10      14   **************
            15      24   ************************
            20      29   *****************************
            25      41   *****************************************
            30      34   **********************************
            35      33   *********************************
            40      23   ***********************
            45      16   ****************
            50      18   ******************
            55      10   **********
            60       5   *****
            65       1   *

Figure 8.1f:   Histogram of Manganese (N=249)
      [Character histogram of MN (N = 249) not reproduced; midpoints run from 1.0 to 8.0 in steps of 0.5, but most counts are illegible in the source.]

Figure 8.1g:  Histogram of Log Aluminum (N=246)
      [Character histogram of AL (N = 246, N* = 7; each * represents 2 obs.) not reproduced; several counts are illegible in the source.]


Figure 8.1h:  Histogram of Sulfate (N=253)
      [Character histogram of SO4 (N = 253; each * represents 2 obs.) not reproduced; midpoints run from 160 to 520, but most counts are illegible in the source.]
-------
                        Statistical Analysis of Abandoned Mine Drainage in the Assessment of Pollution Load
Table 8.3:    Correlations Among Variables (N = 107)
 r < 0.195 not significantly different from 0

                      Days  Log Flow      pH  Log Acidity    Iron      Mn  Log Al     SO4  Log Ferrous Iron
Log Flow             0.422
pH                   0.536     0.263
Log Acidity         -0.396    -0.259  -0.191
Iron                -0.174    -0.610  -0.075        0.211
Mn                   0.385     0.048   0.169        0.089   0.183
Log Al               0.195     0.474  -0.027       -0.036  -0.402   0.584
SO4                 -0.003    -0.165  -0.087        0.264   0.359   0.322   0.095
Log Ferrous Iron    -0.338    -0.690  -0.175        0.380   0.770   0.036  -0.424   0.400
Interval            -0.141    -0.001  -0.119        0.016  -0.067  -0.064   0.042   0.010             0.008
Table 8.4:    Correlations Among Variables (N = 253)
 r < 0.15 not significantly different from 0

                      Days  Log Flow      pH  Log Acidity    Iron      Mn  Log Al     SO4  Log Ferrous Iron
Log Flow             0.422
pH                  -0.002     0.263
Log Acidity         -0.348    -0.259   0.016
Iron                -0.401    -0.610   0.180        0.124
Mn                  -0.036     0.048  -0.030        0.174   0.177
Log Al              -0.227     0.474  -0.069        0.133  -0.199   0.435
SO4                 -0.589    -0.165   0.171        0.322   0.455   0.289   0.116
Log Ferrous Iron    -0.334    -0.690   0.079        0.140   0.803   0.042  -0.268   0.356
Interval            -0.100    -0.002  -0.088        0.103  -0.020  -0.058   0.027   0.030             0.018
While it is advantageous to use r² rather than r as an indication of the strength of any association
between two variables, it is also advisable to examine scatter diagrams to see the relationship
displayed graphically. Because r and r² are measures of linear association, a graph may show a
close curvilinear association in cases where r is low (and r² therefore very low). For example, in
Figure 8.2a (sulfate vs. flow), there could be a curvilinear inverse relationship between the
variables, although the scatter at high flows (log flow > 3.4) tends to mask it. Both total iron
and ferrous iron are also negatively associated with flow (r² = 37% in both data sets; see, for
example, Figure 8.2b for total iron).
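
As a small worked illustration of how r translates into r² expressed as a percentage, the sketch below squares a few of the correlation coefficients quoted in this section. The function name and dictionary labels are assumptions introduced for the example; the r values are copied from the text and Table 8.3.

```python
def r_squared_percent(r):
    """Share of variance accounted for by a linear relationship, in percent."""
    return 100.0 * r * r


# Correlation coefficients copied from Table 8.3 and the text.
pairs = {
    "Iron vs. Log Flow": -0.610,
    "Mn vs. Log Al (N = 107)": 0.584,
    "Mn vs. Log Flow": 0.048,
}

explained = {name: r_squared_percent(r) for name, r in pairs.items()}
for name, pct in explained.items():
    print(f"{name:<26s} r = {pairs[name]:+.3f}   r^2 = {pct:.1f}%")
```

Because r is squared, the sign of the association is lost and modest correlations shrink quickly, which is one reason a scatter diagram remains a useful companion to the coefficient.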

Manganese and aluminum show a positive association in both data sets (r² = 34% and 16% for the
107 and 253 sample data sets, respectively; Figure 8.2c). Apart from flow and manganese, there
does not appear to be any meaningful relationship between aluminum and any other variable
(Figure 8.2d).

Manganese and flow show no significant relationship, either positive or inverse (r = 0.048;
Figure 8.2e). Similarly, acidity appears to have no association with any of the other parameters
(e.g., Figures 8.2f and 8.2g).
Figure 8.2a:  Plot of Sulfate vs. Log Flow
  [Character scatter plot (PLOT C8 VS C2) not reproduced: SO4 on the vertical axis (ticks at 240, 280, and 320), log flow on the horizontal axis (2.80 to 3.80).]

Figure 8.2b:  Plot of Iron vs. Log Flow
  [Character scatter plot (PLOT C5 VS C2) not reproduced: iron on the vertical axis (ticks at 15, 30, and 45), log flow on the horizontal axis (2.80 to 3.80).]


Figure 8.2c:   Plot of Manganese vs. Log Aluminum
  [Character scatter plot (PLOT C6 VS C7, N* = 7) not reproduced: manganese on the vertical axis, log aluminum on the horizontal axis (0.00 to 0.64).]

Figure 8.2d:  Plot of Log Aluminum vs. Sulfate
  [Character scatter plot (PLOT C7 VS C8, N* = 19) not reproduced: log aluminum on the vertical axis (0.00 to 0.60), SO4 on the horizontal axis (140 to 490).]

Figure 8.2e:  Plot of Manganese vs. Log Flow
  [Character scatter plot (PLOT C6 VS C2) not reproduced: manganese on the vertical axis, log flow on the horizontal axis (2.80 to 3.80).]


Figure 8.2f:   Plot of Log Acidity vs. Log Flow
  [Character scatter plot (PLOT C4 VS C2) not reproduced: log acidity on the vertical axis (1.60 to 2.20), log flow on the horizontal axis (2.80 to 3.80).]

Figure 8.2g:  Plot of Log Acidity vs. Iron
  [Character scatter plot (PLOT C4 VS C5) not reproduced: log acidity on the vertical axis (1.40 to 2.45), iron on the horizontal axis (0 to 60).]

Since there does not appear to be any very strong relationship among these variables, they were
examined in pairs using cross-correlation functions (CCF). The outcomes are summarized in
Table 8.5. There do not appear to be any discrepancies between the correlation coefficient and
the cross-correlation coefficient results [i.e., the relationships at zero lag (Tables 8.3 and 8.4) and
at any other lags (Table 8.5)]. There are wide regions of the CCF that are above the 0.2 limits
demonstrating that, for the most part, interrelationships among these variables are weak to
almost non-existent. The only two that stand out are the relationships between flow and iron and
iron and ferrous iron (and therefore, between flow and ferrous iron).
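
The kind of summary reported for each pair of variables (the largest cross-correlation, the lag at which it occurs, and the lags at which the coefficient exceeds 0.2 in absolute value) can be sketched as follows. This is an illustrative implementation, not the routine used to produce Table 8.5: the sign convention for the lag (x at time t against y at time t + k), the function names, and the synthetic test series are all assumptions.

```python
import random
from statistics import fmean, pstdev


def ccf(x, y, max_lag):
    """Cross-correlation between x[t] and y[t + k] for k = -max_lag..max_lag."""
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    denom = n * pstdev(x) * pstdev(y)
    result = {}
    for k in range(-max_lag, max_lag + 1):
        start, stop = max(0, -k), min(n, n - k)
        cross = sum((x[t] - mean_x) * (y[t + k] - mean_y) for t in range(start, stop))
        result[k] = cross / denom
    return result


def summarize(r_by_lag, limit=0.2):
    """Return (lag of largest |r|, that r, lags where |r| exceeds the limit)."""
    peak = max(r_by_lag, key=lambda k: abs(r_by_lag[k]))
    above = [k for k in sorted(r_by_lag) if abs(r_by_lag[k]) > limit]
    return peak, r_by_lag[peak], above


# Synthetic series: y repeats x three observations later (illustration only).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = x[-3:] + x[:-3]

peak_lag, peak_r, lags_above = summarize(ccf(x, y, max_lag=20))
print(f"lag at r max = {peak_lag}, r = {peak_r:.3f}, lags with |r| > 0.2: {lags_above}")
```

The value at lag 0 of such a function is the ordinary correlation coefficient, which is why the zero-lag entries can be compared with the correlation tables.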

Table 8.5:  Cross-correlations Among Variables

No.  Variables                    r     Lag at r max   Lags with |r| > 0.2
 1.  pH vs. Flow               0.305         -6        -8 to 7
 2.  Acid vs. Flow            -0.382         -5        -13 to 0
 3.  Iron vs. Flow            -0.676         -2        -16 to 6, 15 to 20
 4.  Mn vs. Flow              -0.334         12        -21, -11, 7 to 20
 5.  Al vs. Flow               0.475          0        -13 to 13
 6.  SO4 vs. Flow             -0.502         -8        -18 to -1, 6 to 20
 7.  Ferrous Iron vs. Flow    -0.740         -1        -15 to 6, 16 to 20
 8.  Acid vs. pH              -0.321         -5        -16, -8, -7, -5 to -3, -1, 2, 3
 9.  Iron vs. pH              -0.175        -12        none
10.  Mn vs. pH                 0.288         13        -14, 6 to 13
11.  Al vs. pH                 0.232        -20        -20
12.  SO4 vs. pH                0.296         11        8 to 14
13.  Ferrous Iron vs. pH      -0.227        -13        -14 to -12, -10, 2
14.  Iron vs. Acid             0.226          9        0, 4, 9
15.  Mn vs. Acid              -0.287         13        13
16.  Al vs. Acid              -0.214         13        13
17.  SO4 vs. Acid              0.29           2        -17, -4 to 0, 2
18.  Ferrous Iron vs. Acid     0.373          0        -2 to 6, 9, 10
19.  Mn vs. Iron              -0.283         14        -20, 3, 14, 16 to 18, 20
20.  Al vs. Iron              -0.562          3        -20, -9 to 17
21.  SO4 vs. Iron             -0.458         20        -20, -18 to 2, 12 to 20
22.  Ferrous Iron vs. Iron     0.776          0        -11 to 13
23.  Al vs. Mn                 0.584          0        -20 to -15, -5 to 1
24.  SO4 vs. Mn                0.326         -1        -2 to 1
25.  Ferrous Iron vs. Mn      -0.281        -16        -20 to -15, -10
26.  SO4 vs. Al                0.419         17        -20 to -10, -8, -7, 5, 6, 9 to 20
27.  Ferrous Iron vs. Al      -0.478         -3        -16 to 8
28.  Ferrous Iron vs. SO4      0.473          4        -20 to -8, -2 to 13

Time Series Analysis

One of the most striking features of the time series plots (Figures 8.3a through 8.3f) is the
limited variation which they show.  Another feature of interest is the unusual variation shown by
log flow.  The major flow event from February 1986 to July 1986 is worth noting and does not
appear in any of the other graphs. This emphasizes a lack of relationship between the
parameters.  Clearly, there appears to be no associated variation among the parameters, except
for the pair that includes iron and ferrous iron.

-------
Figure 8.3a: Time Series Plot of Flow
  [Time series plot not reproduced.]

-------
Figure 8.3b: Plot of Manganese
  [Time series plot not reproduced. The legend identifies the median (Md) and the confidence intervals around the median for N' = 1 and N' = 12.]

-------
Figure 8.3c:  Time Series Plot of Sulfate
  [Time series plot not reproduced. The legend identifies the median (Md) and the confidence intervals around the median for N' = 1 and N' = 12.]

-------
Figure 8.3d: Time Series Plot of Iron
  [Time series plot not reproduced. The legend identifies the median (Md) and the confidence intervals around the median for N' = 1 and N' = 12.]

-------
Figure 8.3e: Time Series Plot of Aluminum
  [Time series plot not reproduced. The legend identifies the median (Md) and the confidence intervals around the median for N' = 1 and N' = 12.]

-------
Figure 8.3f: Time Series Plot of Acidity
  [Time series plot not reproduced. The legend identifies the median (Md) and the confidence intervals around the median for N' = 1 and N' = 12.]


Quality Control Applied to the Variables

Two measures of quality control are calculated and summarized in Tables 8.6 and 8.7; in Table
8.6 the conventional two standard deviation limits around the mean are given for each of the
eight variables (these limits are presented, for example, in Figure 8.3a).
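
The conventional limits described above are simple to compute from a mean and a standard deviation. The sketch below uses means and standard deviations as reported in Table 8.6 for four of the parameters; the function names are assumptions introduced for the example, and the flagged observation is hypothetical.

```python
def control_limits(mean, std_dev, k=2.0):
    """Lower and upper quality control limits: mean - k*s and mean + k*s."""
    return mean - k * std_dev, mean + k * std_dev


def is_outside(value, limits):
    """True when an observation falls outside the control limits."""
    lower, upper = limits
    return value < lower or value > upper


# (mean, standard deviation) pairs as reported in Table 8.6.
table_8_6 = {
    "Log Flow": (3.125, 0.195),
    "pH": (3.246, 0.095),
    "Iron": (27.770, 9.285),
    "SO4": (272.81, 27.13),
}

limits = {name: control_limits(m, s) for name, (m, s) in table_8_6.items()}
for name, (lower, upper) in limits.items():
    print(f"{name:<9s} lower = {lower:8.3f}   upper = {upper:8.3f}")

# A hypothetical log flow observation of 3.60 would be flagged.
print(is_outside(3.60, limits["Log Flow"]))
```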

Table 8.6.  Quality Control Limits  x̄ ± 2σ̂

Parameter             Mean (x̄)   Standard Deviation (σ̂)
Log Flow                 3.125                    0.195
pH                       3.246                    0.095
Log Acidity              1.996                    0.074
Iron                    27.770                    9.285
Mn                       4.897                    0.950
Log Al                   0.296                    0.163
SO4                     272.81                    27.13
Log Ferrous Iron         1.328                    0.258

[Remaining columns are missing from the source text.]
-------
                       Statistical Analysis of Abandoned Mine Drainage in the Assessment of Pollution Load

limits of a confidence interval (C.I.) around the median, based on Tukey's non-parametric
formula:

\[ \text{Median} \pm \frac{1.96\,(Q_1 - Q_3)\,1.25}{1.35\,\sqrt{N'}} \]

Two values of N' are used (namely, N' = 1 and N' = 12) in Figures 8.3b, 8.3c, 8.3d, 8.3e, and
8.3f.
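
The effect of N' on the width of the interval can be seen with a short calculation. The sketch below applies the formula with the interquartile range taken as Q3 minus Q1, which leaves the plus-or-minus half-width unchanged in magnitude. The median and quartiles are the manganese values from Table 8.1; using that data set here is an assumption made for illustration, as is the function name.

```python
from math import sqrt


def median_ci(median, q1, q3, n_prime):
    """Confidence interval around the median: Md +/- 1.96 * 1.25 * IQR / (1.35 * sqrt(N'))."""
    half_width = 1.96 * 1.25 * (q3 - q1) / (1.35 * sqrt(n_prime))
    return median - half_width, median + half_width


# Manganese median and quartiles from Table 8.1 (illustrative choice of data set).
mn = {"median": 4.9700, "q1": 4.4240, "q3": 5.3100}

wide = median_ci(**mn, n_prime=1)
narrow = median_ci(**mn, n_prime=12)
print(f"N' = 1 : {wide[0]:.3f} to {wide[1]:.3f}")
print(f"N' = 12: {narrow[0]:.3f} to {narrow[1]:.3f}")
```

Because the half-width is divided by the square root of N', the N' = 12 interval is about 3.5 times narrower than the N' = 1 interval.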

Many of the observations, which fall outside the limits when N' = 1,  are single observations and
need no activity to explain the exceedance. The longer areas of departure in flow (Figure 8.3a)
are due to natural events and presumably, are not related to mining activity.  It would be
expected that this extreme and long-term departure would be reflected in the variation of the
other parameters, but this is not the case.

With nearly all other parameters it appears that the control limits are somewhat tight and that
most of the variation outside of the control limits is irregular and of short duration (e.g.,
manganese in Figure 8.3b and sulfate in Figure 8.3c).

Iron shows two relatively long term, mostly positive deviations beyond the control limits (Figure
8.3d) in the period up to observation 80 (February 20, 1986). These deviations are not repeated
in later observations. Sulfate (Figure 8.3c) also extends beyond the upper limits for the first 30
observations (i.e., before February 28,  1985) and appears to decrease with time.  The behavior of
aluminum within the quality control limits (Figure 8.3e) is similar to manganese and sulfate.
However, the aluminum values drop below the lower limit for most of the observations from 185
to 225. Acidity (Figure 8.3f) shows little variation, with a few isolated peaks extending outside
the upper limits.  In three cases (October 1983, March 1984 and June  1984) consecutive results
exceeded the upper limit.

Model Identification

Identification of appropriate models is  performed by using the autocorrelation (Acf) and partial
autocorrelation (Pacf) functions of the  eight parameters. There are three types of functions
which can be characterized by appearance. The first type shows a strong steady decline from a
high value. Flow (Figure 8.4a), iron (Figure 8.4b), aluminum (Figure 8.4d), and ferrous iron are
examples of this type. All these parameters show a large spike at lag  1 in their Pacf (see iron in
Figure 8.4c and aluminum in Figure 8.4e).  These characteristics imply the parameter possesses a
trend which must be removed before further time series analysis. Removal may be achieved by
taking first differences. An example can be demonstrated using days, which increase in value
throughout the period of observation.  Taking first differences results  in random walk
characteristics (Figure 8.4f). Thus, the first difference is sufficient to  make this parameter
stationary.
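
The behavior described above, a slowly declining Acf that disappears after first differencing, can be reproduced with a synthetic series. The sketch below is illustrative only: it uses a simulated random walk rather than any of the mine drainage parameters, and the function names are assumptions.

```python
import random
from statistics import fmean


def acf(series, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(series)
    mean = fmean(series)
    c0 = sum((v - mean) ** 2 for v in series)
    return [
        sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]


def first_difference(series):
    """Differences between consecutive observations."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Synthetic random walk: each value is the previous value plus random noise.
rng = random.Random(1)
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

acf_walk = acf(walk, 5)
acf_diff = acf(first_difference(walk), 5)
print("Acf of series:           ", [round(r, 3) for r in acf_walk])
print("Acf of first differences:", [round(r, 3) for r in acf_diff])
```

The undifferenced series shows autocorrelations that start high and decline slowly, while the differenced series shows values close to zero at all lags.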

The second type of function shows a less pronounced decline (e.g., pH; Acf in Figure 8.4g). The
Pacf of pH, however, shows a pronounced spike at lag 1, and it too requires taking first
differences to become stationary (Figure 8.4h). Manganese (Acf in Figure 8.4i) is similar to pH
except that its Pacf does not have a pronounced spike at lag 1 (Figure 8.4j). Therefore, an AR
model may be suitable. The first coefficient ($\hat{\Phi}_1$) may suffice for the first difference.

The third type of function is represented by sulfate, which appears to show a trend as well as
some irregularities (Acf in Figure 8.4k). Before the irregularities can be evaluated, the strong
spike in the Pacf at lag 1 must be reduced (Figure 8.4l).
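
A single large spike at lag 1 in the Pacf is the signature of a series in which each value depends mainly on the one before it. The sketch below computes partial autocorrelations with the Durbin-Levinson recursion for a simulated first-order autoregressive series. The recursion is a standard technique that may differ from the routine used for the figures in this chapter; the coefficient of 0.8, the series length, and the function names are assumptions.

```python
import random
from statistics import fmean


def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (lag 0 equals 1)."""
    n = len(series)
    mean = fmean(series)
    c0 = sum((v - mean) ** 2 for v in series)
    return [
        sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Partial autocorrelations for lags 1..max_lag via Durbin-Levinson."""
    r = acf(series, max_lag)
    previous, result = [], []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[1]
            current = [phi_kk]
        else:
            num = r[k] - sum(previous[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(previous[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            current = [previous[j] - phi_kk * previous[k - 2 - j] for j in range(k - 1)]
            current.append(phi_kk)
        result.append(phi_kk)
        previous = current
    return result


# Simulated AR(1) series with an assumed coefficient of 0.8.
rng = random.Random(2)
series = [0.0]
for _ in range(1999):
    series.append(0.8 * series[-1] + rng.gauss(0.0, 1.0))

partial = pacf(series, 5)
print([round(p, 3) for p in partial])
```

For such a series the first partial autocorrelation is close to the autoregressive coefficient and the remaining values are close to zero.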

Figure 8.4a: Autocorrelation Function of Flow

   ACF of FLOW (C2), lags 1 to 20 (? = digit illegible in source)

   Lags  1-5      0.889   0.784   0.736   0.691   0.616
   Lags  6-10     0.517   0.460   0.432   0.412   0.354
   Lags 11-15     0.29?   0.248   0.164   0.083   0.001
   Lags 16-20    -0.077  -0.132  -0.164  -0.181  -0.205

Figure 8.4b:  Autocorrelation Function of Iron

   ACF of IRON (C5), lags 1 to 25 (? = digit illegible in source)

   Lags  1-5      0.762   0.699   0.658   0.603   0.570
   Lags  6-10     0.558   0.???   0.469   0.442   0.401
   Lags 11-15     0.379   0.327   0.277   0.200   0.168
   Lags 16-20     0.101   0.04?   0.020  -0.???  -0.???
   Lags 21-25    -0.076  -0.11?  -0.141  -0.124  -0.???

Figure 8.4c:   Partial Autocorrelation Function of Iron

   PACF of IRON (C5), lags 1 to 20 (? = digit illegible in source)

   Lags  1-5      0.699   0.245  -0.065   0.185   0.0?2
   Lags  6-10     0.030   0.011   0.024   0.00?   0.133
   Lags 11-15     0.090  -0.105  -0.015  -0.229  -0.117
   Lags 16-20    -0.034  -0.055  -0.051   0.018   0.021

Figure 8.4d: Autocorrelation Function of Aluminum

   ACF of AL (C7), lags 1 to 25 (? = digit illegible in source)

   Lags  1-5      0.516   0.494   0.424   0.321   0.370
   Lags  6-10     0.235   0.260   0.287   0.22?   0.23?
   Lags 11-15     0.112   0.174   0.143   0.0??   0.024
   Lags 16-20     0.012   0.002   0.0??  -0.041  -0.039
   Lags 21-25    -0.071  -0.051  -0.071  -0.053  -0.097

Figure 8.4e:   Partial Autocorrelation Function of Aluminum

   PACF of AL (C7), lags 1 to 20 (? = digit illegible in source)

   Lags  1-5      0.511   0.308   0.235   0.126   0.101
   Lags  6-10    -0.152  -0.094   0.096   0.040   0.098
   Lags 11-15    -0.028  -0.032  -0.0?7   0.025  -0.086
   Lags 16-20     0.026  -0.017   0.145  -0.068  -0.029

Figure 8.4f:   Autocorrelation Function of Intervals

   ACF of C10 (first differences of days), lags 1 to 25

   Lags  1-5     -0.022   0.097   0.018   0.044   0.083
   Lags  6-10    -0.048   0.104   0.015  -0.007  -0.026
   Lags 11-15    -0.065   0.066  -0.063  -0.009  -0.015
   Lags 16-20    -0.034  -0.012  -0.019  -0.044  -0.021
   Lags 21-25    -0.035   0.001  -0.015   0.011  -0.008

Figure 8.4g:  Autocorrelation Function of pH

   ACF of PH (C3), lags 1 to 25 (? = digit illegible in source)

   Lags  1-5      0.372   0.290   0.321   0.219   0.173
   Lags  6-10     0.1?9   0.176   0.176   0.108   0.059
   Lags 11-15     0.078  -0.015  -0.054  -0.074   0.019
   Lags 16-20    -0.019  -0.068  -0.081  -0.036  -0.016
   Lags 21-25    -0.058  -0.037   0.015   0.049  -0.117

Figure 8.4h:  Partial Autocorrelation Function of pH

   PACF of PH (C3), lags 1 to 20 (? = digit illegible in source)

   Lags  1-5      0.7?4   0.127   0.230  -0.04?  -0.11?
   Lags  6-10     0.030  -0.031  -0.02?   0.070  -0.1??
   Lags 11-15     0.003  -0.124  -0.103  -0.012  -0.029
   Lags 16-20     0.020  -0.172   0.0?4   0.033   0.2?3

Figure 8.4i:  Autocorrelation Function of Manganese

   ACF of MN (C6), lags 1 to 25 (? = digit illegible in source)

   Lags  1-5      0.523   0.399   0.301   0.252   0.223
   Lags  6-10     0.20?   0.19?   0.140   0.0?2   0.009
   Lags 11-15     0.072   0.026   0.092   0.049   0.047
   Lags 16-20     0.058   0.109   0.175   0.171   0.199
   Lags 21-25     0.127   0.187   0.128   0.151   0.157

Figure 8.4j:   Partial Autocorrelation Function of Manganese

   PACF of MN (C6), lags 1 to 20 (? = digit illegible in source)

   Lags  1-5      0.402   0.232  -0.022   0.070   0.108
   Lags  6-10    -0.021   0.027   0.07?  -0.091  -0.100
   Lags 11-15     0.126  -0.107   0.078   0.035  -0.004
   Lags 16-20     0.059   0.127   0.076  -0.035   0.161

Figure 8.4k:  Autocorrelation Function of Sulfate

   ACF of SO4 (C8), lags 1 to 25 (? = digit illegible in source)

   Lags  1-5      0.550   0.567   0.584   0.505   0.46?
   Lags  6-10     0.401   0.414   0.392   0.2?3   0.?31
   Lags 11-15     0.279   0.151   0.271   0.286   0.186
   Lags 16-20     0.280   0.???   0.222   0.2?5   0.138
   Lags 21-25     0.215   0.188   0.126   0.174   0.1?7

Figure 8.4l:  Partial Autocorrelation Function of Sulfate

   PACF of SO4 (C8), lags 1 to 20 (? = digit illegible in source)

   Lags  1-5      0.683   0.180   0.172   0.0??  -0.055
   Lags  6-10     0.056   0.075   0.00?  -0.082  -0.041
   Lags 11-15     0.009   0.07?  -0.169  -0.104   0.092
   Lags 16-20    -0.120  -0.096  -0.021  -0.097  -0.017

Model Fitting to Selected Variables

Since many of the parameters show similar types of variation in terms of their Acf and Pacf
representations, only some parameters were submitted to full time series analysis. The outcomes
are summarized in Tables 8.8 and 8.9. Table 8.8 summarizes the tests performed on each model
fitted to each parameter.  Table 8.9 presents the models as equations relating values at time t to
previous values of the parameter or its associated shock term (random error term, i.e., $a_t$).

There are only 107 observations of flow, and although first differences are indicated by the steep
and steady decline of the Acf and the single large spike at lag 1 in the Pacf, it was decided to try
an AR (1) model to determine whether the AR coefficient $\hat{\phi}_1$ was an adequate proxy for the first
difference. When an AR (1) was fitted to this variable, the estimates of $\hat{\phi}_1$ and the mean were
independent (see r = 0 in No. 1, Table 8.8). The chi-square value of the residual Acf has a
probability between 0.20 and 0.10, so the residuals are not significantly different from white
noise.  There were many spikes left in the Acf of the residuals, but an AR (1) would suffice as a
first approximation. It seems likely that the MA (0,1,1) would be a superior model.
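
The sequence of fitting an AR (1) and testing the residual Acf against white noise can be sketched as follows. This is a minimal illustration under several assumptions: the flow record is simulated, the AR (1) is fitted by ordinary least squares, and the chi-square statistic is computed in the Ljung-Box form over 24 lags with 2 degrees of freedom deducted for the estimated coefficient and constant. The report does not state which form of the statistic or how many lags were used; 24 lags is chosen here only because it gives the 22 degrees of freedom shown for No. 1 in Table 8.8. The helper names are hypothetical.

```python
import random

def acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def fit_ar1(z):
    """Least-squares fit of z_t = phi * z_(t-1) + c + a_t.
    Returns (phi, c, residuals)."""
    x, y = z[:-1], z[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    resid = [yi - (phi * xi + c) for xi, yi in zip(x, y)]
    return phi, c, resid

def ljung_box_q(resid, n_lags):
    """Ljung-Box portmanteau statistic on the residual Acf."""
    n = len(resid)
    r = acf(resid, n_lags)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))

# Simulated stand-in for a flow record of 107 observations (assumed values).
random.seed(1)
z = [1000.0]
for _ in range(106):
    z.append(0.7 * z[-1] + 300.0 + random.gauss(0.0, 100.0))

phi, c, resid = fit_ar1(z)
q = ljung_box_q(resid, n_lags=24)
df = 24 - 2                 # lags minus estimated parameters (assumption)
CHI2_05_DF22 = 33.92        # tabulated 5% critical value for 22 d.f.

print(f"phi = {phi:.3f}, constant = {c:.1f}")
print(f"Q = {q:.2f} on {df} d.f.; exceeds 5% critical value: {q > CHI2_05_DF22}")
```

A statistic below the tabulated critical value indicates residuals that are not significantly different from white noise, which is how the P column of Table 8.8 is read.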

Two models were fitted to the variation in pH. The first was an AR (1,0,0) model. While the
parameter estimates were independent (r = 0, No. 2, Table 8.8), the chi-square of the Acf of the
residuals was significantly different from that of white noise. There was only one significant
spike, at lag 3. The second model was an AR/MA (1,1,1), which could more adequately represent variation in
pH.  The AR and MA coefficients were not, however, independent (r = 0.64, No. 3 in Table 8.8).
The residual Acf yielded a chi-square value that is not significantly different from that of white
noise.  This model is indeed adequate to represent the series and there were no significant spikes
in the Acf or Pacf of the residuals. The residual standard deviation showed a small improvement
over the original standard deviation of the series (0.140 and 0.139 versus 0.151, see Table 8.8).

Table 8.8: Tests of the Different Models for Each Parameter

                                                                                          Standard Deviation
No.  Parameter  Model           r (a)       Chi-sq   d.f.  P                  Spike        Resid.    Original
 1.  Flow       AR (1,0,0)      0            29.16    22   0.20 > P > 0.10    many         630.43    931.14
 2.  pH         AR (1,0,0)      0            34.85    22   0.05 > P > 0.02    1@3            0.140     0.151
 3.  pH         AR/MA (1,1,1)   0.64         25.11    22   0.30 > P > 0.20    none           0.139
 4.  Iron       MA (0,1,1)      0            17.17    23   0.90 > P > 0.80    none           8.069    13.026
 5.  Iron       AR (1,1,0)      0            27.68    23   0.30 > P > 0.20    1@2            8.349
 6.  Mn         MA (0,0,1)      0            38.27    26   0.02 > P > 0.01    1@2            0.902     0.835
 7.  Mn         AR (1,0,0)      0            28.44    22   0.20 > P > 0.10    1@2            0.875
 8.  Mn         MA (0,1,1)      0            30.79    23   0.20 > P > 0.10    1@10           0.721
 9.  SO4        MA (0,1,1)      0            41.05    23   0.02 > P > 0.01    @3,6,9,12     33.01     44.85
10.  SO4        AR (1,0,0)      0           108.09    22   < 0.001            @1,2,3+       36.96     44.85
11.  SO4        AR (1,0,0)      [3] (b)      89.70    23   < 0.001            many          46.90
12.  SO4        AR (1,1,0)      0.88 (c)     32.72    21   0.05 > P > 0.02    @6,9          32.83
13.  SO4        AR (2,1,1)      > 0.7 (d)    32.85    21   0.05 > P > 0.02    @6,9,12       32.72

 a. Correlation among the parameter estimates.
 b. [3] = a seasonal term at lag 3.
 c. r12 = 0.88; r13 = 0.4; r23 = -0.24
 d. All coefficients highly positively correlated (i.e., redundant), r > 0.7.

Acidity was considered sufficiently similar to pH that, judging from the Acf and Pacf of the
original series, it did not require any special testing. From the Acf and Pacf and from
previous model fitting, an AR/MA (1,1,1) or an MA (0,1,1) would be likely to adequately represent
this variable.

Iron showed the same steep decline as flow as well as the same large spike at lag 1 in the Pacf.
For this reason, two models were fitted to variation in this parameter. The MA (0,1,1) easily met
all tests (see No. 4 in Table 8.8) and is an adequate representative model.  The AR (1,1,0) met
most of the tests, but not as successfully (see probability  for chi-square, No.  5, Table 8.8). There
also remained a significant spike at lag 2 in both the Acf and Pacf of the residuals from fitting
the AR model.  There was a large reduction in the standard deviation compared with that of
the original series (see No. 4, Table 8.8). Ferrous iron was not analyzed because it
resembled total iron so closely that similar results would be expected.

Manganese appears to possess a mixture of the characteristics of iron and pH in its Acf and Pacf.
For this reason, three models were tried. The simple MA (0,0,1) was not adequate in terms of
the chi-square value of the Acf of the residuals (No. 6, Table 8.8).  It also possessed a significant
spike at lag 2.  When a simple AR (1,0,0) model was used, the chi-square of the residuals was
not significantly different from that of white noise and the standard deviation of the residuals
was reduced from that of the simple MA. There was still a significant spike at lag 2 in both the
Acf and Pacf of the residuals.

As a check, a simple MA (0,0,1) was fitted to the first differences of manganese (i.e., an MA
(0,1,1)).  The results were similar, although the spike at lag 2 disappeared and a weak spike
appeared at lag 10. This was determined to be too far along the Acf to be ignored.  The standard
deviation of the residuals was much improved at 0.721 (No. 8, Table 8.8), about 76% of that of
the original series (Table 8.1, wherein the standard deviation is 0.9498 from N = 107).
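
The comparison of residual and original standard deviations is simple arithmetic, sketched below with the manganese values quoted above and in Table 8.8. The function name `sd_ratio` is a hypothetical helper; the choice of 0.9498 (Table 8.1) as the reference value follows the preceding paragraph.

```python
def sd_ratio(residual_sd, original_sd):
    """Residual standard deviation as a fraction of the original."""
    return residual_sd / original_sd

ORIGINAL_SD_MN = 0.9498          # Table 8.1, N = 107
residual_sd_mn = {               # Nos. 6 to 8, Table 8.8
    "MA (0,0,1)": 0.902,
    "AR (1,0,0)": 0.875,
    "MA (0,1,1)": 0.721,
}

for model, sd in residual_sd_mn.items():
    ratio = sd_ratio(sd, ORIGINAL_SD_MN)
    print(f"{model}: residual SD is {100 * ratio:.1f}% of original "
          f"({100 * (1 - ratio):.1f}% reduction)")
```

For the MA (0,1,1) the ratio is about 0.76, i.e., a reduction of roughly 24% in the standard deviation.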

Sulfate also showed a mixture of types in its Acf and Pacf and, for this reason, was explored at
greater length. First, the MA (0,1,1) was fitted because this model appears to fit many cases in
previous reports. The chi-square of the residuals was significantly different from that of white
noise. There were many spikes in the Acf and Pacf at lags 3, 6, 9, and 12, implying a seasonal
repetition. Next, an AR (1,0,0) was fitted to see how much of the trend, shown by the large spike
at lag 1 in the Pacf, could be reduced (No. 10, Table 8.8). The chi-square value was very large
(P << 0.001) and there were many spikes at various lags in the Acf and Pacf of the residuals.

The next model applied was an AR (1,0,0) with a seasonal term at lag 3. This model proved
ineffective because the chi-square value of the residuals remained very large (P << 0.001;
see No. 11, Table 8.8). The next step was to apply an AR (1,1,0) with a seasonal AR of
lag 2; lag 2 of the first differences is equivalent to lag 3 in the original series (No. 12 in Table
8.8). The result was a strong improvement in the value of chi-square, but it was still significantly
different from that of white noise (0.05 > P > 0.02); the Acf and Pacf possessed spikes at lags 6 and
9. Finally, an AR (2,1,1) was applied, and all the coefficients ($\hat{\phi}_1$, $\hat{\phi}_2$, $\hat{\phi}_3$) were highly
correlated and showed strong redundancies (No. 13, Table 8.8).  The chi-square was essentially
the same as in the previous model, and the residuals possessed significantly large spikes at lags 6, 9 and 12.
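
Spikes at regularly spaced lags, such as those reported for sulfate, can be screened for mechanically. The sketch below is illustrative only. It assumes the common approximate criterion that a residual autocorrelation is significant when its absolute value exceeds 2 divided by the square root of the number of observations; the report does not state the criterion it used. The residual Acf values are invented for the example, and both function names are hypothetical.

```python
import math

def significant_lags(acf_values, n_obs, z=2.0):
    """Lags (1-based) whose autocorrelation exceeds z / sqrt(n_obs) in absolute value."""
    bound = z / math.sqrt(n_obs)
    return [lag for lag, r in enumerate(acf_values, start=1) if abs(r) > bound]

def common_period(lags):
    """Greatest common divisor of the flagged lags (0 if none are flagged)."""
    g = 0
    for lag in lags:
        g = math.gcd(g, lag)
    return g

# Invented residual Acf for lags 1 to 12 (not taken from the report).
resid_acf = [0.05, -0.08, 0.27, 0.02, -0.04, 0.24,
             0.06, -0.01, 0.22, 0.03, -0.07, 0.21]

lags = significant_lags(resid_acf, n_obs=106)
print("significant lags:", lags)
print("common period:", common_period(lags))
```

With these values the flagged lags are 3, 6, 9, and 12, and their common period of 3 is the kind of evidence that points to a seasonal term.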

It is important at this stage, to examine what the results of these models mean  in terms of
equations. When a suitable model is found, the coefficients should have some implications of
substantive value.  In most cases, a simple model possessing an equation that is easy to interpret
is adequate.  An AR (1,0,0) such as that for flow or pH is an example. As the  models become
more complicated, interpretation of the equations becomes more difficult. Unless there are
definite reasons that a more comprehensive model would be appropriate, it is prudent to use a
reasonably simple model.
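
The step from a model label to its equation can be made explicit by expanding the differenced models. The derivation below is a sketch that assumes the usual Box-Jenkins notation, in which B is the backshift operator (B Z_t = Z_{t-1}) and the MA coefficient enters with a negative sign; this notation is an assumption and is not used in the tables. The coefficient values are those implied by equations No. 5 and No. 4 of Table 8.9.

```latex
\begin{align*}
% AR (1,1,0): one AR term applied to the first differences
(1 - \phi_1 B)(1 - B)\, Z_t &= a_t \\
Z_t &= (1 + \phi_1)\, Z_{t-1} - \phi_1 Z_{t-2} + a_t \\
\text{with } \phi_1 = -0.371:\quad
Z_t &= 0.629\, Z_{t-1} + 0.371\, Z_{t-2} + a_t \\[6pt]
% MA (0,1,1): one MA term applied to the first differences
(1 - B)\, Z_t &= (1 - \theta_1 B)\, a_t \\
Z_t &= Z_{t-1} + a_t - \theta_1 a_{t-1} \\
\text{with } \theta_1 = 0.525:\quad
Z_t &= Z_{t-1} + a_t - 0.525\, a_{t-1}
\end{align*}
```

In both cases the two coefficients on the lagged Z terms sum to 1, which is the mark of a first difference in the expanded equation.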

The equations for the models used in Table 8.8 are summarized in Table 8.9.  The first eleven
equations are relatively simple.  The last two equations, however, are obviously complicated,
and  it was decided to stop the analysis at this stage. It is suspected that the simple AR or MA
models of the first differences are sufficient to represent most of the parameters.  However, a
seasonal model of some kind is required for sulfate. From the Acf and Pacf of the original
series, an AR model of first differences is likely to be most parsimonious.  It will require an
additional seasonal term (possibly an AR at lag 3) to remove the remaining significant spikes.
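
Once an equation has been selected, it can be used to produce one-step-ahead forecasts. The sketch below applies the MA (0,1,1) form listed for iron (No. 4 in Table 8.9) to an invented series of concentrations. It assumes that the first shock is zero, which is a start-up convention chosen here for simplicity, and the function name is hypothetical.

```python
def ma011_one_step_forecasts(z, theta):
    """One-step-ahead forecasts for Z_t = Z_(t-1) + a_t - theta * a_(t-1).

    The shock before the first forecast is taken as zero (assumption).
    Returns the in-sample forecasts for z[1:] and the forecast of the
    next, unobserved value.
    """
    forecasts = []
    prev_shock = 0.0
    for t in range(1, len(z)):
        forecast = z[t - 1] - theta * prev_shock
        forecasts.append(forecast)
        prev_shock = z[t] - forecast      # estimated shock a_t
    next_forecast = z[-1] - theta * prev_shock
    return forecasts, next_forecast

iron = [42.0, 45.0, 39.0, 41.0, 47.0]     # invented concentrations
fc, nxt = ma011_one_step_forecasts(iron, theta=0.525)
print([round(v, 3) for v in fc], round(nxt, 3))
```

Each forecast is equivalent to an exponentially weighted average of past observations with weight 1 - 0.525 = 0.475 on the most recent value, which is one reason this simple model is easy to interpret.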

Table 8.9:    Model Equations for the Variables (see Table 8.8)

No.  Variable  Model                 Equation
 1.  Flow      AR (1)                $Z_t = 0.742\,Z_{t-1} + 1487.2 + a_t$
 2.  pH        AR (1)                $Z_t = 0.372\,Z_{t-1} + 3.235 + a_t$
 3.  pH        AR/MA (1,1,1)         $Z_t = 1.121\,Z_{t-1} - 0.121\,Z_{t-2} + a_t - 0.808\,a_{t-1}$
 4.  Iron      MA (0,1,1)            $Z_t = Z_{t-1} + a_t - 0.525\,a_{t-1}$
 5.  Iron      AR (1,1,0)            $Z_t = 0.629\,Z_{t-1} + 0.371\,Z_{t-2} + a_t$
 6.  Mn        MA (0,0,1)            $Z_t = 4.90 + a_t + 0.270\,a_{t-1}$
 7.  Mn        AR (1,0,0)            $Z_t = 0.404\,Z_{t-1} + 4.913 + a_t$
 8.  Mn        MA (0,1,1)            $Z_t = Z_{t-1} + a_t + 0.611\,a_{t-1}$
 9.  SO4       MA (0,1,1)            $Z_t = Z_{t-1} + a_t + 0.725\,a_{t-1}$
10.  SO4       AR (1,0,0)            $Z_t = 0.550\,Z_{t-1} + 296.9 + a_t$
11.  SO4       AR (1,0,0)            $Z_t = 0.572\,Z_{t-1} + a_t$
12.  SO4       AR (1,1,0) (2,0,0)    $Z_t = 1.118\,Z_{t-1} - 0.941\,Z_{t-2} - 1.281\,Z_{t-3} - 0.153\,Z_{t-4} + a_t$
13.  SO4       AR/MA (2,1,1)         $Z_t = 1.380\,Z_{t-1} + 0.188\,Z_{t-2} + 0.182\,Z_{t-3} + a_t - 0.424\,a_{t-1}$

Summary

The first important characteristic of the variables from the Markson site is the lack of wide
variation except in the flow variable.  The second characteristic is the lack of any strong
relationships between pairs of variables.  The only high r values are the expected correlations
between iron and ferrous iron, flow and ferrous iron, and flow and iron. The association between
iron and ferrous iron is positive, whereas the associations of both with flow are negative (i.e., high flows may lead to dilution
of iron).  The most striking feature of the time series graphs is the high flow over the period of
February 1986 to July 1986, particularly because the flow is in logs.  In general, this does not
show up in any other variable. There are two very peculiar features which should be
emphasized. First, there does not appear to be any reflection of this high flow event in most of
the other variables (iron and ferrous iron  are exceptions); second, pH shows no relationship to
acidity or sulfate.

The most appropriate time series models  require a first difference to remove any trend. For
many cases, this may be adequately accomplished by an AR (1) term in the models.  The
residual, after fitting this kind of model, is a close approximation to white noise (i.e., random
variation).  The residual could, in most cases, be modeled fairly adequately by the usual MA
(0,1,1) moving average model.  This implies that once the trend is removed, the remaining
variation is similar to a random walk. This could account for whatever relationships there are
between the variables, and suggests that linear correlation is not adequate to evaluate the
relationships that do exist between the variables. The trend could well be due to the extreme
event in the flow variable.

Sulfate, however, does appear to show some possible indications of a seasonal pattern. It seems
to possess some irregularities which go beyond the "random walk" type of residual. A number
of models were tried in this case but none did any better than the MA.  Nevertheless, there were
many spikes in the Acf of the residual from the first differenced series at what appear to be
regular intervals  of lags 3,  6, 9, and 12.  This implies a seasonal structure at three period
intervals (which  is a four period interval in the original series). The more complex models failed
one or more of the test criteria and rather than complicate the issue further, the analysis was
terminated.

Chapter 9: Statistical Summary and Review of Quality Control Limits

Establishment of baseline pollution loads for a coal remining permit requires proper sampling
and chemical analysis of pre-existing abandoned mine discharges, and the appropriate statistical
analysis of flow, water quality, and pollution load data. The term "proper sampling"  is taken in
two contexts: (1) collection and analysis of surface water and groundwater samples, including
field measurements of flow and water quality parameters, sample preservation, transportation
and storage,  and chemical analyses,  and (2) collection of a sufficient number of samples with
sampling period duration and intervals that adequately represent the variations in flow and water
quality throughout the water year.  Abundant scientific literature exists on collection and
analytical procedures for water samples. Guidelines and protocols for water sample collection
from EPA, the U.S. Geological Survey (USGS) and other sources are compiled in Table 9.1, and
are discussed briefly in Chapter 1.

Table 9.1:    Guidance and Protocols For Water Sample Collection

 #   Type of Resource          Source                            Title and URL
 1   Field Procedures          USGS                              National Field Manual for the Collection of Water Quality Data
                                                                 http://h2o.usgs.gov/owq/Fieldprocedures.html
 2   Field Operations Manual   EPA                               EMAP Surface Waters Field Operations Manual for Lakes: June, 1997
                                                                 EPA/620/R-97/001
                                                                 http://www.epa.gov/emjulte/html/pubs/docs/surfwatr/97fopsman.htm
 3   Monitoring Guidance       EPA                               Office of Water NEP Monitoring Guidance EPA-842-B-92-004
                                                                 http://www.epa.gov/OWOW/estuaries/guidance/
 4   Procedures                EPA/US Army Corps of Engineers    Procedures for Handling and Chemical Analysis of Sediment and
                                                                 Water Samples. EPA/CD-81-1
                                                                 http://www.epa.gov/owgwwtrl/info/PubList/monitoring/docs/027.pdf
 5   Protocols                 USGS                              National Water-Quality Assessment (NAWQA) Method and Guideline
                                                                 Protocols
                                                                 http://wwwrvares.er.usgs.gov/nawqa/protocols/doc_list.html
 6   Sampling Techniques       EPA                               Ground Water Sampling EPA: A Workshop Summary Nov. 30 - Dec. 2,
                                                                 1993. EPA/600/R-94/205
                                                                 http://www.epa.gov/swerustl/cat/gwwkshop.pdf
 7                             USGS                              Publications on Techniques of Water Resource Investigations
                                                                 http://water.usgs.gov/owq/FieldManual/chapterl/twri.html
 8   Sample Preservation       EPA/Bureau of Mining and          Fixing Water Samples Bureau of Mines and Reclamation
                               Reclamation                       ID# 562-3200-203 May 1, 1997
                                                                 http://www.dep.state.pa.us/dep/subject/All Final Techinal guidance/bmr/562-3200-203.htm
 9   Sampling                  USGS                              Quality-control design for surface-water sampling in the
                                                                 National Water-Quality Assessment program (USGS Open File
                                                                 Report 97-223)
                                                                 http://wwwrvares.er.usgs.gov/nawqa/protocols/doc_list.html
10   Sampling                  USGS                              Ground-Water Data-Collection Protocols and Procedures for the
                                                                 National Water-Quality Assessment Program: Collection and
                                                                 Documentation of Water-Quality Samples and Related Data
                                                                 (USGS Open-File Report 95-399)
                                                                 http://wwwrvares.er.usgs.gov/nawqa/protocols/doc_list.html
11   Sampling                  USGS                              Field Guide to Collecting and processing samples of
                                                                 stream-water samples for the National-Water Quality Assessment
                                                                 program (USGS Open File Report 94-458)
                                                                 http://wwwrvares.er.usgs.gov/nawqa/protocols/doc_list.html

Most of this report and EPA's Coal Remining Statistical Support Document (EPA-821-B-00-001)
are devoted to discussion of the second context (sample period duration and interval) of
proper sampling of pre-existing discharges and to the associated statistical analyses of the
sample data.

The baseline pollution load is essentially a statistical summary of a data set generally consisting
of 12 or more samples collected prior to issuance of a remining permit.  Chapter 2 of this report
provides an overview and explanation of exploratory and confirmatory statistical methods that
may be used in establishing the baseline pollution load. The fundamentals of univariate,
bivariate, multivariate, and time-series statistical analyses also are outlined in Chapter 2. The
algorithm for analysis of mine drainage discharge data (see Figure 3.1) developed in 1987 by Dr.
J.C. Griffiths and other authors of this report is described step by step in Chapter 3, and also is
included in Chapter 1 of the Coal Remining Statistical Support Document. This algorithm was
used in conducting the univariate, bivariate and time series analyses of the six relatively long
term mine drainage data sets described in Chapters 4 through 8 and Appendices A through F of
this report.  Chapter 5 of the Coal Remining Statistical Support Document contains  an additional
10-20 years of data on some of these six sites including data collected prior to, during, and post-
remining.

The sampling plan, data collection/organization and statistical  analysis components of
establishing the baseline pollution load should be integrated in a continuous process. In general,
abandoned mine discharges flow continuously; thus, it should not be difficult to collect an
adequate number of samples.  However, these discharges frequently exhibit significant variations
in flow and water quality, and logistical problems may be encountered in attempting to capture
the full range and distribution of seasonal variations. Ideally, there are no missing data, and a
sufficient number of samples are collected throughout the water year, at equal sampling intervals
that are small enough to capture the range of natural seasonal variations.  Continuous flow
recorders and automated water quality samplers may be part of that ideal world, but they are
rarely available or justifiable for use in remining permitting activities. Typically in routine
remining permit sampling, adjustments must be made in data organization and analysis to
account for missing data, unequal sampling intervals, data that are not normally distributed or
that lack  expression of the true extremes, and other problems.

This chapter summarizes the findings of the statistical analyses of abandoned mine discharge
data contained in Chapters 4 through 8 and Appendices A through F of this report.  This
summary includes examples of sampling plans, data organization, univariate analysis, bivariate
analysis and time series analysis, with emphasis on the practical applications of the time series.
The chapter concludes with a review of the use of quality control limits for establishing and
monitoring baseline pollution load at remining sites.

Sampling

The sampling plan is critical in all statistical studies and is one of the most difficult problems to
resolve.  One problem is the usual compromise between the samples one would like to collect
and the cost of collecting them. From a research point of view, to perform a time series analysis
that correctly models the variation of a parameter (e.g.,  flow), it is necessary to obtain
observations over several years so that the model  becomes truly representative.  Such large
collections of data are rare and the six long term data sets presented in this report are both
atypical and best-case scenarios.

Another requirement that is critical for time series analysis is that the samples should be
collected at equal time intervals.  This criterion is almost impossible to achieve in routine
sampling practice.  For example, when an extreme event occurs, it is usually for at most a few
days, and the common sampling intervals of one week, two weeks, or one month could easily
miss the event.  Secondly, if the event is a heavy snowfall or a flood, it may be physically
impossible to access the sample location. The data analyzed for the studies presented in this
report address these problems and other causes of unequal intervals and missing or erroneous
data (e.g., loss of sample, incorrect data entry).

It is advisable to establish a sampling plan that recognizes these difficulties.  It is also essential
to examine the data  in detail, as described in the earlier chapters of this report.  It should be
recognized that because of the nature of a typical  data set, a rigorous statistical analysis must not
be taken too far; one must compromise by being as accurate as possible without requiring
impossible precision. (It is, theoretically, always  possible to measure the degree of precision by
replicate  sampling although, in practice, replicate sampling may be too costly).  The following
guidelines are, therefore, a compromise and are presented as recommended guidelines only.

Sampling should be representative, cover a period of at least one year, and include both high and
low flow periods within that year.  Suppose 12 samples are taken at a rate of one per month  for a
year.  This scenario may not adequately represent baseline conditions because local extreme
storm events typically occur within a few days and can result in a great range in variability
between monthly samples. Extreme events are often missed with this sampling arrangement.

One recommendation for representative sample collection within the Appalachian Basin would
be to use stratified sampling; divide the year into three periods of about equal length, arranged to
cover high and low flow periods as follows:

    Period                   Flow condition       Length
    January - March          high flow            90 days (91 days during leap year)
    April - June             intermediate flow    91 days
    September - November     low flow             91 days

The months of July, August, and December are eliminated from this recommended scenario
because these months typically don't include extremes and include events covered during the
other three periods. Taking one sample every 15 days within each of the three intervals would
equal a total of 18 samples. Of course, to determine initial baseline pollution loading, it is
preferable to increase the number of sample intervals and to extend the sampling period for more
than a single year.
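
The recommended scenario can be laid out as a calendar of sampling dates. The sketch below is one possible arrangement, not a prescribed schedule: the function name and the choice to take the first sample 7 days into each period are assumptions made here so that each period yields six samples at 15-day spacing, giving the total of 18 mentioned above.

```python
from datetime import date, timedelta

def stratified_schedule(year, step_days=15, first_offset_days=7):
    """Sampling dates every step_days within each of the three flow periods.

    first_offset_days (assumed) places the first sample of each period
    a week after the period begins.
    """
    periods = {
        "high flow": (date(year, 1, 1), date(year, 3, 31)),
        "intermediate flow": (date(year, 4, 1), date(year, 6, 30)),
        "low flow": (date(year, 9, 1), date(year, 11, 30)),
    }
    schedule = {}
    for label, (start, end) in periods.items():
        day = start + timedelta(days=first_offset_days)
        dates = []
        while day <= end:
            dates.append(day)
            day += timedelta(days=step_days)
        schedule[label] = dates
    return schedule

plan = stratified_schedule(2001)
total_samples = sum(len(dates) for dates in plan.values())
for label, dates in plan.items():
    print(label, [d.isoformat() for d in dates])
print("total samples:", total_samples)
```

Starting each period on its first day instead would add a seventh sample to the 91-day periods, so the offset should be chosen to suit site access and the sampling budget.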

Data Preparation and Organization

It is always advisable to examine raw data before submitting it to analysis. The presence of
unusual values and missing data usually requires some kind of action.  These and other features of
the data set are best examined by graphical procedure. A graph of discharge or log discharge in
gallons per minute versus days can be very helpful in identifying data gaps and unusual values
(e.g., Figure 9.1).

Figure 9.1: Example Graph Log Discharge versus Days (Also Figure 4.2)
     Vertical lines indicate more than 20 days between sampling events
     [Plot of log discharge against sample number for Arnot 001, Arnot 003, and Arnot 004,
     January 1980 through April 1983]

Figure 9.1 can be used to observe two kinds of information:

1)  Missing values.  The distribution of missing values is critical to more sophisticated analysis
    (particularly, time series). In general, a few missing values are not very serious, but if there
    are many and if they occur in clusters (Figure 9.1), the omissions may make further analysis
    very inexact.

    Missing values frequently occur during extreme events because during these events, sample
    sites are difficult to access. Sometimes, if the missing values are few and widely distributed,
    they may be replaced by the mean (if the frequency distribution of the data is reasonably
    symmetrical), or by the median (if the frequency distribution of data is extremely skewed).

    In Chapters 7 and 8, a frequency distribution of the first differences between days of
    observation was constructed. Once constructed, both the number and the concentration
    density of missing observations were clearly displayed as the frequency of intervals of
    different lengths between observations. The variation for the Fisher site is from one day
    (difference = 0) to an interval of 104 days.  The mean (26.7 days) is very nearly equal to the
    median (26.5 days); thus, the distribution is roughly symmetrical around the expected
    sampling interval of 28 days.  The central 50 percent of the distribution (Q1 - Q3) lies
    between 12.3 and 33 days.  The most serious discrepancies, however, are the five
    observations between 70 and 104 days (four of these are 90 days or more).  These large gaps
    in the data preclude rigorous time series analysis, which requires a very close approximation
    to equal intervals between observations.

2) Extreme Values.  The second kind of preliminary observation is to examine the data for
   extreme values (usually on the high side). Again, the distribution of extremes is important.
   Prior to examination of this data, it was believed that extreme flows would occur at regular
   seasonal intervals, for example, during the Spring melt. However, examination of the data
   presented in Figure 4.2 shows that extreme events were spread over periods from February to
   April (for Spring melts) and from May through June (for intense summer rains, often as
   thunderstorms). These wide spreads of extreme events, together with missing data (which
   often occurred during extreme events), made it very difficult to detect any expected true
   seasonal effects.

   One further point concerning extremes is that these extremes tend to introduce
   strong skewness (asymmetry) into the frequency distribution. This skewness is usually
   positive (i.e., extreme values are at the high end of the data distribution). It is conventional
   to apply a transformation to reduce this skewness, and logarithmic transformation is usually
   the most effective. It is sometimes questionable, however, to what extent the effects  of
   extreme events should be suppressed if at all.  Thus, it is prudent to examine the raw  data
   very carefully to decide whether transformation is appropriate.

   Another effect of expressing variables in logs instead of concentration is shown in Figure
   4.8, where manganese (mg/L) is plotted against log transformed discharge (cubic feet per
   second, cfs).  There is an obvious linear association between the two variables.  If discharge
   is expressed arithmetically in cfs (see Figure 4.9), the association is curvilinear.  However,
   there is still a strong association between the variables. Note also that there are several
   outliers that appear to deviate from the trend. Expression of the log transformed data tends
   to suppress the  effects of extreme outliers.
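
The check on missing values described in item 1 (a frequency distribution of the first differences between sampling days) might be sketched as follows. The function name `interval_summary`, the `gap_threshold` of 70 days, the quartile method, and the example sampling days are all assumptions for illustration; none of them are data or software from the report.

```python
from statistics import mean, median, quantiles

def interval_summary(sample_days, gap_threshold=70):
    """Summarize the spacing (first differences) between sampling days."""
    days = sorted(sample_days)
    gaps = [later - earlier for earlier, later in zip(days, days[1:])]
    # Assumption: "inclusive" quartiles; other conventions give slightly
    # different Q1 and Q3 for small samples.
    q1, _, q3 = quantiles(gaps, n=4, method="inclusive")
    return {
        "n_intervals": len(gaps),
        "mean": mean(gaps),
        "median": median(gaps),
        "q1": q1,
        "q3": q3,
        "max": max(gaps),
        "long_gaps": [g for g in gaps if g >= gap_threshold],
    }

# Hypothetical sampling days, counted from the first visit.
example_days = [0, 28, 55, 84, 112, 205, 233, 260]
summary = interval_summary(example_days)
```

A mean interval well above the median, or any entries in `long_gaps`, would suggest that the series departs too far from equal spacing for rigorous time series analysis.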

Univariate Analysis

The main features of the univariate statistical  analyses described in Chapters 4 through 8 are the
frequency distributions of the water quality parameters and flow measurement data,  and the
tables of summary  statistics (e.g., Tables 4.1,  5.1, 6.1, 7.2, and 8.2). These tables typically
include the following summary statistics: number of observations (N), number of missing
observations (N*),  mean, median,  10 % trimmed mean, standard deviation,  standard error of the
mean, minimum and maximum values (i.e., range) and quartiles.  Several of these summary
statistics are included in Table  1.2a of EPA's  Coal Remining Statistical Support Document and
are incorporated as conditions of remining permits (i.e., median, range, and quartiles).

An additional statistic, the coefficient of variation (CV), is included in the tables in Chapters 4-8.
The coefficient of variation, usually expressed in percent (CV%), is defined as the ratio of the
standard deviation to the mean multiplied by 100. This is a useful approximate guide to the
degree of variation in a parameter.  In general, a CV < 30% represents a stable, in-control
variable. In Chapters 4 through 8, most of the parameters showed much larger variation,
principally because of the effects of extreme events.  Use of the coefficient of variation with log
transformed data may result in extreme distortion because the transformation leads to a mean of
small value, resulting in a divisor of the ratio that is small and thus a CV that is inflated.
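
A minimal sketch of the CV% calculation, and of the distortion that can arise when it is applied to log transformed data, is given below. The flow values are hypothetical and the function name `cv_percent` is an assumption introduced here.

```python
from math import log10
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation in percent: 100 * standard deviation / mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical flows in gallons per minute, with one extreme event.
flows = [0.5, 0.8, 1.0, 1.2, 1.5, 2.0, 9.0]

cv_raw = cv_percent(flows)
# The logs of these flows have a mean close to zero, so the ratio is inflated.
cv_log = cv_percent([log10(v) for v in flows])
```

For this example both values are far above the 30% guide, and the CV of the logs is larger than the CV of the raw flows even though the transformation has reduced the influence of the extreme value.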

In Chapters 4 through 8, the frequency distributions of many water quality and pollution load
variables were found to be normally distributed, or at least symmetrically distributed, around a
value of central tendency (see, for example, Figures 4.5 and 8.1e).  Numerous other variables had
frequency distributions that exhibited positive skewness.  In Figure 5.3a, for example, there are
two single observations for discharge at 50 and 80-85 gallons per minute which represent
extreme events in flow. These values introduce a strong positive skewness in the histogram
towards high values.  In Figure 5.3b, discharge is transformed to log flow and the skewness is
now towards the negative side  (i.e., the transformation has over-corrected for positive skewness).
In such cases, it is best not to log transform the data.  Acidity (mg/L, Figure 5.4a) is somewhat
symmetrical and, as would be expected, log transformation introduces a strong negative
skewness (Figure 5.4b). Again, no transformation should be used.

It is possible, of course, to use a less pronounced transformation (such as the square root of the
variable) that may avoid the over-correction that can  result from logarithmic transformation.
The use of various transformations is reviewed by Tukey (1977, Chapter 3), Velleman and
Hoaglin (1981, p. 46-49), and Box and Cox (1964).
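
One way to decide among no transformation, a square root, and a logarithm is to compare the skewness of each version of the data. The sketch below uses the simple moment coefficient of skewness; the function names and the acidity values are hypothetical and are not taken from the report.

```python
from math import log10, sqrt
from statistics import mean

def skewness(values):
    """Moment coefficient of skewness (population form)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def compare_transformations(values):
    """Skewness of the raw, square-root, and log10 versions of positive data."""
    return {
        "raw": skewness(values),
        "sqrt": skewness([sqrt(v) for v in values]),
        "log10": skewness([log10(v) for v in values]),
    }

# Hypothetical, symmetrical acidity values in mg/L.
acidity = [40, 55, 60, 65, 70, 75, 80, 85, 100]
result = compare_transformations(acidity)
```

Because these raw values are already symmetrical, both transformations introduce negative skewness, and the logarithm does so more strongly than the square root, which mirrors the over-correction described for Figures 5.3b and 5.4b.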

Bivariate Analysis

Bivariate analysis is used to examine the relationship between pairs of variables.  One expects,
for example, pH and  acidity or sulfate to be inversely related (as acidity increases pH declines).
In the case of calcium and manganese, on the other hand, one expects positive correlation (both
either increase or decrease together).  The correlation coefficient (r) is used to represent the
(linear) relationship between any pair of variables.  The coefficient of determination (r²),
however, is a better measure of the intensity of the association between a pair of variables.  For
example, an r = 0.7 seems large because the range of r is from -1 to +1. However, r = 0.7 means
that r² = 0.49, or that there is 49% in common between the two variables, with 51% of the
variation "unexplained" by the association.  In practice, it would be necessary to have an r >
0.8 (i.e., > 64% in common) to claim that a strong association exists.  (See Chapter 8 for
additional discussion.)

Another feature that can be evaluated using r and r² is the statistical test that accompanies a
specific value of r. For example, the probability statement that for a sample size of N = 174 (see
Chapter 6), a value of r > 0.124 is significantly different from zero at the 5 percent probability
level, should be accompanied by the corresponding value of r². In Table 6.3, the correlation
coefficient between pH and acidity (r = -0.365) comfortably exceeds the critical value of r
(±0.124); thus, it is statistically significant. Nevertheless, the corresponding r² = 0.133
indicates that only 13.3% of the variation is common to both variables.
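
The distinction between statistical significance and strength of association can be illustrated with the values quoted above. The report does not state how the 0.124 threshold was computed; the one-sided, large-sample normal approximation used in `approx_critical_r` below is an assumption that happens to give a similar figure, and both function names are introduced here for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def r_squared_percent(r):
    """Percent of variation shared by two variables with correlation r."""
    return 100.0 * r * r

def approx_critical_r(n, alpha=0.05):
    """Approximate smallest |r| significant at `alpha` (one-sided, large n).

    Assumption: the normal quantile z stands in for Student's t with
    n - 2 degrees of freedom, using r = z / sqrt(z**2 + n - 2).
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    return z / sqrt(z * z + n - 2)

r_ph_acidity = -0.365  # Table 6.3 value quoted in the text
n_obs = 174            # sample size quoted in the text

r_crit = approx_critical_r(n_obs)
significant = abs(r_ph_acidity) > r_crit
shared = r_squared_percent(r_ph_acidity)
```

Under these assumptions the correlation is significant, yet `shared` is only about 13%, which is the point made in the text.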

Bivariate analysis of the Ernest site data also showed a strong association between all pairs of
the load variables (r² > 80%, see Figures 6.5a, b, and c).  This clearly suggests that because
discharge is used as a  common factor in converting concentration to load, it tends to overwhelm
the relationships among the other variables. This problem with pollution load variables also was
detected in the analysis of data from the other sites described in Chapters 4 through 8.

In Figures 6.5a and 6.5c, the variation between the parameters increases as their values increase.
This phenomenon is called heteroscedasticity and, in general, it is advisable to plot the logs of
the values to make them homoscedastic. Since heteroscedastic parameters show a difference in
variability with change in values, no probability statement should be made without
transformation to make the variables homoscedastic. Peculiarly, the change from
heteroscedasticity to homoscedasticity does not lead to a major change in the  value of r.
However, it does make the probability statements more reliable.

One more avenue was explored during bivariate analyses in Chapters 4 through 8, and that was
to determine whether there is any lag in association between parameter pairs.  The cross-
correlation function is used for this purpose. The cross-correlation function calculates the linear
association between observations 0 to t days apart, and thus gives an indication of when the
association is strongest. The range of t is from $-(\sqrt{N} + 10)$ to $+(\sqrt{N} + 10)$, where N is the
number of observations in the series.  For example, if an  event occurs that affects  one parameter
immediately and affects another parameter five observations later, the linear correlation
coefficient may be quite low at zero lag but may show a strong association after a five day lag.
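
A simple version of the cross-correlation function is sketched below. It is not the routine used in the report; the function names, the scaling by the full-series sums of squares, and the example series (in which the second parameter repeats the first five observations later) are assumptions for illustration.

```python
from math import sqrt
from statistics import mean

def cross_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag], scaled by full-series spread."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sx = sqrt(sum((v - mx) ** 2 for v in x))
    sy = sqrt(sum((v - my) ** 2 for v in y))
    total = 0.0
    for t in range(n):
        if 0 <= t + lag < n:
            total += (x[t] - mx) * (y[t + lag] - my)
    return total / (sx * sy)

def ccf(x, y):
    """Cross-correlations for lags from -(sqrt(N) + 10) to +(sqrt(N) + 10)."""
    max_lag = int(sqrt(len(x)) + 10)
    return {k: cross_correlation(x, y, k) for k in range(-max_lag, max_lag + 1)}

# Hypothetical series: three events in x that appear in y five observations later.
x = [0.0] * 60
for i in (10, 25, 40):
    x[i] = 1.0
y = [0.0] * 5 + x[:-5]

result = ccf(x, y)
best_lag = max(result, key=lambda k: abs(result[k]))
```

In this example the zero-lag correlation is close to zero while the correlation at a lag of five observations is close to one, so `best_lag` identifies the delay between the two parameters.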

Bivariate statistical analysis of data from the Fisher site (Chapter 7) can be used as an example
of the use of the cross-correlation function.  The correlation coefficients of zero order for each
pair of variables are given in Table 7.3. The zero order value of r = 0.663 for acid versus iron
was the highest correlation between any of the water quality parameters. The zero order
correlation coefficient for iron and manganese is r = 0.396, and this is the maximum value.  The
maximum correlation  coefficients and corresponding lag values from the cross-correlation
functions are summarized in Table 7.4.  Few are meaningful, and most are barely significant.
This indicates that the degree of association was correctly represented for these variables by their
conventional  zero order correlation coefficients (Table 7.3).

Time Series Analysis

There are two fundamental aspects to the time series analyses described in Chapters 4 through 8
and Appendix A:

1) use of a simple time series plot of the data for a particular water quality or flow variable,
   with or without quality control limits, to assist in evaluating patterns of variation through
   time (essentially an exploratory data analysis step), and
2) application of the full Box-Jenkins time series analysis and model building procedures (see
   Figure 9.2).

Figure 9.2:   Flow Chart for Box-Jenkins Time Series Analysis

   Identification:  determining a tentative model
        |
   Estimation:  finding parameter values for fit
        |
   Diagnostics:  determining if the model is adequate
        |  (yes)
   Forecasting:  predicting future values

Time series analysis begins with a plot of the observations against time (days or dates).  This plot
is simple to produce and can give helpful guidance on the type of time series that is represented
by the variation in the data. Furthermore, quality control limits may be added to the plot, based
either on some suitable multiple (2 or 3 times) of the standard deviation or, in this report, on a
non-parametric substitute for the standard deviation (e.g., confidence intervals around the median):

   $Md \pm 1.96 \left[ 1.25\,R / \left( 1.35 \sqrt{N'} \right) \right]$
   where R = the interquartile range (after McGill, R., et al., 1978) and N' = the sample size.

With this application, outliers plotting beyond the confidence limits are easily seen, and the
arrangement of the outliers may be either irregular (occurring as unique individuals) or
systematic (e.g., periodic). Examples are given in Figure(s) 8.3.
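
The non-parametric limits around the median can be sketched as follows. The function names, the quartile convention, and the sulfate values are assumptions for illustration; `n_prime` corresponds to N' in the formula above and defaults to 1, the case used for baseline comparisons later in this chapter.

```python
from math import sqrt
from statistics import median, quantiles

def median_control_limits(values, n_prime=1):
    """Md +/- 1.96 * 1.25 * R / (1.35 * sqrt(N')), with R the interquartile range."""
    md = median(values)
    # Assumption: "inclusive" quartiles; the report does not state a convention.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    half_width = 1.96 * 1.25 * (q3 - q1) / (1.35 * sqrt(n_prime))
    return md - half_width, md + half_width

def outliers(values, lower, upper):
    """Return (observation number, value) pairs falling outside the limits."""
    return [(i, v) for i, v in enumerate(values, start=1) if v < lower or v > upper]

# Hypothetical sulfate concentrations in mg/L, with one extreme value.
sulfate = [410, 395, 420, 405, 400, 415, 900, 398, 412, 408, 402]
low, high = median_control_limits(sulfate)
flagged = outliers(sulfate, low, high)
```

Because the limits are built from the median and the interquartile range, the single extreme value does not widen them, and it is the only observation flagged.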

When the plotting procedure is complete, analysis may continue using standard Box-Jenkins
time series modeling.  An exhaustive Box-Jenkins procedure may also be applied if
a suitable computer package is available.  The main advantage of the exhaustive Box-Jenkins
analysis is that very thorough testing may be performed as an automatic procedure at
each stage in the analysis.  The exhaustive procedure is described in many textbooks (e.g., Box
and Jenkins, 1970; Nelson, 1973; and Vandaele, 1983), and packages of computer programs
for pursuing the step-by-step analysis are readily available in many computer systems
(e.g., Dixon's BMDP Manuals (after 1980)).

A flow chart for Box-Jenkins time series analysis is provided in Figure 9.2. The first step is to
identify a tentative model and to improve on the model by iteration through the procedure, until
a more satisfactory model is found. The global model is called an ARIMA model or an
Autoregressive Integrated Moving Average Model. This family of models may be summarized
for convenience as an AR (autoregressive) or MA (moving average) model. A back-operator is
defined as $B z_t = z_{t-1}$, where $z_t$ is the set of observations taken at various equally-spaced
values of t (time).  An autoregressive model may be represented as AR (1,0,0) which stands for an
autoregressive model of order (1) with no differences (0) and no moving average terms (0); an
analogous series is the MA (0,0,1). This permits extensions to AR (2), ARI (2,1,0) etc. and
similarly for the MA models MA (2), IMA (0,1,2)  etc.  Seasonal models  may be included as, for
example, an ARIMA (1,1,1) (1,0,1), which represents a first order ARIMA model, together with
first order seasonal autoregressive and moving average terms (Box and Jenkins, 1970, p. 322).

The basis for identification of a suitable model is the autocorrelation function (Acf) and the
partial  autocorrelation function (Pacf) of the observations. It is assumed  that the series is
stationary (i.e., the observations are free of trend).  If a trend is present, it is typical to take first
differences of the observations and to analyze $z_t - z_{t-1}$ instead of $z_t$. In practice, it is rare
to require second differences, but they are available if needed. This is where the back-operator
($B z_t = z_{t-1}$) is useful and is why a differenced series is called integrated.  The form of the Acf and Pacf is
usually adequate to determine an appropriate model and one may then proceed to the estimation
stage.
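
The effect of a trend on the Acf, and its removal by first differencing, can be sketched as follows. The seeded random walk is a hypothetical stand-in for a trending series, and the function names are assumptions; the Pacf, which would also be examined in practice, is omitted from this short example.

```python
import random
from statistics import mean

def acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    m = mean(series)
    dev = [v - m for v in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(len(dev) - k)) / c0
        for k in range(1, max_lag + 1)
    ]

def first_difference(series):
    """Return z_t - z_{t-1} for t = 2 .. N."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical trending series: a seeded random walk of 200 observations.
random.seed(1)
walk, level = [], 0.0
for _ in range(200):
    level += random.gauss(0.0, 1.0)
    walk.append(level)

acf_raw = acf(walk, 15)
acf_diff = acf(first_difference(walk), 15)
```

The autocorrelations of the undifferenced series start near one and decline slowly, whereas those of the differenced series are small, which is the pattern that suggests taking first differences before selecting a model.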

The variables flow, acidity and acid-load from the Ernest Refuse Pile data were chosen as
examples of the use of Acf and Pacf in selecting a preliminary model.  Flow shows a steady,
almost straight line decline in Acf values over the first 15 lags, implying the presence of a strong
trend (Figure 6.8c).  This is confirmed by the corresponding Pacf which consists of a large
overwhelming spike at lag 1 (Figure 6.8d). It is advisable to take first differences to remove the
effect of the trend. After differencing, a first order MA (0,1,1) fits the series adequately.

The Acf for variation in log transformed acidity is entirely different in appearance, possesses at
least three significant peaks at lags  1, 2, 3, and is otherwise reasonably featureless (Figure 6.8e).
The corresponding Pacf shows only two spikes at lags 1 and 2 (Figure 6.8f). An MA (0,1,2)
model was tried and found to be over-identified (i.e., possessed too many coefficients).  For this
reason an MA (0,0,2) was fitted and found adequate.

When log transformed acid-load was examined, the Acf and Pacf were almost identical to their
equivalents for flow (compare Figures 6.8 c and d with 6.8 g and h). There is little doubt that
flow dominates the variation when the variable is converted from concentration to load using
flow as the multiplier.

After complete analysis using a variety of models, it was concluded that the first order MA
(0,1,1) was the most parsimonious and appropriate model for the Ernest site, and showed no
significant departures  from what was expected after stringent testing.  The  form of the equation
is: $z_t = a_t - 0.247\,a_{t-1}$, with the coefficient $\theta$ being from log acid load.
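
For a first order moving average of this form, the theoretical lag-1 autocorrelation is $-\theta / (1 + \theta^2)$ and all higher lags are zero. The sketch below evaluates this for the quoted coefficient and compares it with a simulated series; the function names, the seed, and the use of standard normal shocks are assumptions for illustration and do not reproduce the report's fitting procedure.

```python
import random

def ma1_lag1_autocorrelation(theta):
    """Theoretical lag-1 autocorrelation of z_t = a_t - theta * a_{t-1}."""
    return -theta / (1.0 + theta * theta)

def simulate_ma1(theta, n, seed=0):
    """Simulate n values of z_t = a_t - theta * a_{t-1} with N(0, 1) shocks."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [shocks[t] - theta * shocks[t - 1] for t in range(1, n + 1)]

def sample_lag1(series):
    """Sample lag-1 autocorrelation."""
    m = sum(series) / len(series)
    dev = [v - m for v in series]
    return sum(a * b for a, b in zip(dev, dev[1:])) / sum(d * d for d in dev)

theta = 0.247  # coefficient quoted in the text
rho1 = ma1_lag1_autocorrelation(theta)
z = simulate_ma1(theta, 5000)
r1 = sample_lag1(z)
```

A coefficient of 0.247 corresponds to a modest negative lag-1 autocorrelation (about -0.23), and the simulated value should fall close to it.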

The Markson site data presented in Chapter 8 and Appendix F provide the best example of the
full range of the Box-Jenkins time series analysis. The steps in the analytical procedure shown
in Figure 9.2 are followed using sulfate data because it was one of the few  parameters where a
seasonal component appeared to be present (although never finally identified).

Identification of a tentative  model was made through the Acf and Pacf of sulfate in Figures 8.4k
and 8.4l. The MA (0,1,1) was chosen as a starting model because the Pacf had a single large
spike (Figure 8.4l) and because this model was, in general, the most suitable for many other
parameters at different sites. It was then necessary to test the residuals (i.e., the deviations of
observed values from those of the fitted model). The Acf of the residuals yielded a chi-square of
41.05 with 23 degrees of freedom leading to a probability that a chi-square value as high as the
one observed arising from white noise equals 0.01

Table 9.2:     Acf of the Residuals from Fitting an MA Model to the Original Observations
              After Taking a First Difference: SO4

Lags 1-8          -0.07   0.04   0.13   0.02  -0.01  -0.12   0.00   0.01
Standard Error     0.06   0.06   0.06   0.06   0.06   0.06   0.07   0.07
Lags 9-16         -0.17  -0.03  -0.12   0.10  -0.07   0.01   0.05  -0.01
Standard Error     0.07   0.07   0.07   0.07   0.07   0.07   0.07   0.07

Mean = -1.972; Standard Error = 2.076; N = 252
* Spikes beyond the 16th lag unlikely to be real. Chi-square = 41.05; 0.01 2 Standard Error = 66.02)

Observation Numbers of Significant Residuals
< Expected:   33, 80, 104, 122
> Expected:   45, 87, 90, 103, 121

Continuing with model fitting and diagnostic testing, the next step is to examine the estimators
of the parameters.  For an IMA (0,1,1), there are two estimators: the coefficient of the noise term
$a_{t-1}$ ($\theta = 0.725$), and the overall residual standard deviation. Calculated 95% confidence limits
for $\theta$ are 0.639 and 0.811, clearly confirming that the coefficient is real because the interval does
not contain 0 or 1.

A number of potentially appropriate models were fitted to see if a suitable model could be found.
The results of model fitting for the Markson site data are summarized at the bottom of Table
8.8.  The best candidate was the IMA (0,1,1).  All other models had notable failures of one or
more diagnostic tests. This outcome implies that the first differences ($z_t - z_{t-1}$) of the original
observations ($z_t$) represent a random walk.  The seasonal effect appears to be too weak to show a
positive response.

With the exception of the final step (forecasting or predicting future values), the time series
examples from the Ernest and Markson sites discussed above provide a summary of the Box-Jenkins
procedures listed in Figure 9.2. The last step was attempted using all the data sets
presented in this report without great success. Results of this attempt using the Clarion site
sulfate data are presented in Chapter 5. The reasons for this, described below, are characteristic
of the six abandoned mine drainage data sets analyzed in this report.

Given the model, it is necessary to estimate the parameters for best fit.  Diagnostics are applied
to determine if the model is adequate and may also be used to compare different models to select
the most appropriate. Finally, predictions or forecasts  may be made of future values based on
the selected model. This last step was shown to be of little value because extreme events inflated
the confidence limits around the forecasts so much that the forecasts were not useful. There are
many alternative extensions of the analytical time series procedure that can be followed, but because of extreme
events, and because of difficulties with missing data and unequal values of t, it was considered
imprudent to pursue the analysis further.

Quality Control Limits

The main objective of this study was to perform a statistical analysis (i.e., univariate, bivariate,
and time series analyses) of numerous, long term abandoned mine drainage data sets in order to
provide the foundation for developing and implementing a  simple quality control approach for
routine baseline pollution load analyses for remining permits.  The six data sets included in
Chapters 4 through 8 and Appendixes  A through F of this report contain a greater number of
samples (N) for a longer duration (and in some cases a tighter sampling interval) than typical
remining permit baseline pollution load data sets. In addition, the statistical analyses in these
chapters are more rigorous and exhaustive (see Figure  3.1) than intended for routine use in
remining permits.  However, much was learned from the statistical analyses of these six data sets
(particularly the time series analyses) that can be applied to the use of quality control limits in
establishing baseline pollution load and monitoring variations in the pollution load.

Two examples of the many variables in the six data sets are illustrated in Figures 8.3.  Figure
8.3c indicates a variation in sulfate from the Markson site over a period of 253 days.  Variation
in total iron is shown over the same period in Figure 8.3d.  As a guide to this "long range"
variation, quality control limits are inserted in both graphs.  One set of those limits consists of
the conventional mean (X̄) and the range between plus and minus two standard deviations (±2
  yv
 

In Figure 8.3c for sulfate, the two standard deviation limits emphasize the nature of the variation.
Variation in sulfate starts out above the mean and above the upper confidence belt, but gradually
declines with time until beyond observation number 135. Beyond observation 135, variation
tends to remain within the confidence belts, and after the 230th observation, variation remains
around the lower confidence belts.  The quality control limits help to indicate this gradual
decline despite the wide variation.  The first 35 observations are persistently above the upper
quality control limit, implying that  some treatment of the discharge is necessary. Departures
such as the 80th and 105th observations, on the other hand, are isolated events and no action is
required.
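
The distinction drawn above between a persistent departure (which implies that treatment is necessary) and isolated events (which require no action) can be sketched as a search for runs of consecutive observations above the upper limit. The report gives no numerical rule for how long a run must be; the `min_run` parameter and the example values below are hypothetical.

```python
def exceedance_runs(values, upper_limit, min_run=3):
    """Split exceedances of `upper_limit` into persistent runs and isolated events.

    Returns two lists of (first, last) observation numbers, counted from 1.
    Assumption: a run of at least `min_run` consecutive exceedances counts
    as persistent; shorter runs are treated as isolated.
    """
    persistent, isolated = [], []
    start = None
    # A sentinel below any limit closes a run that reaches the end of the data.
    for i, v in enumerate(list(values) + [float("-inf")], start=1):
        if v > upper_limit:
            if start is None:
                start = i
        elif start is not None:
            run = (start, i - 1)
            if i - start >= min_run:
                persistent.append(run)
            else:
                isolated.append(run)
            start = None
    return persistent, isolated

# Hypothetical loads compared against an upper control limit of 7.
loads = [9, 9, 9, 9, 5, 4, 8, 4, 3, 3]
persistent, isolated = exceedance_runs(loads, upper_limit=7)
```

Here the first four observations form a persistent run above the limit, while observation 7 is an isolated exceedance.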

The same features appear in the graph of total iron (Figure 8.3d).  The earlier observations (to
about 80) are mostly above the mean and around the upper quality control level.  From 80
onwards, variation remains below the mean and is lowest beyond  the 230th observation. In both
graphs, there are some large gaps of missing observations.

In setting up baselines, and in subsequently using the baselines to judge the variation in any
particular parameter, the sample size is always one so that only the conventional spread of two
standard deviations and the equivalent spread measured by the interquartiles around the median
are  relevant.  In this case the relationship $Md \pm 1.96 \left[ 1.25\,R / \left( 1.35 \sqrt{N'} \right) \right]$ with N' = 1
reduces to Md ± (1.815R), since 1.96 × 1.25 / 1.35 ≈ 1.815; the calculations for sulfate and total
iron are outlined in Table 8.7.

These calculations are presented to show the orders of magnitude  of the different quality control
limits.  The rather large difference in the spreads around the mean and the median for ferrous
iron (Tables 8.6 and 8.7), is essentially due to the strong negative  skewness of the logs of ferrous
iron. This example clearly shows that the non-parametric spread around the median is more
suitable for these data. Little is lost if the distribution is symmetrical and much is gained if the
data are either positively or negatively skewed.

Conclusions

The main objective of this study was to perform a statistical analysis  (i.e., using univariate,
bivariate, and time series approaches) of numerous, long  term abandoned mine drainage data sets
in order to provide the foundation for developing and implementing a simple quality control
approach for routine baseline pollution load analyses for  remining permits.

Sample Collection

Establishment of baseline pollution loads for a coal remining permit requires proper sampling
and chemical analysis of pre-existing abandoned mine discharges, and the appropriate statistical
analysis of flow, water quality, and pollution load data.
•   The term proper sampling means the collection of a sufficient number of samples for a
    duration and at approximately constant intervals that  adequately represent the variations in
    flow and water quality throughout the water year.
•   Sampling should be representative, cover a period of at least one year, and include both high
    and low flow periods within that year.
•   One recommendation for representative sample collection within the Appalachian Basin
    would be to use stratified sampling; divide the year into three periods of about equal length,
    arranged to cover high and low flow periods.

Discharge Variability

These pre-existing discharges frequently exhibit significant variations in flow and water quality,
and logistical problems may be encountered in attempting to capture the full range and
distribution of seasonal variations.  There are two types of variation in pollution load that are of
interest in evaluating monitoring data during and after remining to determine whether the
variations are out of control compared to the established baseline conditions.
•   The first and most obvious pattern of variation occurs when there is a series of extreme
    events, which consistently exceed the upper control level. This variation pattern indicates a
    sudden and dramatic increase in pollution load which may be attributed to remining, and
    which is referred to as the dramatic trigger.
•   The second pattern of variation of concern is a trend of gradually increasing pollution load,
    where the general pattern of pollution load observations is increasing above the baseline
    central tendency value over time without exceeding the upper control level. As this second
    pattern of variation is much less dramatic than the first, and takes much more time and effort
    to detect, it is referred to as the subtle trigger.  The reason that these two patterns  of variation
    are referred to as triggers is that they can be used to initiate  the requirement for a  mine
    operator to treat a pre-existing discharge to a numeric effluent limit. If fair and reasonable
    consideration is given to the concerns of the mine operator and protection of the
    environment, the treatment triggers must be carefully established so that they are:  (a) not set
    off prematurely or erroneously, adversely affecting the mine operator, or (b) set off too late
    resulting in additional mine drainage pollution without treatment.

Data Set - Initial Evaluation

The baseline pollution  load is essentially a statistical summary of a data set generally consisting
of 12 or more samples  collected prior to issuance of a remining permit.  In routine sampling for
remining permits, adjustments must be made in data organization and  analysis to account for
missing data, unequal sampling intervals, and data that are not normally distributed or that lack
expression of the true data extremes.
•   It is always advisable to examine raw data before submitting it to statistical analysis.  The
    presence of unusual values and missing data usually requires some  kind of action.  A graph of
    concentration versus time or discharge or log discharge in gallons  per minute versus days can
    be very helpful in identifying data gaps and unusual values. Missing values frequently occur
    during extreme events because during these events, sample  sites are difficult to access.
•   Another kind of preliminary evaluation is to examine the data for extreme values  (usually on
    the high side). The  wide spreads of extreme events, together with missing data (which often
    occur during extreme events) may make it very difficult to detect any expected true seasonal
    effects.

Univariate Analysis

•   The main features of the univariate statistical analyses are the frequency distributions of the
    water quality parameters and flow measurement data, and the tables of summary statistics
    (e.g., Tables 4.1, 5.1, 6.2, 7.2 and 8.2).
•   The frequency distribution is a graphical summary of the sample data.  Its shape and
    accompanying summary statistics enable a greater understanding of how a parameter
    behaves.  The normal distribution (shown in Figure 2.2) is the most widely known and most
    useful frequency distribution.  It is also known as the bell-shaped curve.
•   A major problem that is frequently encountered in the statistical analysis of water quality
    parameters is that the sample data are not normally distributed because it is typical to have
    many small valued observations in the data set and a few very large values representing
    extreme events.  Extremes tend to introduce strong skewness (asymmetry) into the frequency
    distribution. This skewness is usually positive (i.e., extreme values are at the high end of the
    data distribution). It is conventional to apply a transformation, commonly  logarithmic, to
    reduce this skewness (See Figure 5.3a).  However, it is prudent to examine the raw data very
    carefully to decide whether data transformation is appropriate.
•   The frequency distributions of many water quality and pollution load variables (Chapters 4
    through 8) were found to be normally distributed, or at least symmetrically distributed,
    around a value of central tendency (see for example, Figures 4.5 and 8.1e). Numerous other
    variables had frequency distributions that exhibited positive skewness.
•   An additional univariate statistic, the coefficient of variation (CV) is included in the Tables
    in Chapters 4-8.  The coefficient of variation, usually expressed in percent (CV%), is
    defined as the ratio of the standard deviation to the mean multiplied by 100. This is a useful
    approximate guide to the degree of variation in a parameter. In general, a CV<30%
    represents a stable, in control variable. In Chapters 4 through 8, most of the parameters
    showed much larger variation, principally because of the effects of extreme events. Use of
    the coefficient of variation with log transformed data may result in extreme distortion
    because the transformation leads to a mean of small value, resulting in a divisor of the ratio
    that is small and thus a CV that is inflated.
•   One additional parameter of interest is the number of days between sampling events. This
    should be approximately constant, because any outlying results could distort relationships
    between other parameters.
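
The effect of log transformation on the coefficient of variation can be illustrated with a short calculation. The following is a minimal sketch only: the function name cv_percent and the six sample values are hypothetical and are not taken from the report's data sets. It shows that when the transformed values have a mean close to zero, the divisor of the ratio is small and the CV% becomes very large.

```python
import math
import statistics


def cv_percent(values):
    """Coefficient of variation in percent: 100 * standard deviation / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical concentrations close to 1 (not taken from the report).
raw = [0.8, 1.1, 0.9, 1.3, 1.0, 1.2]
logged = [math.log10(v) for v in raw]

cv_raw = cv_percent(raw)     # modest spread relative to a mean near 1
cv_log = cv_percent(logged)  # mean of the logs is near 0, so the ratio is inflated

print(f"CV% of raw values:   {cv_raw:.1f}")
print(f"CV% of log10 values: {cv_log:.1f}")
```

In this sketch the raw values fall under the CV<30% guide while the log-transformed values do not, even though the underlying variation is the same.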

Bivariate Analysis

Bivariate analysis is used to examine the relationship between pairs of variables.
•   The correlation coefficient (r) is usually used to represent the (linear) relationship between
    any pair of variables. The coefficient of determination (r2) is, however, a better measure of
    the intensity of the association between a pair of variables. For example, r = 0.7 looks large
    because the range of r is from -1 to +1, but it means that r2 = 0.49 or 49%  of the variation is
    common to  the two variables and therefore, 51% of the variation is "unexplained" by the
    association. It is necessary, therefore, to realize that one needs r > 0.8 to claim that  a strong
    association  exists; i.e., > 64% in common.


•   Generally, the correlations between concentration parameters were not strong, except for
    those that are known to be related (e.g., pH and acidity, total and ferrous iron).
•   Bivariate analysis of some data sets (e.g., Ernest site data, Chapter 6) showed a strong
    association between all pairs of the load variables (r2>80%, see Figures 6.5a, b and c). This
    clearly suggests that because discharge is the common factor in converting concentration to
    load, it tends to overwhelm the relationships among the other variables.
•   Heteroscedasticity occurs when the variation between the parameters increases as their
    values increase (see Figures 6.5a and 6.5c). In general, to correct for heteroscedasticity, it is
    advisable to plot the logs of the values to make them homoscedastic and to calculate
    correlations using log-transformed values.
•   Cross-correlation analysis is performed to determine whether there is any lag in correlations
    between pairs of variables; i.e., to see if a relationship that is weak at zero lag is stronger at
    greater lags.  This observation could result from a delayed effect, where one variable does
    not associate with another variable immediately, but only after a specific lag or period of
    time. For example, in a small watershed, where base flow is dominated by several large
    abandoned deep mine discharges, the peak of concentrations and pollution loads of acidity,
    iron and other parameters may occur several days or weeks following the peak of streamflow,
    due to the residence time in the groundwater system.
•   The cross-correlation function (CCF) calculates the linear association between a pair of
    variables at lags of 0 to t days and so gives a picture of when the association is strongest.
    For the cross-correlation functions used in the bivariate and time series analyses in this
    report, r values of 0.2 or the more conservative r = 0.3 have been selected as critical values.
    This selection implies that r values less than these critical values are not significantly
    different from 0, and therefore can be dropped from consideration. Even if a lag correlation is significantly greater than 0, the
    relationship may still be weak (low r2). In most of the examples presented in this report,
    there did not appear to be any very significant lag in the effects.
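
The relationship between r, r2 and lag can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the report: the series are hypothetical and equally spaced, the helper names pearson_r and lagged_r are invented for this example, and each lag is evaluated as an ordinary Pearson correlation on the overlapping pairs rather than with the cross-correlation estimator of a statistical package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_r(x, y, lag):
    """Correlation between x[t] and y[t + lag] for lag >= 0."""
    if lag == 0:
        return pearson_r(x, y)
    return pearson_r(x[:-lag], y[lag:])


# Hypothetical, equally spaced series: the load echoes the flow two steps later.
flow = [3.0, 9.0, 4.0, 2.0, 8.0, 3.0, 1.0, 7.0, 2.0, 5.0, 6.0, 2.0]
load = [5.0, 4.0] + [2.0 * f + 1.0 for f in flow[:-2]]

r_by_lag = {lag: lagged_r(flow, load, lag) for lag in range(4)}
best_lag = max(r_by_lag, key=lambda lag: abs(r_by_lag[lag]))

for lag, r in r_by_lag.items():
    # r2 is the share of variation common to the two variables at this lag
    print(f"lag {lag}: r = {r:+.2f}, r2 = {r * r:.2f}")
print("strongest association at lag", best_lag)
```

Squaring each r before judging it follows the guidance above: r = 0.7 corresponds to only 49% of the variation in common.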

Time Series

Two fundamental aspects to the time series analyses (described in Chapters 4 through 8 and
Appendix A) are: (1) the use of a simple time series plot of the data for a particular water quality
or flow variable, with or without quality control limits, to assist in evaluating patterns of
variation through time (essentially an exploratory data analysis step), and (2) the application of
the full Box-Jenkins time series analysis and model building  procedures (see Figure 9.2).
•   Time series analysis begins with a plot of the observations against time (days or dates). This
    plot is a  simple outcome and can give helpful guidance to the type of time series that is
    represented by the variation in the data. With this graph, outliers plotting beyond the
    confidence limits are easily seen, and the arrangement of the outliers may be either irregular
    (occurring as unique individuals) or systematic (e.g., periodic).
•   The first step of Box-Jenkins time series analysis is to identify a tentative model and to
    improve on the model by iteration through the procedure, until a more satisfactory model is
    found. The basis for identification of a suitable model is the autocorrelation function (Acf)
    and the partial autocorrelation function (Pacf) of the observations. The form of the Acf and
    Pacf is usually adequate to determine an appropriate model and one may then proceed to the
    estimation stage.


•  Given the model, it is necessary to estimate the parameters for best fit. Diagnostics are
   applied to determine if the model is adequate and may also be used to compare different
   models to select the most appropriate. In order to fit a model with reasonably reliable
   estimates, there should be at least 2-3 years of data collected at even time intervals (e.g.,
   either weekly or monthly).
•  The last step of Box-Jenkins time series analysis is to make predictions or forecasts of future
   values based on the selected model. This last step was shown to be of little value because
   extreme events inflated the confidence limits around the forecasts to the point that the
   forecasts were not useful.
•  There are many alternative extensions of the analytical time series procedure that could have
   been followed, but because of extreme events, and because of difficulties with missing data
   and unequal values of t (intervals between collection times), it was considered imprudent, for
   the purposes of this report, to pursue  the analysis further.
•  Most of the variables show the presence of a trend over time (pH, flow, acidity, acid load,
   iron load, ferrous iron). These variables need a first difference to remove the effects of the
   trend. It seems evident from the studies to date that a moving average model applied to the
   first differences is almost universally the best choice. In some cases, the autoregressive
   model, possibly with a first difference, is also appropriate. In both cases, there is an indicator
   that the variation in whichever parameter is being analyzed, when first differenced, leads to a
   random walk (the parameter is equally likely to move in one direction as the other, i.e., there
   is no trend).
•  It is somewhat surprising that there appears to be no seasonal component in the time series
   models, particularly in the load variables. The only satisfactory explanation appears to be the
   existence of too many maxima at too many different times with very little repetition during
   the same time period.
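
The identification step described above (inspect the Acf, take a first difference to remove a trend, inspect the Acf again) can be sketched in a few lines. The series below is hypothetical (a linear trend plus fixed noise values chosen for illustration), and the helper names difference and acf are assumptions of this example rather than functions from the software used in the report.

```python
def difference(series):
    """First difference: z[t] - z[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def acf(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    num = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    return num / denom


# Hypothetical equally spaced observations: linear trend plus fixed noise.
noise = [0.3, -0.8, 0.5, 1.1, -0.4, -1.2, 0.9, 0.2,
         -0.6, 0.7, -0.1, -0.9, 1.0, 0.4, -0.5, 0.6]
series = [10.0 + 0.5 * t + e for t, e in enumerate(noise)]
diffed = difference(series)

for lag in (1, 2, 3):
    print(f"lag {lag}: Acf raw = {acf(series, lag):+.2f}, "
          f"Acf first difference = {acf(diffed, lag):+.2f}")
```

A slowly decaying positive Acf in the raw series suggests a trend, while a single negative spike at lag 1 after differencing is the pattern usually associated with a moving average model applied to the first differences.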

Quality Control

There are many methods for defining quality control limits and there are arguments for and
against all of them. Throughout this report the conventional quality control limits based upon
the mean and standard deviation of the normal frequency distribution are compared to another
set of non-parametric quality control limits based upon the median and other order statistics
(e.g., quartiles, H-spreads, C-spreads), which may be more applicable to mine drainage data that
frequently do not follow a normal distribution.
•  The quality control analyses suggest that either the mean (plus or minus two standard
   deviations) or the non-parametric median (plus or minus a function of the H-spread) are
   equally appropriate. For the present, it is recommended that both be used until one or the
   other shows superior performance.
•  The quality control approach used in this report, and in much statistical work in general, is
   dependent upon the frequency distribution of the sample data. As 95.46% of the area of the
   normal frequency distribution is contained in the interval of the mean plus or minus two
   standard deviations, it is expected that approximately 95 out of 100 observations will occur
   within this interval. In the normal frequency distribution, the values are symmetrically
   distributed around the mean, and the mean and standard deviation are the best statistical
   estimators of the population. In a highly skewed frequency distribution, the mean may not be
   the best estimator of central tendency, and the standard deviation may not be the best
   measure of dispersion.
•  Quality control limits can be set to compare to a specific number of remining results by
   setting a specific value of N' for the equations defined in the chapters.  These limits can be
   used as a subtle trigger for a mean or median, depending on the distribution or data.  A quick
   trigger can also be set in the same manner by setting N'=1.  For example, if one measurement
   is to be taken per month for a remining year, N'= 12 can be used (equation, page 3-9) to set a
   subtle trigger for the baseline median.
•  The quality control approach should provide adjustments so that the number of monitoring
   samples (N) and the number of baseline samples (N) can be set to be equal when comparing
   these time periods (i.e., monitoring N=12 should be compared to a baseline N=12 even if the
   baseline contains 36 or more samples from several water years).
•  Since intervals based on the median and interquartile range are non-parametric, the data do
   not have to be transformed to approximate a normal distribution. However, it is still
   recommended that the data be graphed, evaluated, and transformed if transformation would
   improve the distribution. An improved distribution would lead to improved statistical control
   and a tighter estimate of the confidence belts around the median.
•  The analyses presented in this report were conducted using long-term data sets with frequent
   samples. It would be impractical to expect this type of analysis for a remining operation.
   Although large data sets are preferable, the practical alternative is to employ a simple quality
   control approach that allows the use of data sets that are typically compiled for remining
   permits.
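
The two kinds of quality control limits compared above can be sketched as follows. This is a hedged illustration: the report's own equation (page 3-9) is not reproduced in this chapter, so the non-parametric limits below use the notch formula of McGill, Tukey and Larsen (1978), median plus or minus 1.58 times the H-spread divided by the square root of N', as an assumed stand-in. The constant 1.58, the use of interpolated quartiles in place of hinges, and the function names are assumptions of this example. The twelve values are the first twelve acidity results listed for Hamilton 01 in Appendix A.

```python
import math
import statistics


def parametric_limits(values, k=2.0):
    """Mean plus or minus k standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return mean - k * sd, mean + k * sd


def median_limits(values, n_prime, c=1.58):
    """Median plus or minus c * H-spread / sqrt(N').

    Assumption: c = 1.58 follows McGill et al. (1978); the H-spread is
    approximated by the interquartile range from interpolated quartiles.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = c * (q3 - q1) / math.sqrt(n_prime)
    return median - half_width, median + half_width


# First twelve acidity values listed for Hamilton 01 in Appendix A.
acidity = [337.0, 360.0, 305.0, 400.0, 294.0, 307.0,
           305.0, 300.0, 539.0, 195.0, 174.0, 184.0]

mean_lo, mean_hi = parametric_limits(acidity)
subtle_lo, subtle_hi = median_limits(acidity, n_prime=12)  # subtle trigger
quick_lo, quick_hi = median_limits(acidity, n_prime=1)     # quick trigger

print(f"mean +/- 2 SD:          {mean_lo:.1f} to {mean_hi:.1f}")
print(f"median limits (N'=12): {subtle_lo:.1f} to {subtle_hi:.1f}")
print(f"median limits (N'=1):  {quick_lo:.1f} to {quick_hi:.1f}")
```

Setting N'=1 widens the band by a factor equal to the square root of 12 relative to N'=12, which is why a single observation must be far from the median to trip the quick trigger.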

                                     References

Chapter 1

NONE

Chapter 2

Aitchison, J., and J.A.C. Brown, (1973).  The Lognormal Distribution. London:  Cambridge
       University Press. 176 p.

Box, G.E.P. and Cox, D.R., (1964). An Analysis of Transformations, Journal Royal Statistical
       Society, B 26, 211  p.

Box, G.E.P. and Tiao, G.C., (1975).  Intervention Analysis with Applications to Economic and
       Environmental Problems. Journal of the American Statistical Association, 70, 349
       (March), 70-79.

Duffield, G.M., (1985) Intervention Analysis Applied to the Quantity and Quality of Drainage
       from an Abandoned Underground Coal Mine in North-Central Pennsylvania. Masters of
       Science Thesis, Pennsylvania State University, Dept. of Geology.

Fisher, R.A., (1970) Statistical Methods for Research Workers, 14th Edition, Oliver & Boyd
       Limited, London, (1st Edition, 1925), 354 p

Fisher, R.A., (1973) Statistical Methods and Scientific Inference, New York: Hafner Press, p
       180.

Griffiths, J. C., (1967) Scientific Method in Analysis of Sediments.  New York:  McGraw Hill
       Book Co. 508 p

Krumbein, W.C., and F.A. Graybill, (1965) An Introduction to Statistical Models in Geology.
       New York:  McGraw Hill Book Company, Inc. 475 p.

Reimann, C. and P. Filzmoser. 2000. Normal and Lognormal Data Distribution in
       Geochemistry: Death of a Myth.  Consequences for the Statistical Treatment of
       Geochemical and Environmental Data. Environmental Geology. 39(9) July, 2000. 14pp.

Shewhart, W.A., (1931) Economic Control of Quality of Manufactured Product, New York: D.
       Van Nostrand Company, Inc., 501 p.

Shewhart, W.A., (1939) Statistical Method from the Viewpoint of Quality Control, Washington:
       U.S. Department of Agriculture, 135 p.

Tiao, G.C., G.E.P. Box, and W.J. Hamming, (1973) Analysis of Los Angeles Photochemical
       Smog Data: A Statistical Overview, Technical Report No. 331, Department of Statistics,
       Madison: University of Wisconsin, 15 p.

Tukey, J.W., (1977) Exploratory Data Analysis. Reading, Massachusetts:  Addison Wesley
       Publishing Company.

Vandaele, W., (1983) Applied Time Series and Box-Jenkins Models, Orlando, Florida: Academic
       Press, Inc., 417 p.

Chapter 3

Tukey, J.W., (1977) Exploratory Data Analysis. Reading, Massachusetts:  Addison Wesley
       Publishing Company.

Velleman, P.F., and D.C. Hoaglin, (1981) Applications, Basics, and Computing of Exploratory
       Data Analysis.  Boston, MA: Duxbury Press.

Chapter 4

Cleveland, W.S., (1979) Robust Locally Weighted Regression and Smoothing Scatterplots.
       Journal American Statistical Association, Volume 74, p. 829 -836

Damsleth, E., (1986) Modeling River Acidity- A Transfer Function Approach, Oslo, Norway:
       Norwegian Computing Center,  p. 52

Duffield, G.M., (1985) Intervention Analysis Applied to the Quantity and Quality of Drainage
       from an Abandoned Underground Coal Mine in North-Central Pennsylvania. Masters of
       Science Thesis, Pennsylvania State University, Dept. of Geology.

Hornberger, R.J., M.W. Smith, A.E. Friedrich, and H.L. Lovell, (1990)  Acid Mine Drainage
       from Active and Abandoned Coal Mines in Pennsylvania.  Chapter 32 in Water Resources
       in Pennsylvania: Availability, Quality and Management. Edited by S.K. Majumdar, E.W.
       Miller and R.R. Parizek, Easton:  The Pennsylvania Academy of Science, pp. 432 - 451.

Smith, M.W., (1988)  Establishing Baseline Pollution Load from Pre-existing Pollutional
       Discharges for Remining in Pennsylvania. Paper presented at the 1988 Mine Drainage
       and Surface Mine Reclamation Conference, Pittsburgh, PA, April 17-22, pp. 311 - 318.

Velleman, P.F., and D.C. Hoaglin, (1981) Applications, Basics, and Computing of Exploratory
       Data Analysis.  Boston, MA: Duxbury Press.

Chapter 5

Lusardi, P.J., and P.M. Erickson, (1985) Assessment and Reclamation of Abandoned Acid-
       Producing Strip Mine in Northern Clarion County, Pennsylvania. Proceedings Symposium
       on Surface Mining Hydrology, Sedimentology, and Reclamation. Lexington, KY

Chapter 6

NONE

Chapter 7

Arkin, H. and R.R. Colton, (1963) Tables for Statisticians, New York: Barnes and Noble Books,
       2nd Edition

Griffiths, J. C., (1967) Scientific Method in Analysis of Sediments.  New York:  McGraw Hill
       Book Co. 508 p.

Griffiths, J.C., (1987) Report No. 6:  Analysis of Data from the Fisher Deep Mine. Prepared for
       the U.S. Environmental Protection Agency, Office of Water, Washington DC, December
       1987.

Plowman, W., (1989) New Light on an Old Problem, Game News (Pennsylvania Game
       Commission), May 1989, p. 10-15

Smith, M.W. and C.H. Dodge, (1995) Coal Geology and Remining, Little Pine Creek Coal Field,
       Northwestern Lycoming County, Guidebook for Pennsylvania Geologic Survey Annual
       Field Meeting

Chapter 8

Barnes, I, W.T. Stuart, and D.W. Fisher, (1964)  Field Investigation of Mine Waters in
       the Northern Anthracite Field, Pennsylvania. U.S. Geological Survey, Professional Paper
       473-B, U.S. Government Printing Office, Washington, DC

Brady, K.B.C., R.J. Hornberger, and G. Fleeger, (1998) Influence of Geology on Postmining
       Water Quality: Northern Appalachian Basin.  Chapter 8 in Coal Mine Drainage
       Prediction and Pollution Prevention in Pennsylvania.  Edited by K.B.C. Brady, M.W.
       Smith and J.  Schueck, Department of Environmental Protection, Harrisburg, PA.  pp. 8-1
       to 8-92

Ladwig, K.J., P.M. Erickson, R.L.P. Kleinmann and E.T. Posluszny, (1984) Stratification in
       Water Quality in Inundated Anthracite Mines, Eastern Pennsylvania, U.S. Bureau of
       Mines Report of Investigation No. 8837, 35 p.

Hornberger, R.J., M.W. Smith, A.E. Friedrich, and H.L. Lovell, (1990) Acid Mine Drainage from
       Active and Abandoned Coal Mines in Pennsylvania. Chapter 32 in Water Resources in
       Pennsylvania: Availability, Quality and Management. Edited by S.K. Majumdar, E.W.
       Miller and R.R. Parizek, Easton:  The Pennsylvania Academy of Science, pp. 432 - 451.

Smith, M.W., (1988) Establishing Baseline Pollution Load from Pre-existing Pollutional
       Discharges for Remining in Pennsylvania. Paper presented at the 1988 Mine Drainage
       and Surface Mine Reclamation Conference, Pittsburgh, PA, April 17-22, pp. 311 - 318.

Chapter 9

Box, G.E.P., and Jenkins, G.M., (1970) Time Series Analysis, Forecasting and Control, Holden-
       Day Inc., S.F., 553 p.

Griffiths, J. C., (1967) Scientific Method in Analysis of Sediments. New York: McGraw Hill
       Book Co. 508 p

McGill, R., Tukey, J.W., and Larsen, W.A., (1978) Variations of Box Plots, The American
       Statistician, Volume 32, No. 1, p. 16.

Minitab Reference Manual, (1986) 266 p.

Nelson, C.R., (1973) Applied Time Series Analysis for Managerial Forecasting, San Francisco,
       California: Holden-Day Inc., 231 p.

Ryan, B.F., Joiner, B.L., and Ryan, T.A.  Jr., (1985) Minitab Handbook, 2nd Edition, Duxbury
       Press, Boston, 379 p.

Tukey, J.W., (1977) Exploratory Data Analysis. Reading, Massachusetts: Addison Wesley
       Publishing Company.

Vandaele, W., (1983) Applied Time Series and Box-Jenkins Models, Orlando, Florida: Academic
       Press, Inc., 417 p.

Velleman, P.F., and Hoaglin, D.C., (1981) Applications, Basics, and Computing of Exploratory
       Data Analysis, Duxbury Press, Boston, 354 p.

Appendix A:      Hamilton Discharge Data

The Hamilton site is a permitted remining site located in Clearfield County, Pennsylvania, as
shown in Figure A.1. Several years of background (pre-mining) baseline data existed for two
abandoned mine discharges on the site (Hamilton 01 and Hamilton 08). This site was selected to
be the initial data set statistically analyzed by Dr. J.C. Griffiths during February to April of 1987.
The first two reports of the eight report series of statistical analyses completed by Dr. Griffiths in
1987 and 1988 were on the Hamilton site. Report No. 1 was a preliminary evaluation of the
MINITAB¹ software package for the analysis of remining data, performed on the Hamilton 01
and 08 data files. Report No. 2 was an evaluation of the usefulness of MINITAB in conducting a
time series analysis, including Box-Jenkins procedures,  of the Hamilton 08 discharge data set.
Since these first two reports were preliminary or exploratory in nature, they did not evaluate
the various steps of the data analysis algorithm (see Figure 3.1) as thoroughly as the
succeeding reports (Report Nos. 3-8). These succeeding reports are the subject of Chapters
4 through 8 of this report (Report No. 8 of the original Griffiths report was a synopsis of Report
Nos. 1 to 7). However, some items of interest not found in the other reports were expressed in
the Hamilton site reports, and the data set is a good example of remining permit data. Thus, it
was determined that the elements of these two reports (although somewhat sketchy in places)
and the data sets would be presented in this Appendix.

The Hamilton 01 data had problems (missing data and the presence of a few exceptionally high
values) similar to the other data sets described in Chapters 4 through 8. For high values of
manganese and sulfate, for example, it was noted that one must decide whether to keep the
values or to reject them as outliers, on the assumption that they are data recording errors and
therefore not meaningful. Examination of each example, case by case, is recommended to make an
appropriate decision. Logarithmic transformation of some variables was attempted, but
introduced negative skewness in the sulfate data. Ultimately, it was determined that the sulfate
data appeared to be acceptable without transformation.

Some univariate and bivariate analyses were conducted  on the Hamilton 08 discharge data. It
was found that there was no obvious relationship between flow and acidity. There also was no
apparent relationship between acidity and sulfate.  There seems to be a weak inverse relationship
between manganese and  flow (flow increases as manganese decreases). Simple time series plots
of acidity, iron, manganese, and sulfate data from the Hamilton 08 discharge were also
performed, and some obvious cycles were observed.
¹ MINITAB is a commercial software package from Minitab, Inc. ©1986, 3081 Enterprise Drive,
State College, PA 16801.


Figure A.1:    Map of Hamilton Site (enlarged portion of the Houtzdale & Sandy Ridge USGS 7.5 Minute maps; scale in feet)

Following application of stem and leaf plots, box plots, scatterplots and time series plots, it was
determined that Cross-Correlation functions, Rootogram functions, and the Box-Jenkins
procedures in the software package should be applied.  It was also concluded that many
additional analytical tools could be used including analysis of variance, t-tests, Chi-Square tests,
and regression. It is necessary to emphasize that while these tests are easy to apply, both their
applicability and the interpretation of the results may be very  demanding.

The objective of Griffiths Report No. 2 was an attempt to fit a model to the Hamilton 08
discharge data, and preferably, to find  a single simple model that would provide a reasonably
close fit. It is desirable to find a single model, if feasible, for  all five variables.  The Box-Jenkins
time series analysis procedure was used for this purpose (Box and Jenkins, 1970).  This
procedure consists of a convenient package  of computer programs that embrace the entire
modeling process.  A wide variety of models, collectively known as the ARIMA models, is
available in this package. Use of this sophisticated procedure requires that the data be  collected
at equal time intervals. This requirement was only partially fulfilled by the Hamilton 08 data.
Therefore, application of the resulting  model(s) should be limited.
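
Whether a data set meets the equal-interval requirement can be checked directly from the sampling days. The sketch below uses the first twenty Days values listed for Hamilton 01 in this appendix purely as an illustration (the paragraph above concerns Hamilton 08, whose listing is not reproduced here). The 7-day tolerance is a hypothetical choice, not a value taken from the report.

```python
import statistics

# First twenty sampling days listed for Hamilton 01 in this appendix.
days = [0, 34, 73, 119, 161, 202, 216, 231, 244, 257,
        271, 286, 299, 313, 329, 342, 356, 369, 383, 386]

# Number of days between consecutive sampling events.
intervals = [b - a for a, b in zip(days, days[1:])]
typical = statistics.median(intervals)

TOLERANCE_DAYS = 7  # hypothetical threshold for calling an interval irregular
irregular = [(days[i + 1], gap) for i, gap in enumerate(intervals)
             if abs(gap - typical) > TOLERANCE_DAYS]

print("intervals:", intervals)
print("median interval:", typical)
print("irregular (day, interval):", irregular)
```

For these twenty samples the early record is roughly monthly and the later record roughly biweekly, which illustrates how a requirement for equal intervals can be only partially fulfilled.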

Eventually, when the model meets the  demands of the criteria, it may be used to forecast future
values of the variable, accompanied by an appropriate estimate of the confidence limits at a
selected probability level.  Any new observations may be added to the chosen model and the fit
examined for acceptance or rejection.  These data should be taken at the same time intervals as
the original series (i.e., if the original observations are taken at two week intervals the new
observations should also be taken at two week intervals). The number of samples need not be
extensive; six to twelve would be acceptable.

If second differences of the flow data set are taken, the Acf and Pacf show many large  spikes
suggesting that the series has been overdifferenced. It therefore seems evident that an MA
(0,1,1) model may be most suitable. The first check criterion is a measure of correlation among
the parameters. Since, in this case, there is only one parameter, this does not apply. The second
criterion is the Acf of the residuals; if the model "fits" well, all systematic variation has been
removed and the remainder is random  (equals white noise). There are two tests at this  stage: the
first is an overall Portmanteau test (Box-Pierce-Ljung Statistic) of all autocorrelations taken
together. For this case the result is χ² = 17.95 with 29 degrees of freedom. It is not
significantly greater than that expected from white noise, hence it is feasible to consider that
these residuals represent random variation and,  on the basis of this  criterion, there is no evidence
to reject the model.

The second test is to examine the individual autocorrelations against twice their standard errors.
Since none exceed this value there is no evidence to require further refinement in the model.
When first differences of the residuals  are taken, the Portmanteau test yields a highly significant
value, implying overdifferencing.  The Pacf of the residuals confirms this diagnosis.
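
The two residual checks described above can be sketched as follows. The residual values are hypothetical, and the helper names are assumptions of this example. The Ljung-Box form of the Portmanteau statistic is used, the standard error of each autocorrelation is approximated by 1 over the square root of n (its value under white noise), and the 5% critical value of chi-square with 29 degrees of freedom (42.557) is entered as a constant from standard tables.

```python
import math


def residual_acf(resid, max_lag):
    """Sample autocorrelations of the residuals at lags 1..max_lag."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [v - mean for v in resid]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


def ljung_box(resid, max_lag):
    """Ljung-Box statistic: n(n+2) * sum of r_k^2 / (n - k)."""
    n = len(resid)
    r = residual_acf(resid, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))


# Hypothetical residuals from a fitted model (not taken from the report).
residuals = [0.3, -0.8, 0.5, 1.1, -0.4, -1.2, 0.9, 0.2,
             -0.6, 0.7, -0.1, -0.9, 1.0, 0.4, -0.5, 0.6]
MAX_LAG = 5

r = residual_acf(residuals, MAX_LAG)
q = ljung_box(residuals, MAX_LAG)

# Second test: individual autocorrelations against twice their standard error.
two_se = 2.0 / math.sqrt(len(residuals))
spikes = [k for k, rk in enumerate(r, start=1) if abs(rk) > two_se]

# First test applied to the value reported in the text.
REPORTED_Q = 17.95
CHI2_CRIT_29DF_5PCT = 42.557  # tabulated upper 5% point, 29 degrees of freedom

print(f"Q for the hypothetical residuals: {q:.2f}")
print("lags exceeding two standard errors:", spikes)
print("reported statistic below critical value:", REPORTED_Q < CHI2_CRIT_29DF_5PCT)
```

Because 17.95 is well below 42.557, the reported result is consistent with the conclusion above that the residuals cannot be distinguished from white noise.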

A number of alternative models were fitted to log-transformed flow data and the results are
summarized in Table A.1. Both the AR (1,0,0) and MA (0,1,1) models fit equally well.  The
coefficient (φ) in the AR model is approximately equivalent to the first difference in the MA
model. Attempts to improve on these simple models by using additional coefficients, seasonal
and otherwise, failed to provide any substantial improvement. Thus, it was decided to select one
of the simpler models.

Table A.1:    Alternate Models Fitted to Log Flow Data

                           Residual                                              Portmanteau  Residual
                           Sum of                                        Acf     Chi-Square   Standard
No.  Model                 Squares   Coefficients                        spikes  statistic    Deviation
1    MA (0,1,1)            30.73     *                                   None    17.95        0.519
2    AR (1,0,0)            28.57     *                                   None    27.58        0.503
3    AR (1,1,1)            30.34     θ not significantly different       None    17.89        0.520
                                     from 0
4    AR (2,0,0)            28.57     φ1, φ2 significantly correlated     None    27.58        0.505
5    AR (1,1,0)            30.83     *                                   None    19.88        0.522
6    AR (1,0,0) (1,0,0)    25.72     *                                   None    22.11        0.481
7    AR (1,0,0) (0,0,1)    29.81     φ1 not significantly different      None    13.40        0.523
                                     from 0

* All coefficients are significantly different from 0 or 1, and there are no significant correlations
between coefficients

It was concluded that the most appropriate model, common to all variables, is the simple moving
average of the first differences of the observations, or an MA (0,1,1) model. The resulting
equations for each variable are:
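Log Flow        $z_t = z_{t-1} + a_t - 0.415\,a_{t-1}$

Log Acidity     $z_t = z_{t-1} + a_t - \theta\,a_{t-1}$   (value of $\theta$ not legible in the source)

Log Fe          $z_t = z_{t-1} + a_t - 0.824\,a_{t-1}$

Log Mn          $z_t = z_{t-1} + a_t - 0.662\,a_{t-1}$

SO4             $z_t = z_{t-1} + a_t - 0.408\,a_{t-1}$
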
The model implies that the observation at time t (z_t) equals its previous value plus a contribution
from the shock term (a_t) and an additional, smaller contribution from the shock term of the
previous period (a_{t-1}). The system appears to have only a one-step memory and is otherwise a
typical random variable.
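
The one-step memory of the MA (0,1,1) model can be made concrete with a short recursion. The following is a minimal sketch under stated assumptions: the log flow values are hypothetical, the shock before the first observation is assumed to be zero, and the function names are invented for this example. Only the coefficient 0.415 is taken from the Log Flow equation above. The sketch also shows the well-known equivalence between the one-step forecasts of this model and an exponentially weighted moving average with weight 1 - 0.415.

```python
def ima11_one_step(z, theta):
    """One-step-ahead forecasts for z[t] = z[t-1] + a[t] - theta * a[t-1]."""
    forecasts = []
    prev_shock = 0.0  # assumption: the shock before the first observation is zero
    for t in range(1, len(z)):
        forecast = z[t - 1] - theta * prev_shock
        forecasts.append(forecast)
        prev_shock = z[t] - forecast  # shock a[t] is the one-step forecast error
    return forecasts


def ewma_forecasts(z, weight):
    """Exponentially weighted moving average started at the first observation."""
    forecasts = [z[0]]
    for value in z[1:-1]:
        forecasts.append(weight * value + (1.0 - weight) * forecasts[-1])
    return forecasts


THETA_LOG_FLOW = 0.415  # coefficient from the Log Flow equation above

# Hypothetical log10 flow observations at equal time intervals.
log_flow = [0.99, 0.48, 1.00, 0.89, 1.45, 0.89, 1.46, 1.79, 2.02, 1.45]

model_forecasts = ima11_one_step(log_flow, THETA_LOG_FLOW)
smoothed = ewma_forecasts(log_flow, 1.0 - THETA_LOG_FLOW)

for observed, predicted in zip(log_flow[1:], model_forecasts):
    print(f"observed {observed:.2f}  forecast {predicted:.2f}  "
          f"shock {observed - predicted:+.2f}")
```

With theta set to 0 the forecast is simply the previous observation (a pure random walk); larger values of theta give more weight to the accumulated history and less to the most recent observation.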

The absence of a seasonal component may be attributed to the fact that there are extreme
variations in the data which tend to smother any smaller systematic contribution.  There appear
to be two main reasons  for this, one of which may be modified. The first is the presence of zeros
in the data and the absence of an attempt to smooth the data.  Smoothing may well be of major
importance in reducing  the effects of extreme variations and thus, reducing the confidence limits
around forecasts. The second reason is that the unusual events represented by large positive
residuals are not repeated at the same interval during each annual cycle.  Thus, a heavy influx of
water from spring melt  is common but is not consistently heavy, and rarely occurs on the same
date.  Again, there are heavy late spring storms which lead to flooding, but do not occur every
year and do not always  occur in the same month. Thus the spread of events from February to
June would tend to smooth out any persistent cyclical feature that may be present. A much
longer series would be needed to check these possible effects.

There is one other aspect to the data that may be of importance. It may not be desirable to
perform a test of the observations that is too stringent, because it could result in too many  false
alarms.  Thus,  a fairly simple, robust test is desirable in practice.  The present MA models may
well be adequate for this purpose.  Investigations at more locations may help to clarify these
questions.
Hamilton 01

Rows  Days    Flow  Acidity  Total Iron  Manganese  Sulfate
   1     0    9.70    337.0       43.30       5.12    868.0
   2    34    3.00    360.0       20.00       6.80    900.0
   3    73   10.00    305.0       14.00       9.00    652.0
   4   119    0.00    400.0       33.00       8.00    823.0
   5   161    0.00    294.0       25.00       3.00    422.0
   6   202    0.00    307.0       19.00       4.00    550.0
   7   216    0.00    305.0       27.00       4.00    342.0
   8   231    0.00    300.0       19.00       4.10    419.0
   9   244   28.61    539.0        8.50       3.00    142.0
  10   257    7.71    195.0       35.00       2.90    430.0
  11   271   28.00    174.0       16.20       5.00    494.0
  12   286    7.70    184.0       16.00       8.20    510.0
  13   299    7.70    230.0       21.50       4.00    600.0
  14   313    7.70    306.0       28.00       7.00    612.0
  15   329    7.70    254.0       28.00      11.30    382.0
  16   342    7.70    394.0       35.00      10.00    423.0
  17   356    7.70    444.0       19.00       5.30    705.0
  18   369    7.70    340.0       35.00       7.20    399.0
  19   383    0.00    474.0       75.00       8.00    872.0
  20   386    2.10    714.0       75.00       8.20    608.0
  21   411   82.00    222.0       62.00       4.60    550.0
  22   425    7.70    258.0       54.30       7.70    500.0
  23   455   28.60    274.0        7.00       7.00    550.0
  24   467   28.00    282.0       42.00      56.00    700.0
  25   482   28.00    284.0       20.00       5.10    510.0
  26   495   28.00    268.0       20.00       5.90    681.0
  27   510   61.30    220.0        9.50       4.80    620.0
  28   524  105.00    202.0        7.20       4.90    598.0
  29   538   28.00    214.0       10.00       5.20    613.0
  30   552  105.00    110.0        3.70       2.30    587.0
  31   565  105.00    118.0        8.00       2.90    358.0
  32   579  105.00    162.0       14.70       5.00    469.0
  33   593   28.00    224.0       25.00       3.80    655.0
  34   608   28.00    250.0       25.00       4.90    713.0
  35   624   28.00     98.0       12.50       4.50    477.0
  36   636  105.00    197.0       19.00       4.00    397.0
  37   650   28.00     78.0        6.00       3.70    612.0
  38   666    7.70    264.0       18.00       5.00    600.0
  39   680   28.00    218.0        5.20       3.00    542.0
  40   692   28.00    286.0       10.00       5.90    643.0
  41   706    0.00    458.0       13.00       6.50    746.0
  42   721    7.70    352.0        8.50       7.50    811.0
  43   734    0.00    356.0        6.90       7.30    778.0
  44   748    2.10    632.0        9.30       7.90    568.0
  45   762    7.70    392.0        8.70       9.30    831.0
  46   772    2.10    364.0        9.00       8.00    806.0
  47   790    7.70    336.0        7.50       8.00    835.0
Rows
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
Days
799
818
832
846
857
867
874
885
899
916
930
944
958
972
989
1000
1015
1028
1043
1052
1070
1085
1098
1116
1126
1141
1154
1171
1184
1197
1210
1221
1238
1248
1266
1280
1294
1308
1322
1336
1351
1365
1379
1393
1407
1421
1434
1450
1464
Flow Acidity Total Iron Manganese Sulfate
7.70
7.70
7.70
7.70
7.70
7.70
28.00
28.00
28.00
105.00
61.00
61.00
61.00
61.00
29.00
28.97
7.80
0.00
7.80
7.80
7.80
7.80
7.80
1.20
2.20
7.90
6.10
8.90
41.70
2.20
197.00
0.00
61.00
131.00
0.00
7.90
18.80
12.10
11.00
8.90
12.00
2.70
4.60
4.60
2.70
4.60
12.10
4.60
9.90
334.0
316.0
306.0
370.0
306.0
346.0
206.0
264.0
264.0
261.0
226.8
155.6
133.4
168.1
74.2
142.2
158.3
191.3
232.6
266.8
300.7
317.3
326.6
314.7
287.3
265.6
184.5
121.2
91.6
166.1
197.0
226.1
215.7
84.6
107.8
126.7
107.1
128.2
126.7
124.7
102.4
190.4
179.4
189.3
202.3
664.7
163.8
194.3
231.5
7.00
9.00
9.00
12.80
9.00
9.00
8.00
8.00
10.00
3.75
2.25
4.00
2.75
4.20
9.80
8.00
12.00
7.50
9.00
7.50
9.50
8.00
9.50
6.00
8.50
7.50
8.00
9.00
3.10
5.50
8.50
6.50
8.00
1.20
4.00
6.50
3.90
6.00
7.30
6.30
5.50
9.00
6.50
7.00
8.00
8.00
3.90
3.90
8.50
8.00
7.00
6.00
4.90
8.00
8.00
5.53
6.00
6.00
4.00
3.00
3.00
2.30
3.90
5.10
7.80
6.00
9.00
10.00
7.00
19.00
14.00
11.50
10.00
9.00
9.00
9.50
4.40
3.30
4.80
5.30
7.30
7.50
2.90
2.20
3.50
2.80
2.80
3.60
2.90
2.40
6.30
5.50
4.30
7.80
6.50
3.20
4.60
5.50
798.0
655.0
713.0
794.0
674.0
719.0
431.0
566.0
594.0
333.0
29.6
276.0
181.0
300.0
343.0
491.0
550.0
511.0
584.0
690.0
531.0
452.0
755.0
805.0
816.0
780.0
608.0
300.0
261.0
396.0
524.0
652.0
609.0
187.0
242.0
337.0
246.0
264.0
284.0
255.0
236.0
455.0
385.0
481.0
596.0
466.0
299.0
466.0
612.0
Rows  Days   Flow   Acidity  Total Iron  Manganese  Sulfate
 97   1477    5.40   265.6      9.50        9.80     700.0
 98   1487    2.70   264.7      8.50        9.80    1223.0
 99   1504    4.60   447.7      1.60       10.60     716.0
100   1515    4.60   287.2      1.82       10.30     667.0
101   1526    2.20   282.8     11.82        7.50     664.0
102   1548   27.10   175.1      4.76       10.71     369.0
103   1581   23.60   201.8      4.57        6.07     362.0
104   1599   14.60   238.3      3.94        2.40     613.0
105   1623   54.30   120.5      1.77        1.66     275.0
106   1688   18.80   200.8      6.90        4.50     395.0
107   1700   18.80   198.1      5.99        4.61     326.0
108   1711   15.90   218.0      5.90        5.44     504.0
109   1731    8.90   250.0     44.80       11.30     603.0
110   1742   13.30   222.0      6.22       21.50     509.0
111   1760   15.90   259.0      3.70       10.70     621.0
112   1770    9.90   324.0      3.62        6.48     617.0
113   1784    7.90   305.0      3.76        6.51     622.0
114   1798    7.90   492.0      6.51        6.52     641.0
115   1814    8.90   625.0      9.06        5.83     609.0
116   1826    9.90   294.0      8.83        7.07     721.0
117   1842   11.00   356.0     10.40        7.59     802.0
118   1855    3.30   359.0      6.28        6.74     840.0
119   1865    4.00   355.0     10.60        7.52     874.0

-------
Hamilton 8





Rows Days Flow Acidity Total Iron Manganese
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
0
132
155
170
202
244
258
272
287
300
314
330
343
357
370
384
398
412
426
456
468
483
496
511
525
539
553
566
580
594
609
624
636
650
666
680
692
706
721
734
748
762
772
790
799
818
832
0
0
0
0
0
70
33
33
33
33
33
19
9
9
9
9
9
2
9
9
9
19
32
32
32
19
120
70
32
19
32
32
32
32
9
32
32
0
9
0
2
9
2
9
9
9
9
298
291
291
221
250
300
226
272
236
262
348
376
404
490
416
558
448
344
362
328
356
290
238
270
254
256
204
216
224
238
278
232
219
187
386
336
320
660
382
340
454
460
414
390
394
396
370
16.1
15
19
21.1
15
28
11
19
27
30
23
25
120
25
38
40
40
21.5
40
13
33.7
29
10.5
13.7
17.2
17
14
13.7
12.9
14.7
14.7
18.68
5.99
4.8
16
7.11
16.1
7.5
8.5
8.5
7.5
11.5
7.5
10
7.5
11
7.5
4.49
4.5
6
6
4.5
6
5.1
3.2
9.5
9
8
5
4.3
9
2
4.9
6.8
7
7.5
8.1
6.5
5.4
5.5
5.6
5.5
6.5
3.8
4.5
10
3.4
5
4.6
5.1
3.9
5
3.5
8.5
6.5
8
5.5
8.2
10.1
7
10
10
6
5

Sulfate
750
202
141
170
184
166
183
145
199
200
240
160
184
300
161
848
300
750
262
675
650
700
677
693
647
649
662
487
495
591
680
600
493
575
498
702
707
751
801.99
797
592.99
862.01
851
993
894.99
752
852
Rows
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
Days
846
857
867
874
885
899
916
930
944
958
972
989
1000
1015
1028
1043
1052
1070
1085
1098
1116
1126
1141
1154
1171
1184
1197
1214
1221
1238
1248
1266
1280
1294
1308
1322
1336
1351
1365
1379
1392
1407
1421
1434
1450
1464
1477
1487
1504
Flow Acidity Total Iron Manganese
9
9
9
32
9
9
49
120
70
70
32
19
19
8.9
8.9
8.9
8.9
2.4
8.1
6.7
1.8
3.3
6.1
5.4
18.8
72.3
44
18.8
18.8
9.9
360
37.1
12.1
46.7
62.8
44
37
49
22
17
12
7.9
14.6
25.3
12.1
11
8.9
7.9
6.2
430
472
448
270
292
372
409.4
324.2
282.2
283.8
270
112.6
164.8
178.4
204.2
247.7
178.4
320
339
342.1
376.8
340.9
338.1
291
252.1
151.5
238.7
254.4
283
309.4
141.9
197.9
206.6
198.3
214.8
202.8
212
189
223.1
232.9
270.8
270.8
310.8
221.2
273.6
294.1
329
232.6
369.6
9.8
9
8
10
8
9
11
8.5
6
10
4.8
11.5
7.9
11
8
10
8.5
8.5
11
17
7.5
9.5
8
14
8
7
6
8
8
8.5
5.5
7
8.5
9
8.5
8
7
7
12
7.5
7
8.3
8.3
8
8.4
8
9
7.5
3.1
6.2
9
7
6
6
8
5.5
3.5
4
3.8
2.2
6.2
6.9
7
7
6.3
1.3
12
10
9.5
10
10.5
11
13
10
5.3
5.3
6.5
8
10
3.4
4.5
5
4
3.2
4.3
3.4
4.3
5.3
5.3
4.8
8
8.5
4.5
4.6
6.8
10
11
12.1
Sulfate
774
819.99
817
514
543
629
449
365
363
275
322
390
453.99
517
516
588.99
180
396
466
791.01
879
808
910
854
510
382
465
555
676.01
677
294
436
464
334
313
335
343
345
440
413
543
621
669.99
443
552
618
625
667.01
911
Rows  Days   Flow   Acidity  Total Iron  Manganese  Sulfate
 97   1515    7.1    375.1      3.3        12.6      746
 98   1526    7.1    371.6     11.6         9        827.01
 99   1548   23.6    292.5     16.46       13.83     494.99
100   1581   37.1    238.2      9.15        6.02     600
101   1599   25.3    271.2      8.5         2.31     561
102   1623  140      171.2      4.56        2.18     311
103   1688   57      225.7     13.8         4.55     481.01
104   1700   44      228.3     12.22        4.7      299
105   1711   32.9    259        8.9         5.42     527
106   1731   23.6    298        0.1         8.43     616
107   1742   21.9    294       13.4        14.1      569
108   1760   21.9    329       19.7         5.51     646.99
109   1770   17.3    360       17           7.37     629
110   1784   15.9    347       16.5         7.44     634
111   1798   14.6    381       14.4         6.57     640
112   1814   14.6    394       17.1         5.6      801
113   1876   13.3    401       17.3         5.58     846.99
114   1842   12.1    401       15           7.96     858.01
115   1855   11      408       13.3         6.72     879
116   1865    8.9    451       16.2         7.82     874

-------
   APPENDIX - B
Arnot Discharge Data

-------

Arnot 001
ROW
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39

DATE
1/28/80
2/29/80
3/31/80
4/22/80
5/10/80
5/31/80
6/18/80
6/30/80
7/19/80
8/12/80
8/27/80
9/11/80
9/27/80
10/16/80
11/7/80
11/30/80
12/18/80
1/5/81
1/19/81
1/31/81
2/19/81
3/8/81
3/21/81
4/11/81
4/30/81
5/16/81
5/29/81
6/18/81
6/30/81
7/13/81
7/28/81
8/30/81
9/29/81
10/15/81
10/29/81
12/8/81
12/16/81
1/6/82
1/14/82

PH
5.02
4.86
5.08
5.03
4.83
4.82
4.70
4.68
4.50
4.73
4.68
4.89
4.91
4.66
4.97
4.20
4.75
4.92
4.77
4.92
5.02
5.07
5.03
5.02
5.38
5.45
5.03
5.13
5.10
4.95
4.72
4.84
4.63
4.50
4.69
4.82
4.84
5.04
5.02

TEMP
ND
9.8
7.6
8.4
9.2
10.8
11.0
11.3
9.9
11.2
12.9
9.8
8.9
9.3
9.6
8.6
7.6
9.1
7.5
9.1
7.0
8.0
8.0
8.7
8.3
8.3
9.0
12.0
8.9
9.4
8.6
11.3
9.1
9.8
9.1
7.9
8.3
7.8
ND

ACIDITY
12
23
9
14
13
12
27
23
28
31
24
31
28
35
41
28
28
26
27
26
8
16
15
6
12
10
11
10
9
16
19
28
31
33
39
12
15
13
12

ALK
4
4
4
8
7
8
3
2
0
7
6
6
6
5
37
0
5
18
9
18
7
7
6
18
12
16
5
5
11
7
6
4
7
0
6
5
14
8
8

TOT. FE
0.2
0.2
0.3
0.4
0.3
0.2
0.2
0.3
0.2
0.2
0.3
0.3
0.2
0.3
0.1
0.1
0.3
0.2
0.3
0.2
0.3
0.1
0.1
0.2
0.2
0.2
0.3
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.2

FFE
0.1
0.2
0.2
0.3
0.2
0.0
0.1
0.3
0.1
0.1
0.2
0.2
0.2
0.2
0.1
0.1
0.3
0.1
0.1
0.1
0.2
0.1
0.1
0.0
0.2
0.0
0.1
0.1
0.0
0.2
0.2
0.0
0.1
0.2
0.1
0.2
0.2
0.0
0.1

SO4
180
96
91
122
141
177
181
186
177
204
206
182
193
240
211
259
231
179
228
179
117
159
151
140
131
134
124
154
151
140
188
195
210
244
205
186
189
185
106

CA
ND
ND
ND
ND
93
96
118
107
111
121
115
133
135
136
130
144
152
146
139
141
70
80
83
91
79
76
96
120
90
99
129
120
130
129
124
102
108
111
79

MG
ND
ND
ND
ND
77
67
64
82
82
80
90
95
99
124
113
114
103
145
141
127
67
76
87
76
69
76
68
80
86
90
81
105
118
125
111
92
83
75
46

MN
ND
ND
ND
ND
1.58
1.62
1.69
1.68
1.34
1.85
1.86
1.78
1.92
1.71
2.44
1.87
2.52
2.67
2.70
2.68
0.82
1.61
1.41
1.50
1.28
1.04
1.41
1.79
1.24
1.55
2.33
2.49
3.95
3.80
2.75
2.17
1.77
1.87
1.02

AL
ND
ND
ND
ND
0.63
0.70
3.64
0.73
0.81
1.57
1.39
2.67
2.71
2.87
1.93
2.23
3.19
2.29
2.60
2.87
0.19
0.92
0.91
0.36
0.22
0.27
0.50
0.97
ND
0.35
0.94
2.04
2.28
2.52
2.08
1.51
1.12
1.70
0.68

DISCH
0.531
0.291
2.375
1.474
1.240
0.514
0.446
0.366
0.466
0.241
0.241
0.196
0.150
0.209
0.209
0.209
0.209
0.209
0.178
0.193
0.010
1.332
0.718
0.718
1.063
1.332
0.862
0.542
0.673
0.584
0.420
0.274
0.209
0.209
0.241
0.459
0.420
0.500
0.673

flow gpm
238.33
130.61
1065.97
661.58
556.55
230.70
200.18
164.27
209.15
108.17
108.17
87.97
67.32
93.81
93.81
93.81
93.81
93.81
79.89
86.62
4.49
597.84
322.26
322.26
477.11
597.84
386.89
243.27
302.06
262.12
188.51
122.98
93.81
93.81
108.17
206.01
188.51
224.42
302.06

DATE
29248
29280
29311
29333
29351
29372
29390
29402
29421
29445
29460
29475
29491
29510
29532
29555
29573
29591
29605
29617
29636
29653
29666
29687
29706
29722
29735
29755
29767
29780
29795
29828
29858
29874
29888
29928
29936
29957
29965
ROW
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
DATE
2/23/82
3/2/82
3/31/82
5/6/82
5/19/82
5/26/82
6/2/82
6/10/82
6/11/82
6/16/82
6/25/82
6/30/82
7/1/82
7/8/82
7/16/82
7/28/82
8/6/82
8/12/82
8/17/82
8/26/82
9/12/82
10/2/82
10/16/82
10/30/82
11/5/82
11/24/82
12/16/82
1/8/83
2/5/83
3/1/83
4/6/83
4/23/83
5/5/83
5/10/83
5/20/83
5/31/83
6/14/83
6/30/83
7/16/83
8/2/83
8/5/83
PH
5.22
5.01
5.05
5.13
4.96
4.78
4.92
5.34
5.10
4.94
4.92
5.19
4.91
4.84
4.67
4.78
4.72
4.74
4.69
4.67
4.65
4.54
4.48
4.57
4.42
4.58
4.59
4.57
4.84
4.94
5.06
4.97
4.86
4.99
4.92
4.92
4.81
4.76
4.67
4.64
4.65
TEMP
7.3
8.4
8.9
9.0
ND
ND
ND
ND
9.2
9.0
10.0
ND
9.8
9.8
10.9
ND
ND
10.5
ND
12.2
12.5
11.7
9.6
11.7
ND
7.8
9.2
7.8
7.8
ND
8.3
9.0
8.6
ND
9.2
8.9
9.3
9.7
11.8
11.7
ND
ACIDITY
16
10
8
13
19
12
64
10
8
10
14
13
11
11
15
17
15
41
22
25
25
26
40
40
31
39
37
31
10
22
3
6
9
16
4
7
8
15
31
23
20
ALK
5
4
7
14
7
9
11
16
11
9
9
8
4
3
13
4
4
3
2
2
2
2
1
1
0
2
2
0
4
5
6
4
3
5
6
6
5
4
3
3
2
TOT. FE
0.3
0.2
0.2
0.3
0.2
0.2
0.2
0.4
0.3
0.3
0.2
0.2
0.2
0.2
0.0
0.2
0.2
0.2
0.1
0.0
0.1
0.2
0.2
0.1
0.1
0.2
0.2
0.2
0.2
0.1
0.2
0.2
0.2
0.3
0.2
0.1
0.2
0.3
0.2
0.2
0.3
FFE
0.2
0.1
0.0
0.2
0.2
0.0
0.0
0.2
0.2
0.1
0.1
0.1
0.2
0.2
0.0
0.1
0.1
0.2
0.0
0.0
0.1
0.1
0.1
0.1
0.1
0.0
0.2
0.0
0.1
0.0
0.1
0.1
0.1
0.1
0.1
0.0
0.1
0.1
0.1
0.1
0.1
SO4
160
172
107
124
148
118
151
99
130
170
186
185
148
141
161
169
171
202
211
198
201
252
238
212
252
240
277
254
159
191
182
125
171
66
92
140
132
169
180
189
167
CA
ND
115
80
110
99
100
95
66
68
118
125
93
96
103
102
102
101
ND
114
124
123
130
130
142
143
137
127
127
150
104
82
70
66
80
75
81
98
125
118
123
123
MG
ND
58
58
54
101
83
82
51
77
46
42
71
74
72
76
87
92
ND
93
109
104
130
107
107
95
130
118
128
31
116
62
69
41
77
70
61
62
54
93
79
87
MN
ND
1.16
0.98
0.95
ND
0.74
0.68
0.85
0.95
1.11
1.22
1.39
1.31
1.38
1.60
1.57
1.70
1.68
2.01
0.54
1.18
2.24
2.17
2.75
2.70
2.66
2.09
2.39
1.45
1.49
1.29
1.40
0.96
0.67
1.41
1.32
1.76
1.05
1.92
1.90
1.92
AL
ND
0.10
0.25
0.50
ND
0.87
0.58
0.51
0.27
0.45
0.56
0.70
0.61
0.79
0.78
1.58
2.00
2.27
ND
2.96
2.55
2.73
2.64
3.08
ND
3.22
2.98
3.22
0.44
1.46
0.80
0.60
0.51
1.16
1.15
0.66
0.69
0.41
0.92
1.60
1.68
DISCH
0.861
0.673
2.912
0.910
0.584
0.628
0.765
3.484
3.124
1.984
1.168
0.910
0.861
0.178
0.565
0.437
0.378
0.355
0.325
0.289
0.246
0.219
0.187
0.210
0.199
0.166
0.188
0.216
0.565
0.569
1.613
2.617
5.091
3.032
2.112
1.382
0.821
0.673
0.568
0.448
0.425
flow gpm
386.44
302.06
1306.99
408.44
262.12
281.87
343.35
1563.72
1402.14
890.48
524.23
408.44
386.44
79.89
253.59
196.14
169.66
159.33
145.87
129.71
110.41
98.29
83.93
94.25
89.32
74.51
84.38
96.95
253.59
255.38
723.96
1174.59
2284.99
1360.85
947.93
620.28
368.49
302.06
254.94
201.08
190.75
DATE
30005
30012
30041
30077
30090
30097
30104
30112
30113
30118
30127
30132
30133
30140
30148
30160
30169
30175
30180
30189
30206
30226
30240
30254
30260
30279
30301
30324
30352
30376
30412
30429
30441
30446
30456
30467
30481
30497
30513
30530
30533
ROW     DATE   PH  TEMP ACIDITY  ALK  TOT. FE  FFE   SO4    CA     MG    MN  AL             DISCH  flow gpm   DATE
 81     8/14/83  4.68  12.2     25      3     0.4    0.2   220    116     91   2.03  1.94             0.364   163.37    30542

            MAX      MIN      AVG      MED
PH          5.45     4.20     4.85     4.85
TEMP       12.90     7.00     9.45     9.45
ACIDITY    64.00     3.00    20.04    20.04
ALK        37.00     0.00     6.46     6.46
TOT. FE     0.40     0.00     0.21     0.21
FFE         0.30     0.00     0.12     0.12
SO4       277.00    66.00   173.23   173.23
CA        152.00    66.00   109.52   109.52
MG        145.00    31.00    86.05    82.00
MN          3.95     0.54     1.71     1.71
AL          3.64     0.10     1.43     1.43
DISCH       5.09     0.01     0.80     0.80
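
The two derived columns of the Arnot 001 table can be cross-checked against the raw ones. The tabulated values are consistent with "flow gpm" being DISCH (taken here to be in cubic feet per second) multiplied by about 448.831, and with the second DATE column being a spreadsheet serial number in the 1900 date system. Both readings are inferences from the printed values rather than statements made in the tables, and the helper names below are hypothetical. A minimal sketch:

```python
from datetime import date, timedelta

# US gallons per minute in one cubic foot per second (7.48052 gal/ft^3 * 60).
GPM_PER_CFS = 448.831

def cfs_to_gpm(discharge_cfs):
    """Convert a discharge in cubic feet per second to US gallons per minute."""
    return discharge_cfs * GPM_PER_CFS

def serial_to_date(serial):
    """Convert a 1900-system spreadsheet serial number to a calendar date.

    Assumption: serials follow the common spreadsheet convention, which is
    equivalent to counting days from 1899-12-30 for dates after February 1900.
    """
    return date(1899, 12, 30) + timedelta(days=serial)

# Row 1 of Arnot 001: DISCH 0.531, flow gpm 238.33, DATE 1/28/80 and 29248.
row1_gpm = round(cfs_to_gpm(0.531), 2)
row1_date = serial_to_date(29248)

print(row1_gpm, row1_date.isoformat())
```

The same check applied to row 72 (DISCH 2.617) gives 1174.59 gpm, which agrees with the tabulated entry for that row.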

-------
Arnot 003
ROW
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
DATE
1/28/80
2/29/80
3/31/80
4/22/80
5/10/80
5/31/80
6/18/80
6/30/80
7/19/80
8/12/80
8/27/80
9/11/80
9/27/80
10/16/80
11/7/80
11/30/80
12/8/80
1/5/81
1/19/81
1/31/81
2/19/81
3/8/81
3/21/81
4/11/81
4/30/81
5/16/81
5/29/81
6/18/81
6/30/81
7/13/81
7/30/81
8/30/81
9/29/81
10/15/81
10/29/81
11/23/81
12/8/81
12/16/81
1/6/82
1/14/82
PH
3.35
3.31
3.50
3.42
3.24
3.21
3.29
3.29
3.21
3.26
3.26
3.37
3.37
3.23
3.41
3.31
3.24
3.36
3.17
3.32
3.70
3.36
3.29
3.21
3.46
3.49
3.32
3.41
3.32
3.27
3.29
3.32
3.13
3.07
3.24
3.22
3.04
3.11
3.27
3.59
TEMP
m.v.
8.5
m.v.
8.4
8.5
8.5
8.7
8.9
8.9
8.9
9.0
8.4
8.5
8.9
8.9
7.8
6.9
6.9
7.3
7.3
6.2
7.8
8.1
8.7
8.3
8.3
8.6
8.9
8.7
8.2
8.6
9.2
8.4
8.9
8.3
7.7
7.8
7.7
7.9
m.v.
DISCH
0.190
0.117
0.494
0.360
0.360
0.360
0.219
0.139
0.128
0.117
0.106
0.092
0.087
0.080
0.072
0.065
0.065
0.052
0.046
0.046
0.363
0.425
0.249
0.226
0.325
0.446
0.325
0.198
0.198
0.172
0.136
0.106
0.072
0.065
0.066
0.106
0.106
0.106
0.159
0.184
ACIDITY
69
119
47
58
70
70
90
81
107
106
99
89
104
102
97
109
108
113
106
104
42
66
83
71
101
140
70
71
79
80
87
96
95
114
151
101
111
107
120
64
ACIDLD
32.04
34.14
56.80
51.05
61.61
61.61
48.16
27.58
33.61
30.41
25.67
20.13
22.08
19.97
17.16
17.35
17.19
14.36
11.93
11.71
37.30
68.56
50.52
39.26
80.34
152.90
55.68
34.33
38.20
33.57
29.03
24.89
16.81
18.15
24.50
26.19
28.78
27.74
46.77
28.86
ALK
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
ALKLD
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
TFE
1.2
1.2
1.1
1.2
1.0
0.8
0.8
0.8
0.7
0.9
1.0
1.0
1.0
1.0
0.9
0.9
1.2
1.1
1.2
1.2
0.5
1.5
1.5
1.2
1.3
1.2
1.4
1.2
1.1
1.2
1.1
1.1
1.2
1.3
1.2
1.5
1.6
1.8
1.6
0.8
TFELD
0.56
0.34
1.33
1.06
0.88
0.70
0.43
0.27
0.22
0.26
0.26
0.23
0.21
0.20
0.16
0.14
0.19
0.14
0.14
0.14
0.44
1.56
0.91
0.66
1.03
1.31
1.11
0.58
0.53
0.50
0.37
0.29
0.21
0.21
0.19
0.39
0.42
0.47
0.62
0.36
FE
0.5
0.4
0.3
0.2
0.3
0.0
0.3
0.3
0.2
0.6
0.7
0.4
0.4
0.2
0.4
0.1
0.5
0.3
0.3
0.3
0.2
0.1
0.2
0.6
0.3
0.2
0.2
0.2
0.2
0.7
0.2
0.4
1.0
0.7
0.8
0.5
0.4
1.5
0.3
0.4
FELD
0.23
0.12
0.36
0.18
0.26
0.00
0.16
0.10
0.06
0.17
0.18
0.09
0.09
0.04
0.07
0.02
0.08
0.04
0.03
0.03
0.18
0.10
0.12
0.33
0.24
0.22
0.16
0.10
0.10
0.29
0.07
0.10
0.18
0.11
0.13
0.13
0.10
0.39
0.12
0.18
SO4
162
180
93
119
149
148
169
177
181
208
197
208
191
214
225
218
232
231
215
215
110
158
85
125
150
144
144
165
170
171
165
143
211
223
221
231
225
223
212
122
SO4LD
75.23
51.64
112.39
104.74
131.14
130.26
90.43
60.26
56.85
59.68
53.93
47.05
40.56
41.89
39.80
34.70
36.93
29.35
24.20
24.20
97.69
164.12
51.74
69.12
119.31
157.27
114.54
79.79
82.21
71.76
55.06
37.08
37.33
35.50
35.82
59.90
58.34
57.82
82.62
55.01
CA
m.v.
m.v.
m.v.
m.v.
58
55
61
61
59
66
69
71
74
69
74
77
90
77
77
78
44
46
46
70
52
48
49
56
55
54
64
62
60
70
70
64
66
62
61
61
CALD
m.v.
m.v.
m.v.
m.v.
51.05
48.41
32.64
20.77
18.53
18.94
17.89
16.06
15.71
13.51
13.09
12.26
14.33
9.78
8.67
8.78
39.08
47.78
28.00
38.71
41.36
52.42
38.98
27.08
26.60
22.26
21.36
16.08
10.62
11.14
11.34
16.59
17.11
16.08
23.77
27.51
MG
m.v.
m.v.
m.v.
m.v.
45
51
66
56
67
70
68
78
72
89
99
95
96
142
127
115
39
49
59
53
64
57
67
66
72
74
75
83
103
94
99
99
97
99
81
40
MGLD
m.v.
m.v.
m.v.
m.v.
39.61
44.89
35.32
19.07
21.04
20.08
17.63
17.64
15.29
17.42
17.51
15.12
15.28
18.04
14.30
12.94
34.64
50.90
35.91
29.31
50.91
62.25
53.29
31.92
34.82
31.06
25.03
21.52
18.22
14.96
16.04
25.67
25.41
25.67
31.57
18.04
MN
m.v.
m.v.
m.v.
1.85
1.92
2.40
2.73
2.87
2.48
4.70
4.55
3.70
4.20
4.40
4.90
4.65
5.90
5.70
6.70
6.90
1.95
2.72
1.99
2.62
3.18
2.35
2.07
2.76
2.38
2.52
4.65
4.80
6.20
5.95
4.60
4.56
5.60
4.50
4.75
1.98
MNLD
m.v.
m.v.
m.v.
1.63
1.69
2.11
1.46
0.98
0.78
1.35
1.18
0.84
0.89
0.86
0.87
0.74
0.94
0.72
0.75
0.78
1.73
2.82
1.21
1.45
2.53
2.57
1.65
1.34
1.15
1.06
1.55
1.25
1.10
0.95
0.75
1.18
1.45
1.17
1.85
0.89
AL
m.v.
m.v.
m.v.
3.31
3.76
3.45
5.16
0.88
4.36
6.12
3.85
5.80
6.64
7.09
7.39
6.60
9.40
9.44
8.37
8.03
0.70
3.77
1.79
1.76
4.35
3.23
3.89
4.87
m.v.
3.81
4.80
6.96
6.96
6.52
5.16
7.34
7.09
6.64
6.57
3.05
ALLD
m.v.
m.v.
m.v.
2.91
3.31
3.04
2.76
0.30
1.40
1.76
1.00
1.31
1.41
1.39
1.31
1.05
1.50
1.20
0.94
0.90
0.62
3.92
1.09
0.97
3.46
3.53
3.09
2.36
m.v.
1.60
1.60
1.81
1.23
1.04
0.84
1.90
1.84
1.72
2.56
1.38
ROW
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
DATE
2/23/82
3/2/82
3/31/82
5/6/82
5/19/82
5/26/82
6/2/82
6/10/82
6/11/82
6/16/82
6/25/82
6/30/82
7/1/82
7/8/82
7/16/82
7/28/82
8/6/82
8/12/82
8/17/82
8/26/82
9/12/82
10/2/82
10/16/82
10/30/82
11/5/82
11/24/82
12/16/82
1/8/83
2/5/83
3/1/83
4/6/83
4/23/83
5/5/83
5/10/83
5/20/83
5/31/83
6/14/83
6/30/83
7/16/83
8/2/83
8/5/83
PH
3.21
3.15
3.36
3.38
3.28
3.34
3.30
3.47
3.42
3.31
3.27
3.29
3.33
3.35
3.36
3.26
3.25
3.16
3.24
3.20
3.19
3.21
3.14
3.20
3.18
3.18
3.24
3.15
3.27
3.18
3.25
3.25
3.24
3.30
3.31
3.15
3.21
3.22
3.24
3.16
3.19
TEMP
6.5
7.8
8.5
8.5
m.v.
m.v.
m.v.
m.v.
9.0
9.0
9.0
m.v.
9.5
9.0
9.7
m.v.
m.v.
9.0
m.v.
10.3
10.0
9.6
8.6
10.0
m.v.
7.4
8.6
7.8
6.8
m.v.
8.3
8.9
8.5
m.v.
9.4
8.6
8.9
8.9
9.6
11.7
m.v.
DISCH
0.257
0.226
0.565
0.363
0.338
0.319
0.290
0.565
0.516
0.516
0.383
0.325
0.307
0.243
0.193
0.157
0.136
0.125
0.115
0.106
0.089
0.073
0.068
0.065
0.058
0.046
0.040
0.052
0.198
0.125
0.361
0.493
0.538
0.495
0.470
0.379
0.244
0.163
0.138
0.106
0.106
ACIDITY
133
94
57
68
65
60
58
50
57
59
59
60
64
64
79
67
69
97
74
80
86
102
104
110
104
114
120
114
81
103
72
69
86
67
62
67
74
73
93
94
83
ACIDLD
83.52
51.98
78.85
60.39
53.77
46.83
41.11
69.17
71.93
74.45
55.27
47.72
48.09
38.05
37.30
25.74
22.96
29.66
20.82
20.75
18.73
18.22
17.30
17.49
14.76
12.83
11.74
14.50
39.24
31.50
63.59
83.23
113.20
81.14
71.29
62.13
44.18
29.11
31.40
24.38
21.53
ALK
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
ALKLD
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
TFE
1.7
1.4
1.0
1.2
1.0
0.9
1.0
0.8
1.2
1.1
1.1
0.9
1.2
1.1
0.7
0.9
1.0
0.9
0.8
0.8
0.7
0.9
1.0
0.5
0.3
1.0
1.1
0.8
1.1
1.5
2.0
1.5
1.2
1.1
0.9
1.3
1.0
1.3
1.1
1.1
1.0
TFELD
1.07
0.77
1.38
1.07
0.83
0.70
0.71
1.11
1.51
1.39
1.03
0.72
0.90
0.65
0.33
0.35
0.33
0.28
0.23
0.21
0.15
0.16
0.17
0.08
0.04
0.11
0.11
0.10
0.53
0.46
1.77
1.81
1.58
1.33
1.03
1.21
0.60
0.52
0.37
0.29
0.26
FE
0.5
0.5
0.3
0.6
0.6
0.4
0.2
0.4
0.4
0.2
0.3
0.2
0.3
0.2
0.1
0.3
0.3
0.1
0.0
0.4
0.7
0.2
0.2
0.1
0.1
0.1
0.3
0.2
0.3
0.3
0.7
0.4
0.4
0.4
0.4
0.2
0.3
0.3
0.3
0.5
0.2
FELD
0.31
0.28
0.42
0.53
0.50
0.31
0.14
0.55
0.51
0.25
0.28
0.16
0.23
0.12
0.05
0.12
0.10
0.03
0.00
0.10
0.15
0.04
0.03
0.02
0.01
0.01
0.03
0.03
0.15
0.09
0.62
0.48
0.53
0.48
0.46
0.19
0.18
0.12
0.10
0.13
0.05
SO4
167
179
128
125
98
134
134
119
134
128
144
128
116
146
151
137
165
178
184
158
158
220
215
206
233
250
256
262
103
220
165
145
112
125
99
100
110
167
156
164
202
SO4LD
104.87
98.98
177.07
111.01
81.06
104.59
94.97
164.62
169.10
161.53
134.89
101.81
87.15
86.80
71.30
52.62
54.90
54.44
51.77
40.98
34.40
39.29
35.77
32.76
33.06
28.14
25.05
33.33
49.90
67.28
145.73
174.89
147.42
151.38
113.84
92.73
65.67
66.60
52.67
42.53
52.39
CA
m.v.
58
53
48
51
48
52
44
44
48
47
47
46
45
m.v.
53
58
m.v.
59
63
61
67
74
74
84
74
72
80
63
71
50
45
39
40
38
48
51
65
61
68
65
CALD
m.v.
32.07
73.32
42.63
42.19
37.47
36.85
60.87
55.52
60.57
44.03
37.38
34.56
26.75
m.v.
20.36
19.30
m.v.
16.60
16.34
13.28
11.97
12.31
11.77
11.92
8.33
7.05
10.18
30.52
21.71
44.16
54.28
51.33
48.44
43.70
44.51
30.45
25.92
20.60
17.64
16.86
MG
m.v.
82
39
65
67
55
52
74
53
69
72
49
55
49
m.v.
55
72
m.v.
83
84
71
67
77
103
116
114
103
102
64
122
58
48
44
38
52
58
51
48
70
79
74
MGLD
m.v.
45.34
53.95
57.73
55.42
42.93
36.85
102.37
66.88
87.07
67.45
38.97
41.32
29.13
m.v.
21.13
23.96
m.v.
23.35
21.78
15.46
11.97
12.81
16.38
16.46
12.83
10.08
12.98
31.00
37.31
51.23
57.90
57.92
46.02
59.79
53.78
30.45
19.14
23.63
20.49
19.19
MN
m.v.
2.57
1.61
1.88
m.v.
1.91
2.01
1.57
1.61
1.62
1.67
1.76
2.01
1.84
2.30
2.45
2.68
2.83
2.80
3.27
3.61
3.45
3.61
3.79
4.04
4.55
4.68
4.30
3.09
3.40
1.54
1.84
2.68
1.61
1.71
2.49
2.50
2.10
2.99
3.19
2.71
MNLD
m.v.
1.42
2.23
1.67
m.v.
1.49
1.43
2.17
2.03
2.04
1.56
1.40
1.51
1.09
1.09
0.94
0.89
0.87
0.79
0.85
0.79
0.62
0.60
0.60
0.57
0.51
0.46
0.55
1.50
1.04
1.36
2.22
3.53
1.95
1.97
2.31
1.49
0.84
1.01
0.83
0.70
AL
m.v.
3.40
1.90
1.28
m.v.
3.40
1.75
2.90
2.98
2.94
3.26
3.56
3.59
4.06
m.v.
4.29
4.67
4.68
m.v.
5.43
6.26
6.98
6.96
7.94
m.v.
8.29
9.05
9.40
7.88
6.78
3.89
3.22
3.62
3.02
3.11
4.05
4.98
5.04
5.66
8.68
6.96
ALLD
m.v.
1.88
2.63
1.14
m.v.
2.65
1.24
4.01
3.76
3.71
3.05
2.83
2.70
2.41
m.v.
1.65
1.55
1.43
m.v.
1.41
1.36
1.25
1.16
1.26
m.v.
0.93
0.89
1.20
3.82
2.07
3.44
3.88
4.76
3.66
3.58
3.76
2.97
2.01
1.91
2.25
1.80
 ROW   DATE   PH  TEMP  DISCH ACIDITY  ACIDLD  ALK  ALKLD  TFE TFELD FE  FELD SO4 SO4LD  CA  CALD  MG  MGLD  MN  MNLD  AL  ALLD
  82   8/14/83  3.19  10.0   0.104     93     23.66    0     0    1.3  0.33  0.9  0.23  206  52.42   63  16.03  79  20.10 3.40  0.87  6.76  1.72
            max      min      Avg      Med
PH          3.70     3.04     3.28     3.27
TEMP       11.7      6.2      8.6      8.6
DISCH       0.565    0.040    0.216    0.161
ACIDITY   151       42       86       85
ACIDLD    152.90    11.71    40.25    32.81
ALK         0        0        0        0
ALKLD       0        0        0        0
TFE         2.0      0.3      1.1      1.1
TFELD       1.81     0.04     0.59     0.44
FE          1.5      0.0      0.4      0.3
FELD        0.62     0.00     0.18     0.13
SO4       262       85      169      165
SO4LD     177.07    24.20    77       60.08
CA         90       38       60       61
CALD       73.32     7.05    27.5     21.71
MG        142       38       73.6     70
MGLD      102.37    10.08    32.4     25.7
MN          6.90     1.54     3.2      2.76
MNLD        3.53     0.46     1.3      1.17
AL          9.44     0.70     5.1      4.8
ALLD        4.76     0.3      2.1      1.76
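
The load columns (ACIDLD, TFELD, SO4LD and so on) appear to be the product of the concentration and DISCH times a constant of about 2.4466. That constant is what results if DISCH is in cubic feet per second, concentrations are in mg/L, and loads are in kilograms per day. The units are not stated in these tables, so this reading is an assumption checked only against the printed values, and the function name is hypothetical. A minimal sketch using row 82 of Arnot 003:

```python
LITERS_PER_CUBIC_FOOT = 28.316846592
SECONDS_PER_DAY = 86_400
MG_PER_KG = 1_000_000

def load_kg_per_day(conc_mg_per_l, discharge_cfs):
    """Pollution load from concentration and discharge.

    Assumes mg/L and cubic feet per second; returns kilograms per day.
    """
    liters_per_day = discharge_cfs * LITERS_PER_CUBIC_FOOT * SECONDS_PER_DAY
    return conc_mg_per_l * liters_per_day / MG_PER_KG

# Row 82 of Arnot 003: DISCH 0.104, ACIDITY 93, SO4 206, CA 63.
acid_load = round(load_kg_per_day(93, 0.104), 2)
so4_load = round(load_kg_per_day(206, 0.104), 2)
ca_load = round(load_kg_per_day(63, 0.104), 2)

print(acid_load, so4_load, ca_load)
```

These agree with the tabulated ACIDLD, SO4LD and CALD for that row (23.66, 52.42 and 16.03). Some other rows differ in the second decimal place, presumably because the printed discharge is rounded.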

-------
Arnot 004
ROW
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
DATE
1/28/80
2/29/80
3/31/80
4/22/80
5/10/80
5/31/80
6/18/80
6/30/80
7/19/80
8/12/80
8/27/80
9/11/80
9/27/80
10/16/80
11/7/80
11/30/80
12/18/80
1/5/81
1/31/81
2/19/81
3/8/81
3/21/81
4/11/81
4/30/81
5/16/81
5/29/81
6/18/81
6/30/81
7/13/81
7/28/81
8/30/81
9/29/81
10/15/81
10/29/81
11/23/81
12/8/81
12/16/81
1/6/82
1/14/82
2/23/82
Dischg
0.383
0.273
0.9
1.646
1.745
1.255
0.9
0.446
0.319
0.241
0.209
0.178
0.136
0.209
0.15
0.122
0.15
0.15
0.123
0.274
0.718
0.563
0.459
0.584
0.765
0.673
0.459
0.42
0.381
0.223
0.209
0.15
0.136
0.136
0.198
0.209
0.257
0.308
0.247
0.5
PH
3.36
3.32
3.46
3.38
3.22
3.22
3.32
3.36
3.21
3.25
3.25
3.12
3.28
3.16
3.32
3.29
3.17
3.29
3.11
3.40
3.35
3.33
3.41
3.41
3.44
3.32
3.39
3.35
3.36
3.31
3.28
3.10
3.00
3.19
3.19
3.19
3.15
3.16
3.46
3.21
TEMP
m.v.
8.8
m.v.
8.4
8.4
8.5
8.6
8.9
8.9
8.9
9.0
9.0
8.8
9.1
9.0
7.4
6.3
6.1
6.9
8.1
8.1
7.8
8.7
8.1
8.1
8.6
8.9
8.7
8.5
8.5
9.7
8.1
8.2
8.2
6.8
7.9
7.6
7.0
m.v.
7.0
ACID
76
152
67
71
70
75
100
79
103
112
107
98
114
118
115
131
130
137
122
88
70
80
78
140
168
77
72
82
111
103
100
111
124
137
126
133
121
131
69
145
ACIDLD
71.31
101.48
147.50
285.94
298.84
230.26
220.14
86.20
80.50
65.92
54.65
42.78
37.91
60.26
42.11
39.30
47.60
50.17
36.60
58.96
123.06
110.19
87.67
200.19
314.49
126.77
80.93
84.24
103.62
86.78
51.07
40.65
41.24
45.56
61.18
67.92
76.09
98.86
64.41
177.39
ALK
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
ALKLD
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
TFE
1.2
1.0
1.5
1.0
0.9
0.9
0.9
0.9
0.6
1.1
1.0
1.2
1.2
1.3
1.3
1.6
1.8
1.7
1.9
0.9
1.4
1.3
1.4
1.9
1.5
1.5
1.2
1.3
1.3
1.6
1.8
1.5
1.6
1.6
1.9
2.0
2.1
2.1
2.8
1.8
TFELD
1.13
0.67
3.30
4.03
3.84
2.76
1.98
0.98
0.47
0.65
0.51
0.52
0.40
0.66
0.48
0.48
0.66
0.62
0.57
0.60
2.46
1.79
1.57
2.72
2.81
2.47
1.35
1.34
1.21
1.35
0.92
0.55
0.53
0.53
0.92
1.02
1.32
1.59
2.61
2.20
FE
0.8
0.4
0.4
0.4
0.2
0.3
0.5
0.2
0.5
1.1
0.8
0.5
0.6
0.6
0.4
0.4
0.5
0.5
0.4
0.2
0.1
0.2
0.3
0.3
0.3
0.2
0.2
0.2
0.5
0.4
0.5
0.4
1.4
1.1
0.5
0.5
1.1
0.6
1.1
0.3
FELD
0.75
0.27
0.88
1.61
0.85
0.92
1.10
0.22
0.39
0.65
0.41
0.22
0.20
0.31
0.15
0.12
0.18
0.18
0.12
0.13
0.18
0.28
0.34
0.43
0.56
0.33
0.23
0.21
0.47
0.34
0.26
0.15
0.47
0.37
0.24
0.26
0.69
0.45
1.03
0.37
SO4
143
160
138
136
128
150
169
167
162
163
171
167
135
186
230
242
206
251
181
132
160
145
169
177
147
121
193
166
159
143
198
185
211
222
235
221
204
196
124
174
SO4LD
134.18
106.82
303.80
547.71
546.45
460.52
377.04
182.22
126.62
95.94
87.33
72.89
44.89
94.99
84.22
72.60
75.43
91.91
54.30
88.43
281.28
199.71
189.95
253.09
275.18
199.20
216.93
170.54
148.42
120.49
101.12
67.74
70.17
73.83
114.10
112.87
128.29
147.92
115.75
212.87
CA
m.v.
m.v.
m.v.
m.v.
46
44
58
52
54
59
58
59
58
61
62
65
79
62
60
50
44
40
53
52
48
52
56
51
50
52
55
58
64
57
58
56
54
53
46
m.v.
CALD
m.v.
m.v.
m.v.
m.v.
196.38
135.09
127.68
56.74
42.21
34.73
29.62
25.75
19.29
31.15
22.70
19.50
28.93
22.70
18.00
33.50
77.35
55.09
59.57
74.36
89.86
85.61
62.94
52.39
46.67
43.81
28.09
21.24
21.28
18.96
28.16
28.60
33.96
40.00
42.94
m.v.
MG
m.v
m.v
m.v
m.v
46
50
52
53
59
65
57
63
67
72
94
82
73
109
102
73
58
75
67
54
68
48
67
56
64
82
63
103
92
110
96
77
83
74
45
m.v.
MGLD
m.v
m.v
m.v
m.v
196.38
153.51
114.47
57.83
46.11
38.26
29.11
27.50
22.28
36.77
34.42
24.60
26.73
39.91
30.60
48.91
101.97
103.30
75.31
77.22
127.30
79.02
75.31
57.53
59.74
69.09
32.17
37.72
30.60
36.58
46.61
39.32
52.20
55.85
42.01
m.v
MN
m.v
m.v
m.v
2.06
1.94
2.09
3.00
2.35
1.96
2.54
2.38
2.36
2.63
3.40
4.00
2.73
5.10
4.00
6.50
1.83
2.61
1.92
2.33
2.82
1.91
2.03
2.24
2.10
2.14
2.88
3.55
4.55
4.85
4.35
4.24
4.45
3.55
4.40
2.10
m.v.
MNLD
m.v
m.v
m.v
8.30
8.28
6.42
6.60
2.56
1.53
1.50
1.22
1.03
0.88
1.74
1.46
0.82
1.87
1.47
1.95
1.23
4.59
2.64
2.62
4.03
3.58
3.34
2.52
2.16
2.00
2.43
1.81
1.67
1.61
1.45
1.06
2.27
2.23
3.32
1.96
m.v
AL
m.v
m.v
m.v
4.10
4.80
4.51
7.62
7.06
3.67
5.96
5.80
7.99

8.75
10.25
5.78
9.46
9.38
11.38
1.41
5.56
3.40
4.35
5.02
5.18
5.01
6.92
m.v
6.47
5.46
8.00
8.55
8.55
9.09
7.04
8.30
9.12
7.74
0.71
m.v
ALLD
m.v
m.v
m.v
16.51
20.49
13.85
16.78
7.70
2.87
3.51
2.96
3.49
2.33
4.47
3.75
1.73
3.46
3.44
3.41
0.94
9.78
4.68
4.89
7.18
9.70
8.25
7.78
m.v
3.99
4.60
4.09
3.13
2.84
3.02
3.42
4.24
5.74
5.84
0.66
m.v
ROW
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
DATE
3/2/82
3/31/82
5/6/82
5/19/82
5/26/82
6/2/82
6/10/82
6/11/82
6/16/82
6/25/82
6/30/82
7/1/82
7/8/82
7/16/82
7/28/82
8/6/82
8/12/82
8/17/82
8/26/82
9/12/82
10/2/82
10/16/82
10/30/82
11/5/82
11/24/82
12/16/82
1/8/83
2/5/83
3/1/83
4/6/83
4/23/83
5/5/83
5/10/83
5/20/83
5/31/83
6/14/83
6/30/83
7/16/83
8/2/83
8/5/83
8/14/83
Dischg
0.5
1.115
0.765
0.628
0.563
0.584
0.971
0.755
0.96
0.814
0.861
0.718
0.656
0.534
0.441
0.397
0.366
0.344
0.289
0.231
0.173
0.16
0.158
0.139
0.133
0.135
0.186
0.295
0.351
0.979
1.501
1.838
1.574
1.358
1.07
0.787
0.587
0.488
0.417
0.403
0.362
PH
3.17
3.30
3.40
3.25
3.28
3.27
3.46
3.38
3.29
3.27
3.30
3.34
3.31
3.39
3.32
3.28
3.20
3.28
3.24
3.22
3.21
3.91
3.18
3.14
3.12
3.18
3.13
3.12
3.17
3.20
3.23
3.13
3.29
3.94
3.17
3.25
3.28
3.30
3.18
3.23
3.23
TEMP
7.8
8.2
8.2
m.v.
m.v.
m.v.
m.v.
9.0
8.8
9.0
m.v.
9.5
8.0
9.2
9.7
m.v.
9.0
m.v.
9.4
10.3
10.7
9.2
9.7
m.v.
7.8
8.3
6.9
6.4
m.v.
8.3
8.6
8.4
m.v.
8.9
8.3
8.6
9.0
9.6
9.4
m.v.
9.4
ACID
97
73
65
73
64
66
64
79
71
73
69
64
66
74
81
64
96
72
80
100
115
136
124
121
132
127
112
125
115
95
77
128
67
62
67
74
73
92
97
83
100
ACIDLD
118.67
199.17
121.68
112.19
88.15
94.37
235.27
225.82
166.80
145.14
145.35
112.51
105.93
96.68
87.39
62.16
85.96
60.60
56.57
56.52
48.67
53.24
47.93
41.15
42.95
41.95
50.97
90.22
100.47
227.54
282.77
575.59
258.01
205.99
175.40
142.48
104.84
109.84
98.96
81.84
88.57
ALK
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
ALKLD
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
TFE
1.3
0.9
1.2
1.1
1.0
0.9
0.9
1.1
1.1
1.1
1.0
1.1
1.1
0.7
0.9
0.9
0.9
0.8
0.8
0.8
1.1
1.3
0.6
0.6
1.2
1.5
1.5
1.9
1.6
1.7
1.3
2.1
1.0
0.8
1.0
0.9
0.9
1.0
1.0
1.0
1.2
TFELD
1.59
2.46
2.25
1.69
1.38
1.29
3.31
3.14
2.58
2.19
2.11
1.93
1.77
0.91
0.97
0.87
0.81
0.67
0.57
0.45
0.47
0.51
0.23
0.20
0.39
0.50
0.68
1.37
1.37
4.07
4.77
9.44
3.85
2.66
2.62
1.73
1.29
1.19
1.02
0.99
1.06
FE
0.4
0.4
0.8
0.8
0.3
0.3
0.3
0.4
0.2
0.3
0.4
0.3
0.2
0.3
0.3
0.3
0.3
0.0
0.4
0.8
0.1
0.4
0.3
0.1
0.2
0.5
0.3
0.8
0.6
0.3
0.3
0.7
0.3
0.2
0.1
0.2
0.2
0.3
0.3
0.2
0.2
FELD
0.49
1.09
1.50
1.23
0.41
0.43
1.10
1.14
0.47
0.60
0.84
0.53
0.32
0.39
0.32
0.29
0.27
0.00
0.28
0.45
0.04
0.16
0.12
0.03
0.07
0.17
0.14
0.58
0.52
0.72
1.10
3.15
1.16
0.66
0.26
0.39
0.29
0.36
0.31
0.20
0.18
SO4
168
140
144
135
125
130
131
138
142
159
145
103
140
137
145
153
200
168
175
187
256
184
219
221
m.v.
268
262
200
227
166
213
207
100
86
150
148
152
160
167
224
202
SO4LD
205.53
381.97
269.57
207.48
172.17
185.89
481.57
394.47
333.60
316.12
305.22
181.08
224.69
178.99
156.43
148.61
179.09
141.39
123.64
105.69
108.35
72.03
84.66
75.16
m.v.
88.52
119.23
144.35
194.94
397.60
782.21
930.84
385.09
285.73
392.68
284.97
218.29
191.03
170.38
220.86
178.90
CA
49
46
49
46
48
50
40
44
53
56
45
51
52
54
54
50
m.v.
55
72
52
63
60
63
62
60
65
61
65
62
49
44
39
39
39
49
48
69
56
66
61
60
CALD
59.95
125.50
91.73
70.70
66.11
71.50
147.04
125.77
124.51
111.34
94.79
89.66
84.36
70.55
58.26
48.56
m.v.
46.29
50.91
29.39
26.67
23.49
24.35
21.08
19.52
21.47
27.76
46.91
53.24
117.37
161.58
175.38
150.19
129.58
128.27
92.42
99.09
66.86
67.34
60.14
53.14
MG
70
59
42
65
64
61
72
52
58
60
63
42
43
47
43
60
m.v.
60
51
75
63
86
75
101
99
102
103
67
80
74
17
54
51
74
56
48
50
92
65
64
69
MGLD
85.64
160.97
78.62
99.90
88.15
87.22
264.68
148.64
136.26
119.29
132.71
73.84
69.01
61.40
46.39
58.28
m.v.
50.50
36.06
42.39
26.67
33.67
28.99
34.35
32.21
33.69
46.87
48.36
68.70
177.25
62.43
242.83
196.40
145.86
146.60
92.42
71.81
109.84
66.31
63.10
61.11
MN
1.97
1.38
1.72
m.v.
1.90
1.85
1.65
1.80
1.77
1.78
1.80
1.76
2.01
1.90
2.90
2.09
2.04
2.17
2.46
2.69
2.79
3.04
3.18
3.27
3.83
3.72
3.61
3.40
3.09
1.20
2.72
3.40
2.03
1.98
2.62
2.18
2.12
2.43
2.59
2.80
2.71
MNLD
2.41
3.77
3.22
m.v.
2.62
2.65
6.07
5.15
4.16
3.54
3.79
3.09
3.23
2.48
2.05
2.03
1.83
1.83
1.74
1.52
1.18
1.19
1.23
1.11
1.25
1.23
1.64
2.45
2.65
2.87
9.99
15.29
7.82
6.58
6.86
4.20
3.04
2.90
2.64
2.76
2.40
AL
4.83
9.39
3.81
m.v.
3.68
4.44
3.64
3.81
4.02
4.06
4.23
4.20
3.93
4.30
5.06
5.62
5.26
m.v.
6.15
6.96
8.03
8.40
9.57
m.v.
9.40
12.90
13.50
13.56
9.05
4.26
5.30
6.15
4.25
4.28
4.92
5.90
5.10
6.04
6.92
7.39
9.40
ALLD
4.83
9.39
7.13
m.v.
5.07
6.35
13.38
10.89
9.44
8.07
8.91
7.38
6.31
6.27
5.46
5.46
4.71
m.v.
4.35
3.93
3.40
3.29
3.70
m.v.
3.06
4.26
6.14
9.79
7.77
10.20
19.46
27.66
16.37
14.22
12.88
11.36
7.32
7.21
7.06
7.29
8.33
B-8

-------
                                                            Statistical Analysis of Abandoned Mine Drainage in the Assessment of Pollution Load
ROW DATE  Dischg      pH    TEMP    ACID  ACIDLD     ALK   ALKLD     TFE   TFELD      FE    FELD     SO4   SO4LD      CA    CALD      MG    MGLD      MN    MNLD      AL    ALLD
max         1.84    3.94   10.70  168.00  575.59    0.00    0.00    2.80    9.44    1.40    3.15  268.00  930.84   79.00  196.38  110.00  264.68    6.50   15.29   13.56   27.66
min         0.12    3.00    6.10   62.00   36.60    0.00    0.00    0.60    0.20    0.00    0.00   86.00   44.89   39.00   18.00   17.00   22.28    1.20    0.82    0.71    0.66
Avg         0.53    3.28    8.47   96.99  115.40    0.00    0.00    1.26    1.60    0.42    0.49  171.80  210.83   54.29   65.17   67.68   76.74    2.71    3.09    6.45    7.17
Med         0.40    3.28    8.60   96.00   88.15    0.00    0.00    1.20    1.29    0.30    0.36  166.50  175.54   54.00   53.24   65.00   61.11    2.45    2.44    5.85    5.84
                                                                                                                               B-9

-------

-------
   APPENDIX - C
Clarion Discharge Data

-------

-------
CLARION
ROW
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40

DATE
2/16/82
3/17/82
3/30/82
4/12/82
4/22/82
5/11/82
5/26/82
6/9/82
6/23/82
7/7/82
7/21/82
8/3/82
8/30/82
9/17/82
9/27/82
10/12/82
10/26/82
11/8/82
11/24/82
12/28/82
1/12/83
1/26/83
2/9/83
2/23/83
3/9/83
4/5/83
4/19/83
5/10/83
5/24/83
6/15/83
7/6/83
7/19/83
8/10/83
8/23/83
9/7/83
10/13/83
10/26/83
11/9/83
11/21/83
12/8/83

PH
3.38
3.51
2.90
3.13
2.83
2.95
2.77
2.78
3.19
2.74
2.74
2.83
2.95
3.22
3.20
3.11
3.17
3.14
3.27
3.15
3.16
3.12
3.16
3.04
3.04
3.12
3.01
2.94
2.95
2.67
2.78
2.78
3.03
2.88
2.92
3.03
2.76
3.00
3.25
3.29

DISCH
-
-
10.32
40.84
9.43
3.59
9.43
5.39
3.60
0.20
3.60
12.21
0.45
0.45
6.70
1.35
0.20
3.60
14.80
22.00
9.40
8.50
3.60
5.40
4.00
8.50
9.40
7.63
20.20
0.17
4.04
2.70
5.40
0.10
3.60
5.40
6.70
1.30
12.60
14.60

ACID
189
139
1102
607
1038
719
644
793
943
1197
951
708
819
480
317
411
441
401
299
495
910
985
968
744
850
630
598
990
795
1247
1383
1205
954
985
469
663
636
584
201
570

TOT. FE
14.0
9.8
33.5
38.6
83.2
73.9
85.7
85.0
94.6
108.0
97.1
111.0
90.0
72.0
32.5
46.4
60.0
16.0
45.0
85.5
182.0
185.0
174.0
129.0
129.0
133.0
137.0
165.0
131.0
136.0
215.0
257.0
182.0
174.0
98.4
102.0
86.4
112.0
73.5
115.0

FFE
9.0
8.7
32.1
27.8
30.5
17.1
32.6
32.4
55.8
50.6
41.9
61.0
29.5
60.6
0.9
8.6
22.5
12.0
30.6
56.5
112.0
101.0
116.0
42.0
34.0
43.0
86.0
69.0
86.0
66.7
92.3
105.0
93.0
119.0
43.3
47.0
25.0
26.5
50.0
89.0

SO4
344
296
1611
1002
1401
1681
1406
1748
1350
1935
2064
1656
1436
1114
880
1294
2861
1068
797
1462
2436
2203
1844
1355
1619
1488
1870
2397
1934
2134
3241
2216
2275
2682
1175
1652
1642
1623
859
1193
                                                                      C-1

-------
Appendix C
ROW
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
DATE
12/20/83
1/5/84
1/27/84
2/15/84
2/28/84
3/19/84
3/27/84
4/12/84
4/25/84
5/8/84
5/22/84
6/6/84
6/27/84
7/11/84
8/8/84
8/21/84
9/5/84
10/3/84
10/16/84
11/1/84
11/14/84
11/28/84
12/12/84
12/27/84
1/10/85
2/27/85
3/12/85
3/24/85
4/10/85
4/23/85
4/24/85
5/2/85
5/22/85
6/4/85
7/1/85
7/18/85
7/31/85
8/13/85
8/28/85
9/10/85
PH
3.18
3.26
5.23
3.27
3.24
3.16
3.04
2.90
3.06
3.03
3.09
3.29
5.68
5.28
4.66
5.58
6.03
4.52
4.05
4.49
4.70
5.76
4.79
5.23
4.66
4.60
5.11
4.23
3.75
3.03
3.75
3.16
3.06
4.35
5.25
4.78
5.33
6.18
6.43
6.12
DISCH
9.40
12.21
-
-
-
-
-
-
-
-
6.30
6.70
36.30
30.50
20.20
-
-
-
4.00
5.40
14.80
83.93
2.69
36.36
-
-
-
-
28.56
5.39
28.56
6.23
4.13
6.83
172.00
0.39
5.37
0.05
0.09
14.82
ACID
992
954
528
451
746
518
588
682
386
487
359
234
255
147
416
182
19
265
176
160
175
46
160
50
378
289
60
306
232
444
23.2
499
1546
489
147
121
164
2
1
17
TOT. FE
194.0
153.0
86.0
70.0
108.0
86.0
91.0
90.0
24.0
67.6
36.5
54.0
75.0
45.0
134.0
75.0
38.5
58.0
36.0
27.0
43.0
25.0
48.6
20.0
106.0
69.0
22.0
46.0
36.0
28.0
36.0
78.5
62.0
50.0
12.5
11.5
133.0
54.0
102.0
8.7
FFE
136.0
131.0
32.0
22.0
66.5
47.0
32.0
27.5
4.5
17.0
25.5
39.0
73.5
40.0
132.0
55.0
33.5
46.5
17.0
21.0
36.5
10.0
41.4
15.6
97.0
64.0
9.0
46.0
32.2
32.0
32.2
59.0
27.0
23.5
12.5
9.0
33.9
11.0
12.0
7.8
SO4
2000
2002
1316
1178
1757
1372
1544
1722
839
1250
708
1367
1837
980
1822
2064
1753
1501
1672
1572
1624
556
721
426
1701
1082
356
1322
975
1881
975
1843
1280
1960
17056
1580
1735
1313
941
617
C-2

-------
ROW
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96




DATE
9/24/85
10/22/85
11/7/85
11/19/85
12/4/85
12/18/85
1/9/86
1/21/86
2/3/86
2/18/86
3/4/86
4/1/86
5/5/86
6/2/86
7/7/86
8/4/86
max
min
Avg
Med
PH
5.35
4.88
4.61
4.02
3.58
3.68
3.47
3.94
3.69
4.32
3.40
2.87
2.90
3.00
3.00
2.90
6.43
2.67
3.70
3.20
DISCH
1.83
12.21
0.95
9.13
10.37
12.48
4.13
7.63
14.82
50.54
21.79
5.40
3.08
2.62
1.21
12.21
172.00
0.05
12.57
6.70
ACID
109
165
298
327
419
592
1080
343
370
80
516
648
678
508
574
478
1546.00
1.00
522.38
483.50
TOT. FE
22.7
29.0
74.0
76.5
84.0
116.0
146.0
62.0
70.0
20.6
85.0
118.0
74.1
66.5
30.5
96.7
257.00
8.70
82.40
75.00
FFE
19.8
27.0
67.4
73.5
75.5
114.0
143.0
62.0
64.0
8.4
62.0
104.0
32.0
30.1
12.2
73.0
143.00
0.90
48.38
37.75
SO4
1528
1192
1029
1284
1566
1948
2495
1258
1720
364
1501
2180
2100
2210
2130
2120
3241.00
296.00
1528.30
1569.00
                                                                      C-3

-------

-------
   APPENDIX - D
Ernest Discharge Data

-------

-------
ERNEST
ROW
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
DATE
3/19/81
3/27/81
3/30/81
7/30/81
8/7/81
8/13/81
8/21/81
8/26/81
9/4/81
9/18/81
9/30/81
10/9/81
10/21/81
11/4/81
11/16/81
11/30/81
1/12/82
1/19/82
2/1/82
2/8/82
2/23/82
3/2/82
3/17/82
3/30/82
5/3/82
5/10/82
5/17/82
5/24/82
6/8/82
6/21/82
6/28/82
7/12/82
7/19/82
7/26/82
8/2/82
8/9/82
8/16/82
8/23/82
8/31/82
1/3/83
DAYS
0
8
11
133
141
147
155
160
169
183
195
204
216
230
242
256
299
306
319
326
341
348
363
376
410
417
424
431
446
459
466
480
487
494
501
508
515
522
530
655
PH
2.63
2.71
2.67
2.50
2.55
2.45
2.48
2.35
2.62
2.53
2.36
2.48
2.42
2.46
2.36
2.54
2.69
2.58
2.80
2.43
2.62
2.60
2.54
2.67
2.60
2.63
2.63
2.54
2.58
2.55
2.49
2.50
2.53
2.48
2.42
2.30
2.26
2.24
2.27
2.30
FLOW
55
288
229
42
39
28
17
12
340
47
20
15
12
76
32
47
56
47
113
199
189
208
340
288
94
81
51
42
113
42
51
39
28
17
15
6
3.5
3.5
3
3
ACID
5219
5111
4986
3082
3564
3716
4070
4314
2165
3819
4293
4267
4900
4300
4698
4375
3592
3858
2470
3940
2892
3004
3044
2757
2809
2665
3043
2855
2292
2499
2349
3455
3567
4456
4591
4589
4639
4670
5606
4306
ACIDLD
3444
17663
13701
1553
1668
1248
830
621
8833
2153
1030
768
706
3922
1804
2468
2414
2178
3349
9409
6559
7497
12419
9528
3169
2590
1962
1439
3108
1259
1438
1617
1199
909
826
330
195
196
202
155
FE
461
447
444
360
489
459
500
619
290
391
519
584
674
530
618
486
518
653
366
538
463
525
594
639
705
694
660
615
278
377.2
362
540
596
694
693
624
495
540
540
330
FELD
304
1544
1220
181
229
154
102
89
1183
221
125
105
97
483
237
274
348
368
499
1285
1050
1310
2424
2208
795
675
403
310
377
190
222
253
200
142
125
45
21
23
19
12
FFE
452
440
428
348
320
357
410
419
177
184
399
339
450
364
387
267
396
408
330
520
335
438
485
556
642
634
576
518
160
221
226
319
393
507
470
268
118
107
132
89
SO4
5410
5135
5174
4445
3552
3981
4742
3716
3107
3677
2434
5343
4942
2304
4842
4977
3963
4031
1873
2804
2573
3043
3025
2956
2951
2791
3161
4002
2797
2476
2834
3794
4003
4593
5125
4395
5208
5803
5114
4617
SO4LD
3571
17746
14218
2240
1662
1338
967
525
12676
2074
584
962
712
2101
1859
2807
2663
2273
2540
6696
5836
7595
12342
10216
3329
2713
1935
2017
3793
1248
1734
1776
1345
937
923
316
219
244
184
166
                                                                      D-1

-------
Appendix D
ROW
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
DATE
1/10/83
1/17/83
2/14/83
2/22/83
2/28/83
3/8/83
3/14/83
3/22/83
3/28/83
4/4/83
4/11/83
4/25/83
5/4/83
5/9/83
5/18/83
5/23/83
5/31/83
6/7/83
6/15/83
6/20/83
6/27/83
7/6/83
7/11/83
7/18/83
7/25/83
8/1/83
8/8/83
8/15/83
8/22/83
8/29/83
9/6/83
9/12/83
9/19/83
9/26/83
10/3/83
10/11/83
10/17/83
10/24/83
10/31/83
11/14/83
11/21/83
DAYS
662
669
697
705
711
719
725
733
739
746
753
767
776
781
790
795
803
810
818
823
830
839
844
851
858
865
872
879
886
893
901
907
914
921
928
936
942
949
956
970
977
PH
2.40
2.30
2.40
2.50
2.30
2.50
2.40
2.40
2.50
2.50
2.50
2.50
2.50
2.50
2.60
2.60
2.60
2.60
2.60
2.60
2.60
2.60
2.50
2.50
3.10
2.40
2.10
2.30
2.30
2.20
2.20
2.30
2.30
2.30
2.30
2.30
2.30
2.30
2.30
2.30
2.40
FLOW
7
4
8
8
7
15
8
26
75
56
88
136
179
199
179
252
240
161
152
113
100
56
35
17
17
8
4
4
3
7
4
8
4
4
3
6
8
4
4
6
6
ACID
3247
4188
3105
3455
3620
3545
1744
2356
1273
1478
1620
3109
2872
3293
3866
2690
3028
3300
3060
2980
3195
3893
3350
3706
3559
4045
4368
4956
4293
4619
4881
5113
4820
4749
4953
5367
4978
4891
4295
3964
3616
ACIDLD
273
201
298
332
305
639
168
736
1147
995
1713
5081
6178
7875
8316
8146
8733
6384
5589
4046
3839
2680
1409
758
727
4045
210
238
155
389
235
492
232
200
179
387
479
235
206
286
261
FE
276
308
314
353
412
352
400
264
306
354
381
476
638
780
895
705
740
820
808
845
659
760
702
630
536
389
456
478
483
462
515
561
510
527
530
540
528
415
430
395
365
FELD
23
15
30
34
35
63
38
82
276
238
403
778
1372
1865
1925
2135
2134
1586
1476
1147
792
511
295
129
109
54
22
23
17
89
25
54
25
22
19
39
51
20
21
28
26
FFE
44
30
152
156
182
150
168
122
121
142
149
260
448
478
792
600
700
710
659
659
579
536
468
435
311
334
70
80
688
79
79
848
820
700
580
760
860
315
412
183
210
SO4
3570
4256
3159
3666
4298
3630
2650
2860
2348
1619
2369
3104
3570
3745
4223
2820
4321
3867
2909
3260
3338
3681
3256
4088
3682
4288
5621
5283
6115
5347
5389
5141
5982
6014
5932
5738
5657
4896
4953
4459
4812
SO4LD
300
205
304
352
362
654
255
894
2116
1089
2505
5073
7679
8955
9084
8539
12461
7981
5313
4427
4011
2477
1369
835
752
412
270
254
220
450
259
494
288
253
214
414
544
235
238
321
347
D-2

-------
ROW
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
DATE
11/29/83
12/5/83
12/13/83
12/19/83
12/27/83
1/3/84
1/17/84
1/23/84
1/30/84
2/6/84
2/13/84
2/21/84
2/27/84
3/5/84
3/12/84
3/19/84
3/26/84
4/2/84
4/12/84
4/16/84
4/23/84
4/30/84
5/7/84
5/14/84
5/21/84
5/29/84
6/4/84
6/11/84
6/19/84
6/25/84
7/2/84
7/9/84
7/16/84
7/30/84
8/6/84
8/13/84
8/20/84
8/27/84
9/4/84
9/10/84
DAYS
985
991
999
1005
1013
1020
1034
1040
1047
1054
1061
1069
1075
1082
1089
1096
1103
1110
1120
1124
1131
1138
1145
1152
1159
1167
1173
1180
1188
1194
1201
1208
1215
1229
1236
1243
1250
1257
1265
1271
PH
2.40
2.70
2.60
2.50
2.50
2.60
2.60
2.60
2.60
2.60
2.60
2.60
2.60
2.70
2.50
2.60
2.70
2.50
2.60
2.60
2.60
2.70
2.70
2.70
2.60
2.80
2.60
2.60
2.80
2.60
2.60
2.60
2.60
2.60
2.70
2.90
2.60
2.50
2.50
2.50
FLOW
23
81
129
100
12
40
85
50
100
81
80
190
50
170
152
170
189
200
288
263
288
300
275
300
251
350
313
198
345
320
189
199
198
128
100
100
350
200
88
76
ACID
2077
1130
2664
2849
3281
3916
5084
4954
4685
4799
3431
3317
3250
2619
3255
3190
3633
3336
3259
3176
2601
2600
2947
3088
2709
1905
2713
2349
1369
3761
3241
3110
3014
3480
3071
778
3236
2815
3611
3521
ACIDLD
574
110
4130
3424
473
1882
5193
2977
5630
4671
3298
7573
2977
5350
5945
6516
8251
8027
11279
10037
9001
9373
9739
11132
8171
8012
10204
5589
5676
11462
7361
7437
7171
5353
3690
935
350
6765
3818
3216
FE
216
93
272
278
404
778
855
815
820
1120
700
530
520
544
630
815
840
720
685
675
625
660
725
625
720
455
540
635
245
395
454
596
605
695
653
94
425
468
490
460
FELD
60
91
422
334
58
373
873
490
985
1090
673
1210
490
1111
1150
1665
1908
1730
2371
2133
2165
2379
2396
2758
2172
1914
2031
1511
1016
1519
1238
1425
1439
1069
785
113
1787
1125
518
420
FFE
240
190
816
1000
1760
332
798
810
820
765
480
335
290
376
430
550
620
570
455
490
450
505
590
625
455
400
425
405
180
337
375
436
520
590
575
56
270
298
300
432
SO4
2329
1139
2686
3080
3658
4724
5685
5211
4958
4869
3491
3831
3532
2928
4645
4112
4030
3416
3266
3243
2872
2880
3135
3490
3112
1938
3203
2710
1538
3933
3285
3349
3399
3680
3481
785
3513
3562
3821
4681
SO4LD
644
1109
4164
3701
527
2271
5807
3131
5958
4739
3356
8747
3131
5981
8484
8400
9153
8210
11303
10249
9939
10382
10360
12581
9386
8151
12047
6448
6376
15124
7461
8008
8087
5660
4183
943
14775
8561
4041
4275
                                                                      D-3

-------
ROW
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
DATE
9/17/84
9/24/84
10/1/84
10/9/84
10/16/84
10/22/84
10/29/84
11/7/84
11/13/84
11/19/84
11/29/84
12/5/84
12/10/84
12/17/84
1/3/85
1/7/85
1/17/85
2/26/85
3/4/85
3/13/85
3/19/85
3/26/85
4/1/85
4/8/85
4/16/85
4/22/85
4/29/85
5/14/85
5/25/85
5/29/85
6/4/85
6/11/85
6/18/85
6/24/85
7/2/85
7/8/85
7/17/85
7/23/85
7/30/85
8/7/85
DAYS
1278
1285
1292
1300
1307
1313
1320
1329
1335
1341
1351
1358
1363
1370
1387
1391
1401
1441
1447
1456
1462
1469
1475
1482
1490
1496
1503
1518
1529
1533
1539
1546
1553
1559
1567
1573
1582
1588
1595
1603
PH
2.50
2.60
2.50
2.50
2.50
2.40
2.40
2.30
2.30
2.30
2.40
2.40
2.50
3.10
2.50
2.50
2.40
2.70
2.60
2.50
2.60
2.50
2.70
2.50
2.60
2.50
2.70
2.60
2.60
2.60
2.60
2.50
2.50
2.40
2.40
2.40
2.30
2.30
2.30
2.50
FLOW
60
65
46
30
10
15
4
4
8
8
13
6
15
20
161
100
88
51
66
46
3003
3188
470
251
161
179
199
128
152
51
56
42
7
12
17
6
10
7
10
2
ACID
4054
4057
4596
4467
4471
3552
4263
3909
3298
3944
3315
4370
3685
2599
3017
3301
4121
1790
3364
3462
1697
16401
2045
3534
3574
4465
3956
3290
3194
3440
3042
2884
3427
3787
3635
3966
3880
3976
4076
4626
ACIDLD
2923
3169
2541
1074
537
640
205
188
317
379
518
315
664
625
5841
3967
4359
1097
2669
1914
420
375
11553
10662
6916
9607
9463
5062
5836
2109
2048
1456
288
546
743
286
466
335
490
111
FE
705
20
890
900
810
470
516
440
396
374
414
322
310
326
454
568
684
310
418
390
237
1929
306
536
650
725
815
863
751
675
575
470
540
555
495
425
335
356
395
384
FELD
508
797
492
216
97
85
25
21
38
36
65
23
56
78
879
683
724
190
332
216
168
184
1729
1617
1258
1560
1949
1328
1372
414
387
237
45
80
101
31
24
30
47
10
FFE
525
530
660
670
510
220
506
63
100
86
117
82
86
96
246
334
422
170
218
210
95
182
182
370
505
510
736
638
551
480
425
270
175
230
162
134
36
52
17
8
SO4
4270
4738
4839
4813
5206
4092
5339
4583
4391
4362
3825
4424
3598
3136
3586
3459
4293
1930
3834
3576
3585
3409
2116
3671
3745
4625
4333
3666
3205
3511
3143
3010
3814
4950
3925
4532
4321
4660
5267
4864
SO4LD
3079
3701
2675
1157
626
738
257
220
422
419
598
319
649
754
6942
4157
4541
1183
3042
1977
2025
17538
11954
11075
7247
9951
10364
5640
5856
2151
2116
1520
321
714
802
327
519
392
634
117
D-4

-------
ROW
162
163
164
165
166
167
168
169
170
171
172
173
174




DATE
8/27/85
9/5/85
9/12/85
9/18/85
9/25/85
10/16/85
10/30/85
11/5/85
11/13/85
11/20/85
12/4/85
12/10/85
12/17/85




DAYS
1623
1632
1639
1645
1652
1673
1687
1693
1701
1708
1722
1728
1735
max
min
Avg
Med
PH
2.30
2.30
2.30
2.30
2.30
2.50
2.50
2.40
2.60
2.50
2.40
2.60
2.50
3.10
2.10
2.51
2.50
FLOW
3
3
3
4
3
2
2
4
35
94
251
81
313
3188
2
127.23
51
ACID
4657
4040
5141
4598
4441
4658
4805
4350
1714
1781
3136
2947
3166
16401
778
3621.0
3539.5
ACIDLD
168
146
185
182
160
112
116
209
721
2012
9461
2869
11911
17663
111
3367
1843
FE
406
420
428
398
378
410
403
410
116
113
385
376
584
1929
20
527
513
FELD
15
15
15
16
14
10
10
20
49
128
1162
366
2197
2758
10
627
275
FFE
11
8
14
11
13
27
20
13
20
9
163
146
400
1760
8
365
361
SO4
5262
5356
5334
5015
4649
5812
5676
4857
1870
1962
3313
142
3689
6115
142
3887
3804
SO4LD
190
193
192
199
168
140
136
117
787
2217
9995
3062
13879
17746
117
3840
2109
                                                                      D-5

-------

-------
   APPENDIX - E
Fisher Discharge Data

-------

-------
FISHER
ROW
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40

DATE
11/27/81
3/11/82
5/19/82
6/9/82
6/22/82
7/29/82
8/20/82
8/26/82
9/18/82
10/9/82
11/13/82
12/11/82
1/22/83
2/14/83
5/18/83
8/17/83
8/24/83
11/22/83
12/15/83
12/17/83
1/28/84
3/2/84
3/31/84
4/21/84
5/26/84
6/27/84
7/25/84
8/21/84
9/6/84
9/21/84
10/3/84
10/16/84
10/23/84
10/24/84
10/29/84
11/23/84
12/18/84
1/26/85
2/23/85
3/11/85

DAYS
0
104
173
194
207
244
266
272
295
316
351
379
421
444
537
628
635
725
748
750
792
826
855
876
911
943
971
998
1014
1029
1041
1054
1061
1062
1067
1092
1117
1156
1184
1200

FLOW
50.0
75.0
130.0
273.0
87.0
54.0
30.0
30.0
20.0
30.0
18.1
36.8
30.0
64.0
100.0
44.9
33.0
115.0
614.0
273.0
54.0
100.0
204.0
483.0
100.0
87.0
44.9
122.0
69.0
37.0
-
-
21.0
-
30.0
18.0
130.0
27.0
81.0
64.0

ACID
127.0
209.0
210.0
86.9
105.0
174.0
119.0
125.0
105.0
127.0
138.0
120.0
112.0
108.0
98.6
126.0
107.0
144.0
107.0
93.1
111.0
101.0
146.0
237.0
118.0
182.0
149.0
149.0
141.0
83.6
70.4
68.3
64.2
80.4
80.4
87.7
55.1
41.8
51.0
28.6

SO4
490
1100
550
248
440
290
280
300
370
370
440
410
310
210
220
160
240
430
176
120
260
220
410
1200
280
190
190
120
300
49
350
410
410
460
440
550
320
230
272
292

FE
4.00
4.40
7.00
1.06
3.50
3.84
3.32
4.12
4.50
5.30
4.80
2.83
3.10
2.21
2.12
2.56
0.79
2.23
0.99
1.65
2.46
1.13
1.46
4.30
1.40
1.62
1.49
1.34
1.42
2.01
1.81
1.22
1.09
1.14
1.56
2.37
0.86
1.01
1.21
0.98

MN
11.70
25.10
33.00
6.84
9.44
8.94
9.44
9.44
10.50
9.44
12.10
9.47
8.03
6.99
6.08
7.19
8.15
9.68
4.98
3.78
7.58
5.91
11.80
25.40
8.81
8.10
7.72
7.42
9.49
10.90
12.10
12.60
12.60
12.70
15.40
15.90
10.40
6.08
7.00
7.08

AL
-
-
8.00
20.50
3.50
29.50
6.75
8.75
6.30
6.00
0.32
9.20
5.10
1.09
0.50
3.23
4.27
2.75
0.50
0.50
0.50
1.59
2.41
3.47
2.52
1.98
2.24
1.78
2.78
3.10
4.02
4.46
3.77
3.77
4.45
7.56
4.05
1.92
3.58
2.61
                                                                      E-1

-------
Appendix E
ROW
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
DATE
3/20/85
3/29/85
4/10/85
4/17/85
4/24/85
5/1/85
5/10/85
5/16/85
5/22/85
5/28/85
6/1/85
6/26/85
7/24/85
8/19/85
9/21/85
10/26/85
11/18/85
11/23/85
12/23/85
1/18/86
2/17/86
3/22/86
4/10/86
5/17/86
6/10/86
7/15/86
8/12/86
9/13/86
10/10/86
11/15/86
12/13/86
1/17/87
2/14/87
3/14/87
4/11/87
5/9/87
6/13/87
7/14/87
8/12/87
DAYS
1209
1218
1230
1237
1244
1251
1260
1266
1272
1278
1282
1307
1335
1361
1394
1429
1452
1457
1487
1513
1543
1576
1595
1632
1656
1691
1719
1751
1778
1814
1842
1877
1905
1933
1961
1989
2024
2055
2084
FLOW
69.0
107.0
122.0
73.0
45.0
30.0
69.0
-
18.0
69.0
174.0
45.0
54.0
18.0
24.0
9.7
299.0
87.0
64.0
18.0
54.0
448.0
75.0
30.0
33.0
18.0
18.0
14.0
45.0
100.0
130.0
204.0
75.0
75.0
130.0
64.0
23.0
45.0
0.0
ACID
53.0
42.8
53.0
67.2
57.3
63.2
45.4
38.0
51.4
79.0
86.9
61.2
79.0
73.1
57.1
57.1
51.2
49.2
47.3
53.2
53.2
27.6
49.3
62.0
35.3
58.8
56.8
58.8
54.9
45.1
47.0
60.8
51.0
51.0
34.2
48.9
39.1
37.1
34.4
SO4
272
280
256
280
300
300
290
342
320
332
280
240
310
450
380
370
332
300
290
270
260
800
830
410
510
430
710
424
368
300
324
810
950
750
368
460
444
390
356
FE
1.01
1.57
1.11
1.40
1.42
1.03
1.49
0.90
1.33
1.25
1.24
1.90
1.33
1.20
0.82
0.72
0.90
1.02
1.17
1.18
0.86
0.49
0.28
0.32
0.05
0.26
0.44
0.25
0.20
0.54
0.49
0.25
0.21
0.42
0.31
0.40
0.29
0.28
0.11
MN
7.35
8.42
7.36
8.77
8.89
9.30
9.65
9.16
10.10
11.70
12.00
10.60
12.70
13.60
11.70
9.06
9.36
7.82
9.19
9.55
4.95
10.20
11.40
7.37
7.82
7.46
5.85
6.77
5.79
3.99
4.63
8.24
10.10
8.13
4.21
5.57
4.24
4.78
4.47
AL
2.29
2.96
2.89
3.59
3.23
3.16
2.56
1.57
4.27
3.77
3.82
2.47
3.95
4.49
2.98
2.90
3.35
4.44
3.94
5.87
2.15
4.51
4.99
6.05
4.60
13.30
6.50
5.79
6.08
1.78
0.50
0.50
4.68
5.12
1.00
1.95
3.25
3.77
1.68
         FLOW    ACID     SO4      FE      MN      AL
max     614.0   237.0  1200.0    7.00   33.00   29.50
E-2

-------
ROW DATE DAYS    FLOW    ACID     SO4      FE      MN      AL
min               0.0    27.6    49.0    0.05    3.78    0.32
Avg              90.1    84.3   382.2    1.60    9.46    4.23
Med              64.0    67.2   324.0    1.21    8.94    3.50
                                                                      E-3

-------

-------
    APPENDIX - F
Markson Discharge Data

-------

-------

ROW
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
MARKSON
DATE
7/30/81
8/6/81
8/13/81
9/10/81
9/17/81
9/24/81
10/1/81
10/8/81
10/15/81
10/22/81
10/29/81
11/5/81
11/12/81
11/19/81
11/25/81
12/4/81
12/10/81
12/17/81
12/24/81
12/31/81
1/7/82
1/14/82
1/21/82
1/28/82
2/4/82
2/11/82
2/18/82
2/25/82
3/4/82
3/11/82
3/19/82
3/25/82
4/1/82
4/15/82
4/22/82
4/30/82
5/6/82
5/14/82
5/21/82

DAYS Flow
0
7
14
42
49
56
63
70
77
84
91
98
105
112
118
127
133
140
147
154
161
168
175
182
189
196
203
210
217
224
232
238
245
259
266
274
280
288
295

PH
3.40
3.28
3.35
3.22
3.00
3.30
3.30
3.20
3.30
3.20
3.10
3.10
3.20
3.20
3.30
3.30
3.60
4.20
3.40
3.60
3.70
3.50
3.30
3.40
3.40
3.30
3.20
3.40
3.40
3.20
3.20
3.20
3.20
3.10
3.10
3.10
3.10
3.20
3.20

Acidity
136
144
30
129
130
114
214
94
104
114
114
120
104
116
110
62
280
282
112
106
136
128
130
106
120
120
100
150
116
100
100
132
112
98
114
120
114
132
128

FE
38.900
43.500
43.900
50.000
43.900
11.200
49.100
46.400
46.500
46.700
49.500
49.700
48.300
50.500
50.300
52.400
54.000
51.900
52.200
53.700
24.900
27.900
49.600
46.700
45.800
42.900
46.200
43.700
41.900
42.000
35.600
33.200
32.600
26.000
28.200
13.330
27.360
26.400
28.700

MN
-
-
-
-
5.320
5.320
5.320
5.300
5.230
5.240
5.900
5.310
5.270
5.130
5.230
5.610
5.580
5.230
5.350
5.270
2.740
2.750
5.420
4.980
5.180
5.090
5.600
5.020
5.030
5.190
4.600
4.500
4.530
4.210
4.290
4.520
4.490
4.370
4.630

AL
-
-
-
-
2.580
2.850
3.050
2.170
1.900
2.300
2.420
2.260
2.160
2.190
1.990
2.080
2.230
5.180
1.980
1.990
1.270
1.080
2.290
2.190
2.500
2.660
3.150
2.720
2.590
2.740
2.600
2.520
2.580
2.920
2.360
3.050
3.330
2.780
2.330

SO4
407
390
353
398
325
360
400
370
365
345
365
360
350
330
400
350
360
320
360
360
365
360
345
350
385
330
335
370
350
350
330
340
315
270
280
270
275
295
270

FE++
32.30
40.30
41.60
45.30
48.40
11.00
51.00
58.00
61.00
52.00
62.62
45.39
54.54
57.57
55.55
60.60
62.62
63.63
57.57
59.59
18.36
31.11
51.51
42.30
45.50
33.66
43.35
33.66
41.31
40.29
35.19
33.00
30.60
20.16
19.95
13.00
18.06
20.16
26.52

AL Load
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
                                                                       F-1

-------
Appendix F
ROW
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
DATE
5/26/82
6/4/82
6/10/82
6/18/82
6/24/82
7/1/82
7/8/82
7/14/82
7/22/82
7/29/82
8/5/82
8/12/82
8/19/82
8/26/82
9/2/82
9/9/82
9/16/82
9/23/82
10/7/82
10/14/82
10/21/82
10/28/82
11/4/82
11/12/82
11/19/82
11/26/82
12/3/82
12/10/82
12/17/82
12/23/82
12/28/82
1/6/83
1/13/83
1/20/83
1/27/83
2/3/83
2/10/83
2/17/83
2/24/83
3/3/83
DAYS Flow
300
309
315
323
329
336
343
349
357
364
371
378
385
392
399
406
413
420
434
441
448
455
462
470
477
484
491
498
505
511
516
525
532
539
546
553
560
567
574
581
PH
3.20
3.20
3.10
3.10
3.10
3.20
3.10
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.30
3.40
3.20
3.50
3.30
3.20
3.50
3.10
3.80
3.50
3.20
3.20
3.30
3.30
3.30
3.20
3.40
3.30
3.10
3.20
3.10
Acidity
104
100
118
118
122
118
124
130
120
112
104
122
90
106
122
100
112
120
128
124
-
118
120
120
128
130
116
120
110
126
122
254
118
148
120
136
120
106
110
120
FE
8.900
36.500
31.200
26.200
26.400
24.900
28.300
28.800
30.400
35.900
39.100
45.600
35.900
10.000
40.700
34.400
42.900
50.700
47.700
58.900
61.900
57.000
60.200
59.500
22.400
56.600
54.300
48.800
63.500
59.100
54.500
55.900
53.960
53.600
55.100
52.000
32.400
31.900
31.500
30.800
MN
4.760
4.770
4.980
4.390
4.590
4.350
4.540
4.830
4.630
4.680
4.890
5.440
4.920
5.420
5.460
4.750
4.890
5.410
5.290
4.880
5.090
5.240
5.380
5.780
5.970
5.650
4.730
5.580
5.830
5.640
5.490
5.270
5.230
5.320
5.160
5.390
4.810
4.730
4.630
4.810
AL
3.100
2.580
3.290
2.880
3.190
3.160
-
3.520
3.300
2.890
3.080
2.670
2.540
2.470
2.360
2.280
2.420
1.850
2.230
2.200
2.320
2.060
1.910
1.020
1.900
2.120
2.200
1.940
2.030
2.250
2.280
2.030
1.990
2.220
2.250
2.140
2.580
2.730
2.770
0.770
SO4
310
320
325
305
305
375
315
295
300
296
305
316
320
315
315
345
325
315
365
375
335
315
320
345
345
345
345
345
335
345
345
335
315
290
325
310
360
275
255
305
FE++
3.41
27.00
20.58
16.59
18.69
20.37
24.48
28.40
-
30.60
35.70
43.35
35.70
8.67
39.90
31.00
42.00
43.35
46.41
53.00
55.00
55.00
55.00
56.00
60.00
60.00
54.00
48.00
57.00
50.00
50.00
47.00
50.00
39.00
49.00
50.00
30.00
30.00
31.00
30.00
AL Load
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
F-2

-------
ROW
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
DATE
3/10/83
3/17/83
3/25/83
3/31/83
4/7/83
4/14/83
4/21/83
4/28/83
5/5/83
5/12/83
5/19/83
5/26/83
6/2/83
6/9/83
6/16/83
6/23/83
6/30/83
7/7/83
7/14/83
7/20/83
7/28/83
8/4/83
8/18/83
8/25/83
9/8/83
9/15/83
9/22/83
9/29/83
10/6/83
10/13/83
10/20/83
10/27/83
11/3/83
11/10/83
11/17/83
11/23/83
12/1/83
12/8/83
12/14/83
12/22/83
12/29/83
DAYS Flow
588
595
603
609
616
623
630
637
644
651
658
665
672
679
686
693
700
707
714
720
728
735
749
756
770
777
784
791
798
805
812
819
826
833
840
846
854
861
867
875
882
PH
3.20
3.10
3.30
3.30
3.30
3.10
3.10
3.10
3.10
3.10
3.10
4.40
3.10
3.00
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.20
3.10
3.10
3.10
3.10
3.20
3.10
3.10
3.10
3.20
3.20
3.20
3.20
3.20
3.10
3.20
3.20
3.20
3.30
3.40
Acidity
120
110
102
100
106
136
104
112
128
120
112
134
136
115
114
210
342
116
108
116
154
96
98
116
172
108
156
198
176
180
170
158
124
176
134
134
102
114
104
114
108
FE
36.900
33.800
32.300
30.400
23.600
23.600
19.000
19.000
19.400
19.000
16.700
5.700
17.400
19.600
18.400
17.200
23.800
12.900
15.020
27.700
28.900
6.100
21.900
31.700
35.300
37.100
38.000
18.200
39.900
34.400
34.960
40.900
38.400
38.200
40.900
42.370
34.580
29.640
27.900
21.090
17.350
MN
4.980
4.860
5.000
4.840
4.240
5.130
6.870
6.920
7.630
7.480
6.530
5.200
6.400
6.500
5.820
6.110
5.710
5.000
5.630
5.350
5.300
5.750
5.190
5.150
5.600
5.700
5.900
5.920
6.020
5.510
5.210
5.850
5.510
5.540
5.950
5.930
5.240
5.170
4.310
5.050
4.730
AL
3.190
2.820
2.460
2.700
3.470
3.260
4.040
3.480
3.440
-
3.670
-
3.120
3.260
2.950
4.180
2.310
3.500
2.770
2.880
2.240
2.100
2.280
2.350
2.290
1.970
2.040
1.680
1.270
1.840
2.140
1.750
1.640
1.800
1.930
2.010
2.110
1.980
2.510
3.970
2.700
SO4
155
300
280
225
265
280
305
295
355
357
345
420
285
325
275
365
335
295
300
285
321
299
265
510
260
285
340
310
320
275
320
355
330
300
362
334
295
285
295
260
225
FE++
35.00
33.00
28.00
21.00
19.00
12.00
6.63
-
8.40
6.63
6.63
5.10
8.80
8.67
11.22
15.81
21.42
12.00
15.00
27.70
28.80
6.00
21.90
31.70
35.30
37.00
38.00
-
-
-
34.90
408.00
38.00
38.10
40.00
41.00
-
14.49
27.50
14.91
14.49
AL Load
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
                                                                       F-3

-------
ROW
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
DATE
1/5/84
1/12/84
1/19/84
1/26/84
2/2/84
2/9/84
2/16/84
2/23/84
3/1/84
3/8/84
3/15/84
3/22/84
3/30/84
4/5/84
4/13/84
4/19/84
5/3/84
5/10/84
5/17/84
5/24/84
6/7/84
6/21/84
6/28/84
7/12/84
7/19/84
7/26/84
8/2/84
8/9/84
8/23/84
8/30/84
9/6/84
9/13/84
9/20/84
9/27/84
10/4/84
10/11/84
10/18/84
10/25/84
11/1/84
11/8/84
DAYS
889
896
903
910
917
924
931
938
945
952
959
966
974
980
988
994
1008
1015
1022
1029
1043
1057
1064
1078
1085
1092
1099
1106
1120
1127
1134
1141
1148
1155
1162
1169
1176
1183
1190
1197
Flow
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
1623
1356
1186
1104
1064
1024
1024
984
907
869
831
794
794
794
PH
3.20
3.20
3.30
3.00
3.10
3.10
3.10
3.20
3.20
3.30
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.20
3.10
3.20
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.10
3.20
3.20
Acidity
110
150
54
142
106
120
112
122
154
194
184
218
110
114
108
94
152
80
144
156
172
184
144
128
110
124
152
110
98
118
120
122
114
118
110
106
110
110
110
114
FE
18.310
13.650
11.390
26.300
25.650
28.000
22.220
3.810
18.770
16.840
19.190
24.860
23.540
25.300
14.650
14.430
13.500
16.400
12.800
12.300
11.000
6.800
16.370
12.700
12.300
12.300
13.600
20.600
23.800
34.000
33.200
35.000
21.500
24.200
21.000
25.700
27.800
25.300
6.900
25.100
MN
5.010
3.430
2.620
4.940
5.150
5.290
4.660
5.420
5.340
5.180
5.570
5.780
5.110
5.090
5.180
5.310
4.900
5.300
4.900
4.700
3.900
3.800
5.180
5.100
3.700
4.300
4.200
1.200
4.500
4.600
5.000
4.900
4.200
4.200
5.000
5.500
5.500
6.700
4.900
6.600
AL
4.240
1.910
1.280
2.730
1.800
2.330
2.070
2.460
3.440
2.730
3.220
2.960
2.500
2.760
1.700
2.050
2.700
2.100
2.300
1.500
2.200
2.200
1.870
2.600
2.500
2.100
2.700
0.800
2.600
2.300
2.000
2.100
2.200
2.300
2.400
2.900
2.600
2.500
3.500
3.500
SO4
283
350
158
292
283
292
258
316
269
341
307
316
291
291
272
264
260
286
235
272
272
264
243
245
277
256
317
249
270
305
290
276
255
275
274
298
255
284
282
293
FE++
16.38
12.60
-
25.50
-
27.50
18.80
3.80
10.20
12.81
18.69
24.50
23.50
0.66
11.55
10.71
10.71
9.45
8.19
6.09
8.80
6.00
9.34
-
9.87
-
28.15
20.60
24.10
33.15
32.13
33.66
7.59
24.20
21.00
25.70
27.80
28.05
6.90
25.10
AL Load
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
81.9
19.6
64.2
61.0
63.9
60.3
51.7
49.7
54.5
57.4
54.9
63.9
46.8
63.0

-------
ROW   DATE      DAYS  Flow    PH  Acidity      FE     MN     AL  SO4    FE++  AL Load
161   11/15/84  1204   794  3.20      126  27.000  5.100  4.000  285   27.00     48.7
162   11/21/84  1210   757  3.20      120  44.000  5.200  1.900  320   44.00     47.3
163   11/29/84  1218  1186  3.20      102  30.000  5.700  2.800  310   47.40     82.4
164   12/6/84   1225  1443  3.30      106  30.000  5.090  2.220  261   45.00     88.3
165   12/13/84  1232  1356  3.20      118  29.569  4.424  1.756  265   29.50     72.1
166   12/20/84  1239  1270  3.20      124  31.290  4.834  1.940  291   31.29     73.8
167   12/28/84  1247  1399  3.20      106  33.961  5.166  1.867  287   33.96     86.9
168   1/3/85    1253  1443  3.20      102  32.794  4.894  2.209  263   32.75     84.9
169   1/10/85   1260  1399  3.30      118  23.094  3.798  1.126  253   33.09     63.9
170   1/17/85   1267  1313  3.40       92  12.510  2.068  0.751  271   12.51     32.6
171   1/24/85   1274  1186  3.50      108  26.331  4.319  1.709  249   26.33     61.6
172   1/31/85   1281  1186  3.30      102  21.965  3.439  1.459  273   20.90     49.0
173   2/7/85    1288  1104  3.40      104  32.125  4.490  2.266  270   27.50     59.6
174   2/14/85   1295  1762  3.30      110  27.006  3.766  0.943  274   27.01     79.8
175   2/21/85   1305  1715  3.20      100  47.837  4.587  1.968  261   21.40     94.6
176   2/28/85   1309  1904  3.20       90  27.706  4.496  2.264  237   17.34    102.9
177   3/7/85    1316  1608  3.20       96  24.718  4.062  2.120  240   15.54     78.5
178   3/14/85   1323  1399  3.20       88  25.887  4.299  1.963  238   16.17     72.3
179   3/21/85   1330  1313  3.20       90  29.120  4.684  2.077  219   19.53     73.9
180   3/28/85   1337  1228  3.20       98  20.675  3.824  1.778  247   18.90     56.4
181   4/4/85    1344  1313  3.20      100  25.545  4.462  2.678  252   20.91     70.4
182   4/11/85   1351  1443  3.20      100  26.890  4.587  1.925  259   20.40     79.6
183   4/18/85   1358  1443  3.10       92  23.168  4.023  1.489  253    4.93     69.8
184   4/25/85   1365  1356  3.20       96  25.092  4.331  1.795  253   15.30     70.6
185   5/2/85    1372  1270  3.20      104  25.374  4.309  1.511  252   20.91     65.8
186   5/9/85    1379  1577  3.10       96  29.277  4.715  2.276  236   23.46     89.4
187   5/16/85   1386  1488  3.10       80  28.376  4.154  2.400  246   19.89     74.3
188   5/23/85   1393  1399  3.10       92  25.557  3.637  0.954  227   23.00     61.2
189   5/30/85   1400  1270  3.10       92  20.992  3.478  1.399  246   17.20     53.1
190   6/6/85    1407  1186  3.10       86  22.321  3.622  1.399  247   15.50     51.6
191   6/13/85   1414  1145  3.20       90  28.997  3.551  1.509  237   23.50     48.9
192   6/20/85   1421  1145  3.20       92  29.512  2.815  1.079  261   28.56     38.7
193   6/27/85   1428  1104  3.30      100  33.395  4.877  1.606  266   30.60     64.7
194   7/3/85    1434  1064  3.30       80  32.667  4.807  1.499  257   29.58     61.5
195   7/11/85   1442  1024  3.30      128  49.871  7.465  1.663  281   28.05     91.9
196   7/18/85   1449   984  3.20       86  33.180  4.661  1.591  300   33.15     55.1
197   7/25/85   1456   907  3.30       78  33.349  4.754  1.632  284   33.00     51.8
198   8/1/85    1463   907  3.30       92  35.080  4.948  1.668  270   34.00     53.9
199   8/8/85    1470   869  3.20       98  29.991  4.153  1.333  285   34.50     43.4
200   8/15/85   1477   869  3.30       92  41.140  4.445  1.418  254   35.50     46.4

-------
ROW   DATE      DAYS  Flow    PH  Acidity      FE     MN     AL  SO4    FE++  AL Load
201   8/22/85   1484   831  3.40      110  36.620  5.055  1.178  229   36.00     50.5
202   8/29/85   1491   794  3.30      106  35.917  4.762  1.443  274   35.50     45.4
203   9/5/85    1498   794  3.20       90  39.801  5.061  1.290  270   36.00     48.3
204   9/12/85   1505   794  3.30      134  32.232  4.418  0.922  271   32.00     42.2
205   9/19/85   1512   794  3.40      106  42.605  5.041  1.274  279   39.50     48.1
206   9/26/85   1519   794  3.40      118  41.471  5.469  1.199  292   37.00     52.2
207   10/3/85   1526   869  3.40      114  40.882  5.219  1.500  309   39.00     54.5
208   10/10/85  1533   869  3.30       94  44.588  5.775  1.889  283   39.00     60.3
209   10/17/85  1540   831  3.30      102  36.542  5.115  1.284  298   38.00     51.1
210   10/24/85  1547   831  3.30      102  38.551  4.698  1.340  299   36.00     46.9
211   10/31/85  1554   794  3.30       96  39.511  5.322  1.608  313   38.00     50.8
212   11/7/85   1561   794  3.20       86  37.989  4.781  1.009  294   38.00     45.6
213   11/14/85  1568   794  3.20       90  42.059  5.533  1.301  293   39.50     52.8
214   11/22/85  1576   907  3.20       82  40.527  5.078  1.587  325   20.50     55.4
215   11/27/85  1581  1024  3.20       96  49.232  6.378  2.421  320   38.50     78.5
216   12/5/85   1589  1809  3.30       90  33.762  4.554  1.596  285   33.50     99.0
217   12/12/85  1596  1715  3.30      114  27.262  4.384  2.063  281   27.00     90.4
218   12/19/85  1603  1577  3.20       92  25.320  4.607  2.167  263   25.00     87.3
219   12/26/85  1610  1399  3.20       88  28.270  5.294  2.043  306   25.00     89.0
220   1/2/86    1617  1228  3.30       94  27.100  5.885  3.337  296   27.00     86.9
221   1/9/86    1624  1145  3.30       96  33.434  5.712  2.410  291   28.50     78.6
222   1/16/86   1631  1104  3.30      108  33.000  4.050  0.800  325   29.50     53.7
223   1/23/86   1638  1104  3.20      104  34.800  5.260  2.138  330   31.50     69.8
224   1/30/86   1645  1270  3.30      128  39.300  5.410  2.014  293   37.50     82.6
225   2/6/86    1652  1313  3.30       92  35.500  5.036  2.410  302   34.50     79.5
226   2/13/86   1659  1488  3.20      100  39.000  5.260  2.390  297   33.00     94.1
227   2/20/86   1666  2298  3.20      104  33.500  5.120  2.085  287   31.00    141.4
228   2/27/86   1673  4793  3.30      130  22.900  8.160  5.870  326   13.50    470.1
229   3/6/86    1680  2767  3.20       92  19.700  6.230  3.890  331   15.20    207.2
230   3/13/86   1687  2767  3.20      114  19.400  5.470  3.320  348   16.00    181.9
231   3/20/86   1694  6533  3.30      106  11.100  5.180  3.804  282    4.00    406.8
232   3/27/86   1701  3772  3.30       78  11.800  5.170  3.240  281    5.70    234.4
233   4/3/86    1708  2820  3.30       98  11.300  3.950  2.420  259   10.00    133.9
234   4/10/86   1715  2198  3.40       94  16.400  5.065  2.930  254   12.30    133.8
235   4/17/86   1722  2820  3.30       98  18.300  5.240  2.501  252   13.00    177.6
236   4/24/86   1729  4547  3.40       80  11.200  4.970  2.850  244    4.50    271.6
237   5/1/86    1736  3371  3.40       82  10.300  4.460  2.680  250    5.83    180.7
238   5/8/86    1743  2555  3.40       82  13.700  5.104  2.770  255    3.70    156.7
239   5/15/86   1750  2000  3.30       98  21.200  5.530  3.360  210    3.30    132.9
240   5/22/86   1757  4006  3.40      114  16.400  5.170  2.480  259   12.20    248.9

-------
ROW   DATE      DAYS  Flow    PH  Acidity      FE     MN     AL  SO4    FE++  AL Load
241   5/29/86   1764  3427  3.40       32  11.800  4.980  3.000  257    5.00    205.1
242   6/5/86    1771  2608  3.30       76  18.700  5.040  2.480  257    8.40    158.0
243   6/12/86   1778  2049  3.40      102  16.300  5.370  2.560  246   12.00    132.3
244   6/19/86   1785  1762  3.40       86  17.100  5.640  2.570  247   12.40    119.5
245   6/26/86   1792  1532  3.30       94  14.100  5.310  2.630  222   13.00     97.8
246   7/2/86    1798  1399  3.30      102  20.200  7.280  2.520  243   14.91    122.4
247   7/10/86   1806  1270  3.30      100  22.900  5.610  2.470  256   18.50     85.6
248   7/17/86   1813  1186  3.30      100  24.900  6.860  2.230  254   19.00     97.8
249   7/24/86   1820  1145  3.30       92  22.200  5.060  2.070  254   20.00     69.6
250   8/1/86    1828  1270  3.20      102  20.100  4.930  1.990  271   20.10     75.3
251   8/7/86    1834  1228  3.20       86  25.100  5.820  2.180  252   20.50     85.9
252   8/14/86   1841  1145  3.30       86  24.100  5.310  2.150  280   19.00     73.1
253   8/21/86   1848  1064  3.20       94  26.300  5.690  2.300  293   21.50     72.8
                 max  6533  4.40      342  63.500  8.160  5.870  510  408.00    470.1
                 min   757  3.00       30   3.810  1.200  0.751  155    0.66      0.0
                 Avg  1504  3.24      117  30.703  5.044  2.322  298   29.66     38.0
                 Med  1228  3.20      110  28.997  5.100  2.265  293   27.70      0.0

-------