-------

-------
on
          Assessment of Statistical Tests in
             40  CFR 75.21  (FR, 12/3/91):
            Alternative Monitoring Systems
                            Prepared by

                       Emissions Monitoring Section
                         Source Control Branch
                          Acid Rain Division
                 Office of Atmospheric and Indoor Air Programs
                    U.S. Environmental Protection Agency
                        Washington, D.C. 20460
                        The Cadmus Group, Inc.
                      Durham, North Carolina 27713

-------

-------
                                 Project Summary

Title:                   Assessment of Statistical Tests in 40 CFR 75.21 (FR, 12/3/91):
                         Alternative Monitoring Systems

Authors:                 Elliot Lieberman and John Schakenbach.
                         Doris A. Price, Chief, Emissions Monitoring Section.
                         Source Control Branch
                         Acid Rain Division
                         Office of Atmospheric and Indoor Air Programs
                         U.S. Environmental Protection Agency

                         William Warren-Hicks, Susan E. Spruill, Jane E. Mudano.
                         The Cadmus Group, Inc.

Date:                    October 1992

-------
                                       Abstract




       This report assesses the three statistical tests that were included in the proposed acid rain



regulations (40 CFR 75.21) for determining whether an alternative monitoring system provides




information with the same precision as a continuous emissions monitoring system as required




under Section 412(a), Title IV, of the Clean Air Act Amendments of 1990.  The report presents




the results  of applying the statistical tests to  six databases. Also discussed is a procedure for




ensuring the applicability of the proposed tests when the alternative monitoring system (AMS)




measurements or the  continuous emissions monitoring system  (CEMS)  measurements are



autocorrelated.




       The analysis found fifty-four subsets of paired CEMS/CEMS data and three subsets of




paired AMS/CEMS data that passed all three tests, leading to the conclusion that the proposed




tests are stringent but not preclusive.  The report presents a procedure for inflating the variance




used in the F-test and the standard error of the mean used in the bias test to compensate for




possible underestimation of these statistics when measurement data are autocorrelated.

-------
1.     Introduction
       Section 412(a) of Title IV of the Clean Air Act Amendments of 1990 requires that




alternative monitoring systems (AMS) provide information with "the same precision, reliability,




accessibility, and timeliness as that provided by CEMS" (continuous emissions monitoring




system).  To give an explicit and objective basis for determining whether an AMS satisfies the




precision requirement, the proposed acid rain regulations (40 CFR 75.21), published in the




Federal Register  on December 3, 1991, set forth three  statistical tests  for evaluating the




equivalency of an AMS and a CEMS. The tests are designed to gauge the extent of systematic




error (bias test), random error (F-test), and correlation between the AMS and CEMS.




       To appraise the  stringency and workability of the tests  and  to  evaluate comments




submitted on the proposed tests, EPA first assembled six databases that approximate a range of




potential AMS/CEMS behavior and then applied the proposed statistical  tests to these databases.




Since none of the databases was originally intended to demonstrate AMS/CEMS equivalency,




there are disparities between these databases and the specifications that would have to be met by




data submitted to EPA under the AMS provisions. However, despite these disparities, analysis




of these data allowed EPA to evaluate whether the tests were serving their intended purpose and



whether modifications were needed to enhance the applicability of the tests in cases where time



dependencies (also called "autocorrelation") are discernible in the data.

-------

2.     Databases

       Table 1 summarizes the  characteristics  of the databases  analyzed  in this study.  To

increase the effective number of datasets that were available for analysis, wherever feasible, the

three statistical tests were applied to appropriate subsets of a database as well as the whole

database.  To qualify for analysis, a subset had to meet conditions similar to those contained in

§75.21(a)(1)(vi) of the proposed regulations, which require owners or operators who petition EPA

for use of an AMS to provide paired hourly observations from both a CEMS and the proposed

AMS for a minimum of 30 successive unit operating days (720 hours). No more than 10 percent

of the observations are allowed to be missing. In qualifying subsets, observations without

concurrent values in the paired dataset were flagged. Then, any observation preceded by a

flagged observation was also flagged. Finally, all flagged observations were discarded. This

constituted the "lagging" procedure necessary for performing the autocorrelation analysis

described below. Thus, in the end, qualifying subsets may have fewer than 90% of the

observations for the required 30-day period of coverage.
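
The flag-and-discard steps described above can be sketched in Python as follows. This is only an illustration: the function name `usable_indices`, the use of `None` for a missing value, and the decision to drop the first observation (which has no predecessor to be lagged against) are assumptions, not details taken from the study.

```python
from typing import List, Optional, Sequence


def usable_indices(ams: Sequence[Optional[float]],
                   cems: Sequence[Optional[float]]) -> List[int]:
    """Return the indices of observations that survive the lagging procedure."""
    if len(ams) != len(cems):
        raise ValueError("paired series must have the same length")

    # Step 1: flag observations without a concurrent value in the paired dataset.
    flagged = [a is None or c is None for a, c in zip(ams, cems)]

    # Step 2: flag any observation preceded by a flagged observation.
    # Assumption: the first observation has no predecessor, so it is flagged too.
    preceded_by_flag = [True] + flagged[:-1]

    # Step 3: discard every flagged observation.
    return [i for i in range(len(ams))
            if not flagged[i] and not preceded_by_flag[i]]


if __name__ == "__main__":
    ams = [1.0, 2.0, None, 4.0, 5.0, 6.0]
    cems = [1.1, 2.1, 3.1, 4.1, 5.1, 6.1]
    print(usable_indices(ams, cems))
```

In this sketch one missing AMS value removes two observations (the gap itself and the observation that follows it), which is why a qualifying subset can end up with fewer than 90% of its nominal observations.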

       The  specific features of the datasets  and subsets analyzed in this study are described

below:

       UARG: In comments (Docket A-90-51, IV-D-185) on the proposed statistical tests for

alternative monitoring systems, the Utility Air Regulatory Group provided 22 hours of paired

hourly data obtained from two test teams concurrently performing EPA Test Method 6 for SO2

(40 CFR Part 60) at an unidentified power plant. While the number of observations falls short

-------
Table 1: Characteristics of Databases Examined in This Study

Name                              Monitoring Method             Monitoring    Monitoring
                                                                Frequency     Duration

Utility Air Regulatory Group      2 Reference Method 6          Hourly        24 hours
(unidentified source)

Virginia Power Co.                5 CEMS                        Hourly        63 days
Chesapeake Energy Center Unit #4

Pennsylvania Electric Co.         Coal Sampling and Analysis    Daily         730 days
Homer City Unit #1                CEMS                          Daily

Pennsylvania Electric Co.         Coal Sampling and Analysis    Daily         730 days
Homer City Unit #3                CEMS                          Daily

Northern States Power Co.         Coal Sampling and Analysis    Daily         730 days
Sherburne County Unit #3          CEMS                          Hourly

Niagara Mohawk                    Oil Sampling and Analysis     Weekly        455 days
Oswego Unit #6                    CEMS                          Hourly

       of the 720 hours required under the proposed regulations, these data were included in the analysis



       since they had been submitted to EPA as part of comments on the alternative monitoring system



       statistical tests. Due to the small sample size of the whole database, no subsets of the data were



       analyzed.



             Virginia Power Co., Chesapeake Unit #4:  To provide a baseline assessment of the



       proposed statistical tests for alternative monitoring systems, EPA performed the tests on 63 days



       of hourly data from five CEMS installed on a common duct at this unit. The concurrent



       measurements from two CEMS were paired, one being designated the AMS. This enabled EPA

-------
to assess the claim that the statistical tests were so restrictive that not even two CEMS could pass




them.




       The Chesapeake unit data were made available to EPA by Virginia Power, which was




testing competing brands of CEMS  in anticipation of making a purchasing decision. The five




CEMS were not required to meet EPA's certification requirements during  installation, so the




results on the statistical tests for AMS would not be expected to be fully comparable to results




that would be obtained from a certified CEMS.  Nevertheless, it was anticipated that analysis of




these data would indicate whether the tests were preclusive, as some commenters had maintained.




       EPA contracted Entropy Environmentalists, Inc. to collect  the  data  and perform a




preliminary analysis of the data.  To maintain the anonymity of the manufacturers of the CEMS




that were tested, they were designated by letters A-E. The data and Entropy's report appear




under the title "Study of Proposed Acid Rain Regulations Section 75.21, Alternative Monitoring




Systems"  (EPA Contract No. 68-02-4462; Work  Assignment No.  91-156).   A  more




comprehensive analysis of the data was subsequently performed for EPA by The Cadmus Group,



Inc. and is reported here.




       There were two rounds of subsetting the data from the five CEMS.  In the first round,




monitor A was arbitrarily designated the "official" CEMS. Its measurements were paired, in turn,



with concurrent values from monitors B-E, which were each considered to represent an AMS.




The subsetting process described below was then applied to the paired data from CEMS A-B, A-




C, A-D, and A-E. This produced 37 qualifying subsets in addition to the four full datasets.




       In the second round of  subsetting, no CEMS was designated a priori as the AMS.



Instead, the subsetting process was applied to all possible pairs of CEMS measurements, i.e., A-

-------
B, A-C, A-D, A-E, B-C, B-D, B-E, C-D, C-E, and D-E. Then, the data were analyzed to reveal




the number of passes on each of the three statistical tests regardless of which of the paired CEMS



was designated as the AMS. This produced 72 qualifying subsets in addition to the ten full

datasets.




       Data subsets were created by rolling a 720-observation window through the original paired




data stream in increments of five observations. Thus, the first subset was created by keeping




only the first 720 observations in the data stream.  The second subset was created by omitting

the first 5 observations in the data stream and then keeping the next 720 observations.  Therefore,




data Subset 1 contains the original observations 1 through 720, whereas data Subset 2 contains




the original observations 6 through 725. This process of subsetting was continued throughout




the data stream until  no subset that contained 720 observations could be created.  Since the




proposed rule specifies that a minimum of 90% of the data stream must contain nonmissing

values, all data subsets that contained fewer than 648 (90% of 720) complete paired CEMS

observations (pairs in which neither CEMS value was missing) were discarded.  Lagging, the process of pairing



an observation with the observation immediately preceding it, was necessary for calculation of




autocorrelation coefficients. Further reduction in the number of observations may have occurred




in some subsets when the data were lagged,  potentially pairing missing values to non-missing




values.   However, these subsets  were not  discarded  (e.g., Subset  1  of a-d contains  613




observations).
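
The rolling-window subsetting described above can be sketched in Python as follows. The function name `rolling_subsets`, the representation of each observation as a `(first_cems, second_cems)` tuple, and the use of `None` for a missing value are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Pair = Tuple[Optional[float], Optional[float]]


def rolling_subsets(pairs: Sequence[Pair],
                    window: int = 720,
                    step: int = 5,
                    min_complete: int = 648) -> List[Tuple[int, Sequence[Pair]]]:
    """Return (start_index, subset) for each qualifying rolling-window subset."""
    subsets = []
    # Slide the window until no full window of observations remains.
    for start in range(0, len(pairs) - window + 1, step):
        subset = pairs[start:start + window]
        # A pair is complete only when neither value is missing.
        complete = sum(1 for a, c in subset if a is not None and c is not None)
        if complete >= min_complete:  # 648 is 90% of 720
            subsets.append((start, subset))
    return subsets


if __name__ == "__main__":
    stream = [(1.0, 1.0)] * 730
    print([start for start, _ in rolling_subsets(stream)])
```

With zero-based indexing, a start index of 5 corresponds to original observations 6 through 725, matching Subset 2 in the description above.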




       Homer City Units #1 and #3: The Homer City datasets contained daily CEMS and AMS




readings. In order to test the same time span as 720 hourly readings would cover, data subsets



were created in groups of 30 observations (30 days = 720 hours). Unlike the Chesapeake data,

-------
the original data stream was "blocked" into non-overlapping and contiguous subsets of 30




observations.  Those subsets which contained less than 27  observations (90% of 30) were




discarded.  Further reduction in the number of observations may have occurred in some subsets




when the data were lagged, potentially pairing missing values to non-missing values.  However,




these subsets were not discarded (e.g. Subset 3 of Homer City #1 contains 26 observations).




Subsets from Homer City #1 were created independently of the Homer City #3 subsets.




       Northern States Power Company:  Although the AMS data from Northern States were




recorded daily, the CEMS data were recorded hourly, with AMS data repeated for each hour of




a given day.  As with the Homer City data, the original data stream was "blocked" into non-overlapping

and contiguous subsets of 720 observations. Those subsets which contained less than



648 observations (90% of 720) were discarded.  Further reduction in the number of observations




may have occurred in some subsets when the data were lagged, potentially pairing missing values




to non-missing values (none occurred for Northern States).




       Niagara Mohawk: Although the AMS data from Niagara Mohawk were recorded weekly,



the CEMS data were recorded hourly. The AMS data were repeated for each hour of a given




week.  Similar to the Northern States data, the original data stream was "blocked" into non-



overlapping and contiguous subsets of 720 observations. Those subsets which contained less than




648 observations (90% of 720) were discarded.  Further reduction in the number of observations



may have occurred in some subsets when the data were lagged, potentially pairing missing values




to non-missing values (none occurred for Niagara Mohawk).
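
The treatment of the Northern States and Niagara Mohawk data (repeating each coarse AMS value for every hour it covers, then blocking the paired stream) can be sketched in Python as follows. The function names are hypothetical, and dropping a trailing partial block is an assumption; the text above does not say how a final short block was handled.

```python
from typing import List, Optional, Sequence, Tuple

Pair = Tuple[Optional[float], Optional[float]]


def repeat_to_hourly(values: Sequence[Optional[float]],
                     hours_per_value: int) -> List[Optional[float]]:
    """Repeat each daily (24) or weekly (168) value once per hour it covers."""
    return [v for v in values for _ in range(hours_per_value)]


def block_subsets(pairs: Sequence[Pair],
                  block: int = 720,
                  min_complete: int = 648) -> List[Tuple[int, Sequence[Pair]]]:
    """Split the stream into contiguous, non-overlapping blocks and keep
    those with at least min_complete complete pairs."""
    kept = []
    # Assumption: a trailing block shorter than `block` is not considered.
    for start in range(0, len(pairs) - block + 1, block):
        subset = pairs[start:start + block]
        complete = sum(1 for a, c in subset if a is not None and c is not None)
        if complete >= min_complete:
            kept.append((start, subset))
    return kept


if __name__ == "__main__":
    daily_ams = [2.0] * 60                      # 60 days of daily AMS values
    hourly_ams = repeat_to_hourly(daily_ams, 24)
    hourly_cems = [2.1] * len(hourly_ams)
    pairs = list(zip(hourly_ams, hourly_cems))
    print([start for start, _ in block_subsets(pairs)])
```

For the daily Homer City data the same blocking applies with `block=30` and `min_complete=27`.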

-------
3.     Statistical Procedures
       The three statistical tests for precision specified in §75.21 are the

       F-test:

           \[ \Pr\left[ \frac{S^2_{AMS}}{S^2_{CEMS}} \le F_{0.05} \right] \]          Eq. (1)

where $S^2_{AMS}$ is the sample variance of the AMS measurements, $S^2_{CEMS}$ is the sample variance of
the CEMS measurements, and $F_{0.05}$ is the critical value of the F statistic at $\alpha = 0.05$.  If the ratio
of the sample variances exceeds $F_{0.05}$, we reject the hypothesis that the variances of the AMS and
CEMS are equal with a 95% level of confidence (i.e., $\alpha = 0.05$).

       Bias test:

           \[ \Pr\left[ \frac{\bar{d}}{S_{\bar{d}}} \le t_{0.025} \right] \]          Eq. (2)

where $\bar{d}$ is the mean difference between the CEMS and the AMS, $S_{\bar{d}}$ is the standard error
of the mean difference, and $t_{0.025}$ is the critical value of the t statistic for rejecting the
hypothesis that, with 97.5% confidence (i.e., $\alpha = 0.025$), there is no systematic error in the AMS
measurements when compared to the CEMS measurements.

       Correlation test:

           \[ \Pr\left[ r_{(AMS,CEMS)} \ge 0.80 \right] \]          Eq. (3)

where $r_{(AMS,CEMS)}$ is the correlation coefficient between the CEMS and AMS emissions
measurements.
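
As a rough illustration of how the three tests might be applied to one paired dataset, the Python sketch below computes the three statistics and compares them with the thresholds. Several things here are assumptions rather than details from the regulation: the function name `precision_tests`, the choice to have the caller supply the critical values `f_crit` and `t_crit` (for example from statistical tables for the appropriate degrees of freedom), the CEMS-minus-AMS sign convention for the differences, and the use of the absolute value of the t statistic in the bias comparison.

```python
import math
from statistics import mean, variance
from typing import Dict, Sequence


def precision_tests(ams: Sequence[float], cems: Sequence[float],
                    f_crit: float, t_crit: float,
                    r_min: float = 0.80) -> Dict[str, float]:
    """Compute the F-test, bias test, and correlation test statistics."""
    n = len(ams)

    # F-test: ratio of the AMS sample variance to the CEMS sample variance.
    f_ratio = variance(ams) / variance(cems)

    # Bias test: mean paired difference over its standard error.
    # Assumption: differences are taken as CEMS minus AMS.
    diffs = [c - a for a, c in zip(ams, cems)]
    d_bar = mean(diffs)
    se_d = math.sqrt(variance(diffs) / n)
    t_value = d_bar / se_d

    # Correlation test: Pearson correlation between AMS and CEMS.
    a_bar, c_bar = mean(ams), mean(cems)
    num = sum((a - a_bar) * (c - c_bar) for a, c in zip(ams, cems))
    den = math.sqrt(sum((a - a_bar) ** 2 for a in ams)
                    * sum((c - c_bar) ** 2 for c in cems))
    r = num / den

    return {
        "f_ratio": f_ratio, "f_pass": f_ratio <= f_crit,
        # Assumption: the comparison is made on |t| so that bias in either
        # direction counts against the AMS.
        "t_value": t_value, "bias_pass": abs(t_value) <= t_crit,
        "r": r, "corr_pass": r >= r_min,
    }


if __name__ == "__main__":
    ams = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
    cems = [10.5, 11.5, 11.2, 13.1, 12.3, 13.8]
    # Critical values for 5 degrees of freedom, taken from standard tables.
    print(precision_tests(ams, cems, f_crit=5.05, t_crit=2.571))
```

This sketch uses unadjusted variances; it does not include the autocorrelation adjustments discussed in Section 4.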

-------
4.     Autocorrelation Analysis

       Long-term, systematically collected sequential data, like that required under §75.21 of the

proposed rule, often display  a significantly high level of correlation between  successive

measurements, known as autocorrelation. The variance calculated from autocorrelated data tends

to be underestimated. In the parlance of statistics, the variance is said to be downwardly biased,

i.e., it underestimates the population variance.¹  If an underestimated variance is used in tests

of hypotheses, it can affect the likelihood of committing a Type I error:  rejecting a hypothesis that

is actually true (Cochran, 1977, p. 220; Wolter, 1984; Magee, 1989; Box et al., 1978, p. 89;

Gujarati, 1988, p. 364).  Since both the proposed bias test and F-test are variants of hypothesis

testing, it is necessary to explore ways to compensate for underestimated variances.

       This section presents a procedure for adjusting variances when data are autocorrelated.

In particular,  this section discusses  the variance inflation factor required for an unbiased

estimation of a sample variance (VIF),  the inflation factor required for an unbiased estimation

of a standard  error of a sample mean  (SEIF), and the inflation factor for the best, unbiased

estimation of a standard error using an unbiased estimate of the sample variance (SEVIF).  The

discussion concludes with a comparison  of the test results  obtained when the bias  test and F-test

are  applied to the datasets described above, first, not using and, then, using the variance and

standard error adjustments presented in  this section.

       Statistical Background.  A population consists of an unknown, potentially infinite number

of measurements which can be characterized by a  relative frequency distribution. Two numerical
    ¹The appearance of the term "bias" here should not be confused with the "bias test," where
"bias" refers to the systematic error between CEMS and AMS measurements.


-------
measures used to describe a frequency distribution are the mean, $\mu$, a measure of the

population's central tendency, and the variance, $\sigma^2$, a measure of the population's dispersion,

expressed as a function of the deviations of the measurements from their mean.

       Consider a sample of size n taken from a population of measurements which are

independent and identically distributed.  The population variance is $\sigma^2$ and the mean of these n

measurements has the variance $\sigma^2/n$.
-------
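       The sample variance is calculated as

           \[ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \]          Eq. (4)

When the measurements are autocorrelated, the sample variance adjusted for autocorrelation is

           \[ S^2_{adj} = \frac{S^2}{1 - \frac{2\rho}{(n-1)(1-\rho)} + \frac{2\rho(1-\rho^n)}{n(n-1)(1-\rho)^2}} \]          Eq. (5)

where $S^2$ is the sample variance obtained by Eq. (4), n is the sample size and $\rho$ is the degree
of autocorrelation (Wolter, 1984).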

       As with the population mean and variance, the population autocorrelation, $\rho$, is usually

unknown. The best unbiased estimate of $\rho$ is the sample autocorrelation, r, which is defined by:

           \[ r = \frac{\sum_{i} (O_i - \bar{O})(L_i - \bar{L})}{\sqrt{\sum_{i} (O_i - \bar{O})^2 \, \sum_{i} (L_i - \bar{L})^2}} \]          Eq. (6)

where $O_i$ is the measurement at time i (original), $L_i$ is the measurement at time i-1 (LAG1), and $\bar{O}$

and $\bar{L}$ are the means of the original measurements and the lagged measurements, respectively
(Steel and Torrie, 1980 [p. 272]). Substituting r for $\rho$ in Eq. (5) will result in the best, unbiased

estimate of the sample variance:

           \[ S^2_{adj} = \frac{S^2}{1 - \frac{2r}{(n-1)(1-r)} + \frac{2r(1-r^n)}{n(n-1)(1-r)^2}} \]          Eq. (7)
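
The lag-1 sample autocorrelation and the variance inflation factor can be sketched in Python as follows. The sketch assumes a complete series with no missing values, treats the inflation factor as the reciprocal of the bracketed correction term in the variance adjustment (so that the adjusted variance is the sample variance multiplied by the factor), and uses hypothetical function names.

```python
import math
from typing import Sequence


def lag1_autocorrelation(x: Sequence[float]) -> float:
    """Correlation between each measurement and the one immediately before it."""
    original = x[1:]    # measurement at time i
    lagged = x[:-1]     # measurement at time i-1 (LAG1)
    o_bar = sum(original) / len(original)
    l_bar = sum(lagged) / len(lagged)
    num = sum((o - o_bar) * (l - l_bar) for o, l in zip(original, lagged))
    den = math.sqrt(sum((o - o_bar) ** 2 for o in original)
                    * sum((l - l_bar) ** 2 for l in lagged))
    return num / den


def variance_inflation_factor(r: float, n: int) -> float:
    """Multiplier that converts the sample variance into the adjusted variance.

    Undefined for r == 1 (division by zero).
    """
    correction = (1.0
                  - 2.0 * r / ((n - 1) * (1.0 - r))
                  + 2.0 * r * (1.0 - r ** n) / (n * (n - 1) * (1.0 - r) ** 2))
    return 1.0 / correction


if __name__ == "__main__":
    series = [3.1, 3.4, 3.3, 3.8, 4.0, 3.9, 4.2, 4.1, 4.5, 4.4]
    r = lag1_autocorrelation(series)
    print(r, variance_inflation_factor(r, len(series)))
```

With no autocorrelation (r = 0) the factor is exactly 1, and for a fixed positive r it shrinks toward 1 as n grows, which is consistent with the small adjustments reported for the 720-observation subsets.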
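      The standard error of the mean of n samples drawn from an autocorrelated population can
be denoted by $\sigma_{\bar{x}}$ and defined as:

           \[ \sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n} \left[ \frac{1+\rho}{1-\rho} - \frac{2\rho(1-\rho^n)}{n(1-\rho)^2} \right]} \]          Eq. (8)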

-------
where $\sigma^2$ is the population variance obtained from an autocorrelated population of size n, and

$\rho$ is the degree of autocorrelation (Box and Jenkins, 1976 [p. 194]).  Note that for large sample

sizes a portion of Eq. (5):

           \[ \frac{2\rho(1-\rho^n)}{n(n-1)(1-\rho)^2} \]

and of Eq. (8):

           \[ \frac{2\rho(1-\rho^n)}{n(1-\rho)^2} \]

will be nearly equal to zero.  Figures 1 and 2 show the ranges of sensitivity of Eq. (5) and Eq.

(8) to changes in $\rho$ and n. Replacing $\sigma^2$ and $\rho$ in Eq. (8) with $S^2_{adj}$ from Eq. (5) and

r as defined in Eq. (6), respectively, will result in an unbiased estimate of the standard error of

the sample mean, $S_{\bar{x},adj}$:

           \[ S_{\bar{x},adj} = \sqrt{\frac{S^2}{n} \cdot \frac{\frac{1+r}{1-r} - \frac{2r(1-r^n)}{n(1-r)^2}}{1 - \frac{2r}{(n-1)(1-r)} + \frac{2r(1-r^n)}{n(n-1)(1-r)^2}}} \]          Eq. (9)
Figure 3 shows the ranges of sensitivity of Eq. (9) to changes in r and n.
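
The two ways of adjusting the standard error of the mean can be sketched in a single Python function. The function name and the `inflate_variance` switch are hypothetical: with the switch off, the sample variance is substituted directly for the population variance in the standard-error formula; with it on, the variance is first inflated for autocorrelation.

```python
import math


def standard_error_of_mean(s2: float, r: float, n: int,
                           inflate_variance: bool = True) -> float:
    """Standard error of the mean of n autocorrelated measurements.

    s2 is the unadjusted sample variance and r the lag-1 sample
    autocorrelation. Undefined for r == 1 (division by zero).
    """
    # Adjustment to the variance of the mean for autocorrelated data.
    mean_term = ((1.0 + r) / (1.0 - r)
                 - 2.0 * r * (1.0 - r ** n) / (n * (1.0 - r) ** 2))

    # Correction that removes the downward bias of the sample variance.
    var_term = (1.0
                - 2.0 * r / ((n - 1) * (1.0 - r))
                + 2.0 * r * (1.0 - r ** n) / (n * (n - 1) * (1.0 - r) ** 2))

    variance = s2 / var_term if inflate_variance else s2
    return math.sqrt(variance / n * mean_term)


if __name__ == "__main__":
    unadjusted = math.sqrt(4.0 / 100)
    without_vif = standard_error_of_mean(4.0, 0.5, 100, inflate_variance=False)
    with_vif = standard_error_of_mean(4.0, 0.5, 100, inflate_variance=True)
    print(unadjusted, without_vif, with_vif)
```

For positive r both adjusted values exceed the unadjusted standard error, and inflating the variance first gives the larger of the two.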

-------

-------


-------

-------
      Proposed Tests. Two of the three statistical tests used to evaluate the precision of the




AMS could be affected by autocorrelated measurements. The proposed F-test, which is used to




determine whether the variances of two populations are significantly different, relies on unbiased




estimates of the sample variance for the AMS and CEMS measurements.  Thus, Eq. (1) can be




restated as follows:
           \[ \Pr\left[ \frac{S^2_{AMS,adj}}{S^2_{CEMS,adj}} \le F_{0.05} \right] \]          Eq. (10)

where $S^2_{AMS,adj}$ is the best, unbiased estimate of the sample variance of the AMS

measurements and $S^2_{CEMS,adj}$ is the best, unbiased estimate of the sample variance of the

CEMS measurements.  Unbiased estimates of the sample variance of the AMS and CEMS

measurements that are used in this test can be derived from Eq. (7), where r and $S^2$ are calculated

separately for the CEMS and AMS measurements.




       Similarly, the bias test in the proposed regulations relies on an unbiased estimate of the




standard error of the mean. Thus, Eq. (2) can be restated as:

           \[ \Pr\left[ \frac{\bar{d}}{S_{\bar{d},adj}} \le t_{0.025} \right] \]          Eq. (11)

An unbiased estimate of the standard error of the mean difference can be obtained from Eq. (9).
-------
       Comparative Analysis. The F-test was performed on each of the datasets described above




in section 2 by taking the ratio of each paired AMS variance and CEMS variance and comparing

it to the critical F-value at $\alpha = 0.05$ for the appropriate sample size.  Of the sixty-eight subsets

tested, 43 passed the F-test.




       Autocorrelation coefficients for both the CEMS and AMS were calculated using Eq. (6)

and are listed in Table 2.  Also shown are the p-values achieved on the F-test.  The




autocorrelation coefficients for AMS data subsets ranged from r=-0.01 (Homer City 3, subset 3)




to r=0.99 (Niagara Mohawk, subset 5). Similarly, autocorrelation coefficients for CEMS subsets




ranged from r=0.01 (Homer City 1, subset 24) to r=0.98 (Northern States, subset 7).  Variance




inflation factors (VIFs) for each data subset were calculated from r and n, using Eq. (7). The

VIFs derived from the AMS and CEMS measurements for each dataset are shown in Table 2.

In most cases, the VIF of the CEMS was slightly larger than the VIF of the corresponding AMS.

Consequently, many F-test statistics decreased slightly when inflation factors were applied.




However, only two F-test results changed from passes to failures (Northern States, subsets 1 and



23).    In both cases the AMS autocorrelation  coefficient  was  larger than  the  CEMS




autocorrelation.
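
The direction of the change in the F statistic follows from simple arithmetic: inflating each variance separately multiplies the unadjusted ratio by the ratio of the two inflation factors. A minimal Python sketch (the function name is hypothetical, and the inflation factors are assumed to have been computed separately for each system):

```python
def adjusted_f_ratio(f_ratio: float, vif_ams: float, vif_cems: float) -> float:
    """Rescale an unadjusted AMS/CEMS variance ratio by the two inflation factors."""
    # (S2_ams * vif_ams) / (S2_cems * vif_cems) == f_ratio * vif_ams / vif_cems
    return f_ratio * vif_ams / vif_cems


if __name__ == "__main__":
    # Larger CEMS inflation factor: the F statistic decreases.
    print(adjusted_f_ratio(1.20, 1.01, 1.02))
    # Larger AMS inflation factor: the F statistic increases.
    print(adjusted_f_ratio(1.20, 1.05, 1.01))
```

A result can therefore move from pass to fail only when the AMS inflation factor exceeds the CEMS factor, which requires the AMS series to be the more strongly autocorrelated of the two.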




       For the  bias test, the difference between  the CEMS and  AMS  measurements was




calculated for each observation pair. Mean differences, variances of the differences using Eq.

(4), and autocorrelation of the differences using Eq. (6) were calculated for each data subset.




These statistics and the results of the bias tests  (using unadjusted standard errors) for each subset




are listed in Table 3.  The autocorrelation coefficients ranged from r=0.04 (Homer City,  subset




7) to r=0.94 (Northern States, subsets 7 and 11).

-------
[Table 2: Autocorrelation coefficients, variance inflation factors (VIFs), and F-test p-values for the AMS and CEMS measurements in each data subset]
-------
-------
       Variance inflation factors were calculated from the number of observations and estimated




autocorrelation coefficient of each data subset, using Eq. (7). These were then applied to each




subset's estimate of variance. Variance adjustments ranged from 1.00025 (for Chesapeake a-e,




subset 76) to 1.16892 (Homer City, subset 8) with a mean VIF of 1.02133.   The VIFs for




variances of the differences are  listed in Table 3.




       Inflation factors for the standard errors of the differences were calculated by two methods.




The first method used Eq. (8), substituting the unadjusted variance of the sample differences,

$S^2$, for $\sigma^2$.  This inflation factor is labelled SEIF. The second method uses Eq. (9), which

simultaneously incorporates the inflated sample variance, shown in Eq. (7), and an adjustment

for underestimation in the standard error of the sample means, represented by Eq. (8).  When




applied, this dual inflation factor results in the best, unbiased estimate of the standard error of



the differences and is labelled SEVIF to denote that both VIF and SEIF have been applied. (See




Appendix B for a fuller treatment of the statistical basis for this procedure.) The purpose for




calculating both standard error inflation factors is to illustrate the impact of obtaining an unbiased




estimate of the sample variance before applying an adjustment to the standard error of the bias




test.




       Forty-eight (48) of the 68 subsets tested passed the bias  test without  applying either



inflation  factor.  An additional two subsets (Homer City 1, subset 10; Homer City 3, subset 8)




passed when SEIF was applied. Table 3 shows the magnitude of each inflation factor for each




data subset.  Using SEVIF compared to SEIF reduced all t-values slightly.   However, the




additional adjustment was very small in most cases (averaging 1.02133) and none of the pass/fail

-------
[Table 3: Bias test statistics for each data subset: mean difference, variance and autocorrelation of the differences, inflation factors (VIF, SEIF, SEVIF), t-values, and pass/fail results]

-------




























-------
results changed when SEYIF was applied compared to the pass/fail results using SEIF.  Table
4 shows the ranges and averages of each inflation factor used in the bias test and the ranges of
bias test t-values.




       Summary of Findings.  The effect of using underestimated variance is most prominent in
the bias test, where the underestimation may result in detection of significant systematic error
when in fact the systematic error may not be statistically significant.  In contrast, underestimation
of variance may affect the F-test in either direction.  Since this test is the ratio of the
AMS variance to the CEMS variance, it is possible that different degrees of underestimation may
be introduced into the estimated variances of the CEMS and AMS measurements.  Adjusting for
the underestimated variance of each system individually could increase or decrease the resulting
F-value, depending on which variance requires the greater adjustment.  (More highly correlated data
will have more downwardly biased estimates of variance.)  In the case where the autocorrelations
of the CEMS and the AMS are equal, the variance adjustments would cancel and the adjusted
F-test results would be equal to those of the unadjusted F-test.



       However, with large sample sizes, as would be required under the proposed regulations,
the underestimation of the variances and standard error can be expected to be minimal except for
cases where the autocorrelation coefficient is very high (r ≥ 0.90).  Therefore, applying variance
inflation factors is not expected to substantially change the pass/fail rates on either the F-test
or the bias test.
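       The cancellation argument and the large-sample behavior described above can be
illustrated with a short sketch.  It assumes the data follow a first-order autoregressive
(AR(1)) process and uses the textbook expression for the expected value of the sample
variance under that model; the report's own SEIF and SEYIF factors are defined elsewhere
and may differ.  The function names and the sample size of 720 hourly values are
illustrative assumptions only.

```python
def variance_inflation_factor(rho: float, n: int) -> float:
    """Multiplier that offsets the downward bias of the usual sample variance.

    Assumes an AR(1) process with lag-1 autocorrelation rho (a textbook
    model chosen for illustration, not necessarily the report's factor).
    """
    if n < 2:
        raise ValueError("need at least two observations")
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # E[s^2] / sigma^2 = 1 - (2 / (n - 1)) * sum_{k=1}^{n-1} (1 - k/n) * rho^k
    weighted = sum((1 - k / n) * rho ** k for k in range(1, n))
    expected_ratio = 1 - 2 * weighted / (n - 1)
    return 1 / expected_ratio


def adjusted_f(var_ams, var_cems, rho_ams, rho_cems, n):
    """F-value (AMS variance over CEMS variance) after inflating each variance."""
    return (var_ams * variance_inflation_factor(rho_ams, n)) / (
        var_cems * variance_inflation_factor(rho_cems, n)
    )


if __name__ == "__main__":
    n = 720  # hypothetical: 30 days of hourly values
    for rho in (0.5, 0.9, 0.99):
        print(rho, variance_inflation_factor(rho, n))
    # Equal autocorrelations: the two factors cancel, leaving var_ams / var_cems.
    print(adjusted_f(4.0, 2.0, 0.9, 0.9, n))
    # Unequal autocorrelations: the F-value moves toward the more inflated variance.
    print(adjusted_f(4.0, 2.0, 0.95, 0.5, n))
```

Under this model the factor stays close to 1 for moderate autocorrelation at large n and
grows as rho approaches 1, which is consistent with the qualitative statement above.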

-------
[Table 4.  Ranges and averages of the inflation factors used in the bias test, and ranges of bias test t-values.  Table body not legible in the scanned source.]
-------
5.     Test Results

       Table 5 shows the results of applying the three statistical tests as specified in §75.21 of
the proposed regulations to:

       o     the data reported by UARG;

       o     the Chesapeake data subsetted as described above, with monitor A designated as
             the "official" CEMS and each of the other monitors representing an AMS;

       o     the Homer City data subsetted as described above;

       o     the Northern States data analyzed at the level of refinement of the CEMS; that
             is, hourly CEMS measurements were paired with daily CSA measurements for
             corresponding periods of coverage; and

       o     the Niagara Mohawk data analyzed at the level of refinement of the CEMS; that
             is, hourly CEMS measurements were paired with weekly OSA measurements for
             corresponding periods of coverage.



       The paired Reference Method 6 data reported by UARG failed the bias test but passed
the F-test and the correlation test.  For the paired CEMS/CEMS measurements from the
Chesapeake unit, 26 subsets passed the bias test, 26 passed the F-test, 20 passed the correlation
test, and 20 passed all three tests.  For the paired CEMS/CSA measurements from the Homer
City and Northern States units, 24 subsets passed the bias test, 18 passed the F-test, 5 passed
the correlation test, and none passed all three tests.  For the CEMS/OSA measurements from the
Niagara Mohawk unit, all three subsets passed the bias test and the F-test, none passed the
correlation test, and, consequently, none passed all three tests.
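       As a rough illustration of the quantities behind the pass/fail counts above, the
sketch below computes a paired-difference t-statistic (bias test), the ratio of the AMS
variance to the CEMS variance (F-test), and the Pearson correlation coefficient (correlation
test) for one subset of paired measurements.  The sign convention for the differences
(AMS minus CEMS), the function name, and the example data are assumptions made for
illustration; the critical values and pass criteria are those of §75.21 and are not
reproduced here.

```python
import math
from statistics import mean, variance


def paired_test_statistics(cems, ams):
    """Return (t, F, r) for equal-length lists of paired CEMS and AMS values."""
    if len(cems) != len(ams) or len(cems) < 3:
        raise ValueError("need paired series of equal length (n >= 3)")
    n = len(cems)

    # Bias test: t-statistic of the mean paired difference (AMS - CEMS assumed).
    diffs = [a - c for a, c in zip(ams, cems)]
    t = mean(diffs) / math.sqrt(variance(diffs) / n)

    # F-test: ratio of the AMS sample variance to the CEMS sample variance.
    f = variance(ams) / variance(cems)

    # Correlation test: Pearson correlation coefficient.
    mc, ma = mean(cems), mean(ams)
    cov = sum((c - mc) * (a - ma) for c, a in zip(cems, ams))
    ss_c = sum((c - mc) ** 2 for c in cems)
    ss_a = sum((a - ma) ** 2 for a in ams)
    r = cov / math.sqrt(ss_c * ss_a)
    return t, f, r


if __name__ == "__main__":
    # Hypothetical paired values, not taken from the report's databases.
    cems = [100.0, 102.0, 98.0, 101.0, 99.0]
    ams = [101.0, 101.5, 98.5, 102.0, 98.0]
    print(paired_test_statistics(cems, ams))
```

A subset would then be compared against the applicable critical t-value, critical F-value,
and minimum correlation to obtain the pass/fail outcome for each test.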

-------
[Table 5.  Results of applying the three statistical tests specified in §75.21 to the UARG, Chesapeake, Homer City, Northern States, and Niagara Mohawk data.  Table body not legible in the scanned source.]
-------
       Table 6 analyzes the same datasets as Table 5 but uses the variance inflation factors as
described above in the bias test and F-test to compensate for autocorrelation in the data.  The test
results for the data reported by UARG remain unchanged.  For the CEMS/CEMS data from the
Chesapeake unit, 27 rather than 26 subsets pass the bias test.  For the CEMS/CSA data, 26 rather
than 24 subsets pass the bias test, 16 rather than 18 pass the F-test, and 1 subset passes all three
tests.  The test results for the CEMS/OSA data remain unchanged.



       In an effort to determine whether or not the refinement of the AMS affects the three
statistical tests, the Northern States and Niagara Mohawk data were analyzed at the level of
refinement of the AMS.  That is, for the Northern States data, daily CSA measurements are paired
with the daily average CEMS measurements for periods of coverage, and for the Niagara Mohawk
data, weekly OSA measurements are paired with the average weekly CEMS measurements for the
period of coverage.  Table 7 and Table 8 report test results for the Northern States CEMS/CSA data
and the Niagara Mohawk CEMS/OSA data analyzed at the level of refinement of the AMS.  Both
without (Table 7) and with (Table 8) the variance inflation estimate, two of the CEMS/OSA
subsets pass all three tests, while none of the CEMS/CSA subsets pass all three tests.
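       The two levels of refinement can be sketched as two ways of pairing the same
measurements.  The data layout (hourly CEMS readings tagged with a day label, and one AMS
value per day), the function names, and the example values are hypothetical and are used only
to illustrate the pairing; weekly OSA data would be handled the same way with a week label.

```python
from collections import defaultdict
from statistics import mean


def pair_at_cems_level(hourly_cems, daily_ams):
    """Pair every hourly CEMS value with the AMS value covering the same day."""
    return [(value, daily_ams[day]) for day, value in hourly_cems if day in daily_ams]


def pair_at_ams_level(hourly_cems, daily_ams):
    """Pair each daily AMS value with the average of that day's CEMS values."""
    by_day = defaultdict(list)
    for day, value in hourly_cems:
        by_day[day].append(value)
    return [
        (mean(by_day[day]), daily_ams[day])
        for day in sorted(daily_ams)
        if day in by_day
    ]


if __name__ == "__main__":
    # Hypothetical readings: (day label, hourly CEMS value).
    hourly = [("d1", 10.0), ("d1", 12.0), ("d2", 20.0), ("d3", 5.0)]
    daily = {"d1": 11.5, "d2": 19.0}  # hypothetical daily CSA values
    print(pair_at_cems_level(hourly, daily))
    print(pair_at_ams_level(hourly, daily))
```

Pairing at the CEMS level repeats each AMS value once per hourly reading, whereas pairing at
the AMS level yields one pair per AMS measurement and therefore far fewer observations.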



       Table 9 and Table 10 report test results for the CEMS/CEMS data from the Chesapeake
unit using all possible combinations of CEMS.  This analysis differs from that reported in
Table 5 and Table 6, where monitor A was designated as the "official" CEMS, while each of the
other monitors was considered a separate AMS.  These tables reveal the number of subsets that
will pass a statistical test regardless of which monitor has been designated the "official" CEMS.
Here, the one-tail t-test and F-test were applied to data from all possible pairings of CEMS.  For
each CEMS pair, the tests were performed twice: once with one monitor designated as the

-------
[Table 6.  Results for the same datasets as Table 5, with variance inflation factors applied in the bias test and F-test.  Table body not legible in the scanned source.]
-------
[Tables 7 and 8.  Test results for the Northern States CEMS/CSA data and the Niagara Mohawk CEMS/OSA data analyzed at the level of refinement of the AMS, without (Table 7) and with (Table 8) the variance inflation estimate.  Table bodies not legible in the scanned source.]

-------
"official" CEMS, and then with the other monitor designated as the "official" CEMS.

       Table 9 indicates that without using variance inflation factors, 114 subsets pass the bias
test, 117 pass the F-test, 68 pass the correlation test, and 53 pass all three tests.  As indicated in
Table 10, using variance inflation factors results in 119 subsets passing the bias test, 116 passing
the F-test, 68 passing the correlation test, and 54 passing all three tests.
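       The "performed twice" design amounts to enumerating ordered pairs of monitors, so that
each monitor serves once as the "official" CEMS and once as the AMS within every pairing.
A minimal sketch follows; the monitor labels are illustrative and the actual number of
monitors at the Chesapeake unit is not assumed here.

```python
from itertools import permutations


def ordered_pairings(monitors):
    """Every (official CEMS, candidate AMS) designation among the given monitors."""
    return list(permutations(monitors, 2))


if __name__ == "__main__":
    # Hypothetical monitor labels; each unordered pair appears in both orders.
    for official, candidate in ordered_pairings(["a", "b", "c", "d"]):
        print(f"official CEMS = {official}, AMS = {candidate}")
```

With k monitors this produces k(k - 1) designations per data subset, twice the number of
unordered pairs.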

-------
•8

g

1
5=
c

I
I
6
UJ
o
•s
CO
o
O

JS
S
10


to
I
O

TJ

TO
en

.2

fl
Si
6
          iis
          I?8
          2 «> if
          + -M o
                     o
                    O

1
value
                     «
                     •« JS
                     Qs
                       CO
S
    o>
                          01
    CM
                          O)
E
me
G
                              s

E
G
                                     u.

a-b

                                            0

C
                                S
                                CM
                                                          UL
                                                          (O
                                       3
                                                              CXI
                                       o
                                           fe
                                                                 CO

Si
                                                                     
       o

                                                         8
                                                         »
                                                                     O
                                                                                   UL
                                                            LL
              5
                                                                                   (D
                                                                                   GO
                                                         o
                                                                                       o>
O
                     s

                                                                                O
O

-------
•o
 CD
O
O

a>

£
A
CD
18
£ X
it*
 at -•?


                      *"5
                    U

             h^
             55
             H
             S
                      J

                      1
                       (0
                o
               CO
                           §
                           «
                           o>
                           2
                           O
                        §
                               ™
                                CM
                        u
                                    e»
                                    o>
                                        o>
                                        Oi
                                        9
                                            5
                                            o>
                                             o»
                                             to
                                             CM

                                                 OJ
                                          0)

                                                          u>
                                                          O

                                                             a.
                                                      s
                                                              (0
                                                                  en
                                                               3
                                                                      a>
                                                               81
                                                                       ao
                                                                       Of
                                                                           CD
                                                                   9*
                                                                           o>
                                                                        O
                                                                                   CD
                                                                            o
                                                                            (J

                                                                                        CM

                                                                                                 $
                                                                                                     CO
                                                                                                     q
                                                                                                     CO
O
                                                                                                         u>

                                                                                                         CM
                                                                                                         CO
                                                                                                         u
O


                                                                                                              (D
                                                                                                              m
                                                                                                      CO

                                                                                                      UJ
                                                                                                      o


                                                                                                      I


-------
1


 o
O

o>



O

.1
U

1'5
|"
§
1
a.
-o
f1

oS "5
TO il
a>
1
a.
•o
S «
IJ
!S
z
«s-
co
1
§
CO
i
d




m
o>



§
S

5
o


S

S
9
1
•*
1 Chesapeake CEMS a-d




in
r«.



in
IS
CM

U)
o



8
S

£
i
CO
»^
CD
IO
| Chesapeake CEMS a-d




in
o>



i
00

3
o


£
^"
CO

S
9
1
co
•o
&
O
6




S



in
in
S

S
1—


S
&

S
1
CO
CD
N
I
CO
UJ
o
i
o




m
o>



CM
i

5
o


g

Si
9
1
CO
1 Chesapeake CEMS a-d




it



1

S
w-


i

IS
i
CO
CD
o>
1 Chesapeake CEMS a-d




in
at



o>

S
o


00
§

9!
1
S
O
^
| Chesapeake CEMS a-d




S



§
8

s



S
S

£

co
£
^
t
CO
UJ
o
1
6




S8



je

3
6


§
CD

8

§
CM
1
CO
Ul
o
!
6




£



&
8

en
d>
o


It)
8

co
o»

i
^
|| Chesapeake CEMS d-a




m
h-



&
R

8
o


CO
§

a

m
5

| Chesapeake CEMS d-a




S



U)
to

o>
o



CO
1

s
o
1
CM
| Chesapeake CEMS d-a




S



CM
g

s
0


§
3

5

to
5
CO
S
CO
UJ
u
1
o




g



3

c!




S



i

S



CO
£

§!
o
1
00
| Chesapeake CEMS d-a




it



N

in
o>
0


S
S

in
r*


1 Chesapeake CEMS d-a




10
o>



g

§



co-
S

*

i
o
*"
	
Chesapeake CEMS d-a

-------

-------

-------
[Results table page: cell values not recoverable from the scanned original. Legible row labels: Chesapeake CEMS c-d, Chesapeake CEMS d-c.]

-------

-------
[Results table page: cell values not recoverable from the scanned original. Legible row label: Chesapeake CEMS e-c.]
-------
[Results table page: cell values not recoverable from the scanned original. Legible row labels: Chesapeake CEMS d-e, Chesapeake CEMS e-d.]

-------
-------
[Results table page: cell values not recoverable from the scanned original. Legible row labels: UARG Attachment E, Chesapeake CEMS a-b, Chesapeake CEMS b-a, Chesapeake CEMS a-c.]

-------
[Results table page: cell values not recoverable from the scanned original. Legible row labels: Chesapeake CEMS a-c, Chesapeake CEMS c-a, Chesapeake CEMS a-d.]


-------
[Results table page: cell values not recoverable from the scanned original. Legible row labels: Chesapeake CEMS a-d, Chesapeake CEMS d-a.]

-------
[Results table page: cell values not recoverable from the scanned original. Legible column heading: Computed value. Legible row label: CEMS a-e.]
-------

-------
[Results table page: cell values not recoverable from the scanned original. Legible row labels: Chesapeake CEMS b-e, Chesapeake CEMS e-b, Chesapeake CEMS c-d, Chesapeake CEMS d-c.]

-------
[Results table page: cell values not recoverable from the scanned original. Legible column heading: Computed value. Legible row labels: Chesapeake CEMS d-c, Chesapeake CEMS c-e.]

-------
[Results table page: cell values not recoverable from the scanned original. Legible column heading: Computed value. Legible row labels: Chesapeake CEMS c-e, Chesapeake CEMS e-c, Chesapeake CEMS d-e.]

-------
[Results table page: cell values not recoverable from the scanned original. Legible column heading: Computed value. Legible row labels (truncated in the scan): Chesapeake CEMS c, Chesapeake CEMS d, Chesapeake CEMS e.]

-------

-------
6.     Conclusions



       The analysis performed in this report indicates that the statistical tests for alternative
monitoring systems are stringent but not preclusive, particularly if augmented by a procedure that
compensates for variance underestimation due to autocorrelation.  Despite the absence of strict
QA/QC procedures in the field tests at the Chesapeake unit, a substantial number of subsets of
the paired CEMS/CEMS data passed the three prescribed statistical tests, whether or not a variance
inflation estimate was used (Table 5, Table 6, Table 9, and Table 10).  When variance inflation
estimates were applied to the available CSA/CEMS data, one subset passed all three statistical
tests (Table 6).  Two OSA/CEMS subsets passed all three tests when the data were analyzed at the
level of refinement of the alternative monitoring system (Table 7 and Table 8).  The latter results
suggest that under-performance on the correlation test may have been due to limitations in the
data rather than to the stringency of the test.  Having to pair hourly CEMS measurements with
daily AMS values for the Northern States Power Co. database and with weekly AMS values for
the Niagara Mohawk database is likely to have had a detrimental impact on correlation test
results.  Under the proposed regulations, which require hourly measurements for both the CEMS
and AMS, this confounding factor should not be present.
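       As an illustration of why autocorrelation leads to variance underestimation, the following
sketch computes a lag-1 autocorrelation estimate and the factor by which the variance of a sample
mean is inflated when the observations follow a first-order autoregressive, AR(1), process.  The
AR(1) assumption, the function names, and the example values are introduced here for illustration
only; the variance inflation procedure actually applied in this report is the one described in
Section 4 and Appendix B, and it may differ in detail.

```python
def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(x)
    mean = sum(x) / n
    # Sum of products of adjacent deviations from the mean.
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    # Sum of squared deviations from the mean.
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def ar1_variance_inflation(rho, n):
    """Ratio Var(sample mean) / (sigma^2 / n) for n observations of a
    stationary AR(1) process with lag-1 autocorrelation rho (assumed model)."""
    return 1.0 + 2.0 * sum((1.0 - k / n) * rho ** k for k in range(1, n))


if __name__ == "__main__":
    # Hypothetical example: 24 hourly differences with rho = 0.5.
    factor = ar1_variance_inflation(0.5, 24)
    print(f"variance inflation factor for n=24, rho=0.5: {factor:.3f}")
```

       For independent data (rho = 0) the factor is 1, so the usual variance estimate is unchanged.
For positive rho the factor exceeds 1 and, for large n, approaches (1 + rho)/(1 - rho), which is
why tests based on the uncorrected variance are too lenient when the data are autocorrelated.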

-------
                                     References
Box, George E.P. and Gwilym M. Jenkins. 1976. Time Series Analysis: Forecasting and Control.
Revised Edition. Holden-Day, San Francisco, CA.

Box, George E.P., William G. Hunter, and J. Stuart Hunter. 1978. Statistics for Experimenters.
John Wiley & Sons, New York, NY.

Clean Air Act Amendments, 1990. Public Law 101-549, 101st Congress, November 15, 1990.

Cochran, William G. 1977. Sampling Techniques. John Wiley & Sons, New York, NY.

40 CFR, Part 60. Code of Federal Regulations, Title 40, Protection of Environment, Part 60,
Standards of Performance for New Stationary Sources. Revised as of July 1, 1991.

40 CFR, Part 75. Code of Federal Regulations, Title 40, Protection of Environment, Part 75,
Continuous Emissions Monitoring: Proposed Rule. Federal Register, vol. 56, no. 232
(December 3, 1991), pp. 63291-63335.

Gujarati, Damodar N. 1988. Basic Econometrics. 2nd Edition. McGraw-Hill Book Company, New
York, NY.

Magee, Lonnie. 1989. Bias approximations for covariance parameter estimators in the linear
model with AR(1) errors. Commun. Statist. Theory Meth., 18(2):395-422.

Rawlings, John O. 1988. Applied Regression Analysis: A Research Tool. Wadsworth &
Brooks/Cole Statistics/Probability Series. Pacific Grove, CA.

Steel, Robert G.D. and James H. Torrie. 1980. Principles and Procedures of Statistics: A
Biometrical Approach. 2nd Edition. McGraw-Hill, New York, NY.

Wolter, Kirk M. 1984. An investigation of some estimators of variance for systematic sampling.
JASA. 79(388):781-790.

-------
                                       Appendices
       Three appendices supplement this report.

       Appendix A summarizes the results of screening each of the databases used in this study
for normality and autocorrelation.

       Appendix B is a paper by Dr. David A. Dickey, Professor of Statistics at North Carolina
State University, entitled "Effects of Autocorrelation on Statistical Analysis." It provides a
theoretical background for the discussion in Section 4 ("Autocorrelation Analysis") of this report.

       Appendix C provides documentation on the data subsets analyzed in this report.

-------
Appendix A

-------
                        THE CADMUS GROUP, INC.
                              Executive Park, Suite 220
                                 1920 Highway 54
                                Durham, NC 27713
                  Telephone: (919) 554-9454      Telefax: (919) 544-9453

                                 May 14, 1992
To:         Elliot Lieberman
            Emissions Monitoring Section, ARD

From:       William Warren-Hicks
            Susan E. Spruill
            Jane E. Mudano
            The Cadmus Group, Inc.

Subject:    Statistical Analysis of Alternative Monitoring Systems

Please find enclosed analyses and summaries for parts 2(a-f), 3, and 4(a-e) of your memo
dated April 21, 1992, requesting testing of alternative monitoring (AM) systems. The
following data were analyzed:

o     UARG data from Attachment E of Public Comments (Table 1, page 4). There are 24
      hours of CEM (A) and AM (B) data, but two hours are missing. All data were
      recorded in ppm.

o     Chesapeake data from Entropy (Section 75.21; EPA Contract No.  68-02-4462;
      Work assignment No. 91-156). One reference CEM (A) and four alternative CEMs
      (B-E) were monitored hourly  for approximately  63 days. All SO2 data were recorded
      in ppm.

o     Homer City Unit #1 (from KEA), recorded as daily CEM (lbs/MMBtu) and using daily
      coal sampling (CSA). Sampling covered a 730-day period.

o     Homer City Unit #3 (from KEA), recorded as daily CEM (lbs/MMBtu) and using daily
      coal sampling (CSA). Sampling covered a period of approximately 730 days.

o     Niagara Mohawk (from KEA), recorded as hourly CEM (lbs/MMBtu) and using weekly
      oil sampling (OSA). Sampling covered a period of approximately 455 days.

o     Northern States Power Company (from KEA), recorded as hourly CEM (lbs/MMBtu)
      and using daily coal sampling (CSA). Sampling covered a period of approximately
      730 days.

-------
The following summaries are labeled to correspond to the memorandum dated April 21,
1992:
2.     Screen data to determine whether they are normally distributed.

The SAS procedure UNIVARIATE was applied to all CEMs and AMs in order to:

      a)     determine the mean
      b)     determine the standard deviation
      c)     compute the Shapiro-Wilk test (or Kolmogorov test) for normality
      d)     graph normality (Q-Q) plots, and
      e)     graph frequency distribution histograms of the data.

Table 1 summarizes the univariate results: 2a, 2b, and 2c, above. For the test of
normality, note that the UNIVARIATE procedure uses the Shapiro-Wilk test whenever
there are fewer than 2000 observations, and automatically uses the Kolmogorov test
whenever there are 2000 or more observations. Please also note that there are a few
problems with these tests, which are discussed in part 3.

Normality plots (2d) and frequency histograms (2e) for each CEM and AM are also
enclosed. In addition, time-series plots were produced for the CEMs, AMs, and their
differences (CEM-AM).
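
The screening quantities in items (a) through (c) can be sketched outside SAS. The following is a minimal illustration in Python using only the standard library, not the UNIVARIATE procedure itself: it computes the mean, the standard deviation, and a Kolmogorov D statistic against a normal distribution fitted from the sample. The Shapiro-Wilk W is not implemented here, and the function names and the readings are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def kolmogorov_d(values):
    """Largest gap between the empirical CDF and a normal CDF fitted
    with the sample mean and standard deviation."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # Compare with the empirical CDF just after and just before x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def screen(values):
    """Return the summary quantities of items (a)-(c) for one monitor."""
    return {
        "n": len(values),
        "mean": mean(values),
        "std_dev": stdev(values),
        "kolmogorov_d": kolmogorov_d(values),
    }


# Hypothetical hourly readings in ppm (not taken from any database in this memo).
readings = [455.0, 468.0, 472.0, 481.0, 484.0, 489.0, 495.0, 502.0, 511.0, 530.0]
summary = screen(readings)
print(summary)
```

A D value near 0 indicates close agreement with the fitted normal curve; values toward 1 indicate departure from it.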
3.     Screen data that are not normally distributed to determine whether they are
       lognormally distributed.

It should be noted that nearly all variables failed the test for normality at the 95%
confidence level (α = 0.05). This is because the normality test is quite sensitive to large
sample sizes: with many observations, the test will generally reject the hypothesis that
the data are normally distributed. In addition, there are a few "outlier" observations in the
data sets we used, which may skew their distributions. Classical statistical theory, however,
assumes large sample populations to be normally distributed. Therefore, we do not
recommend the use of these tests for determining normality.

Instead, we recommend that you observe only the values of the normality statistics
(Shapiro-Wilk's W, and Kolmogorov's D), ignoring the associated probability, and visually
inspect the frequency histograms and Q-Q plots. Both statistics (W and D) have a range
between 0 and 1: W = 1 (or D = 0) would result if the data were perfectly normally
distributed; values approaching W = 0 (or D = 1) increase the probability that the data are
not normally distributed. Note that most statistics are very close to the extreme of the
range which denotes normality. In addition, nearly all frequency distributions demonstrate
symmetric curves with the mean approximately equal to the median. The Q-Q plots
demonstrate the straight diagonal alignment of the residuals which is typical of the normal
distribution. Based on these observations, we determined that all data are actually

-------
normally distributed, except for the AM data (CEM B) from the UARG table.
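
The two visual checks described above (a straight Q-Q diagonal and a mean close to the median) can also be expressed as numbers. The sketch below is an illustration only and was not part of the analysis: it correlates the sorted data with expected normal quantiles and reports the mean-median gap in standard-deviation units. The Blom plotting positions, the function names, and both data sets are assumptions.

```python
from statistics import NormalDist, mean, median, stdev


def qq_correlation(values):
    """Correlation between sorted data and expected standard-normal quantiles.
    Values near 1 correspond to a straight diagonal on a Q-Q plot."""
    xs = sorted(values)
    n = len(xs)
    std_normal = NormalDist()
    # Blom plotting positions (i - 0.375) / (n + 0.25): an assumed convention.
    qs = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mq, mx = mean(qs), mean(xs)
    sqq = sum((q - mq) ** 2 for q in qs)
    sxx = sum((x - mx) ** 2 for x in xs)
    sqx = sum((q - mq) * (x - mx) for q, x in zip(qs, xs))
    return sqx / (sqq * sxx) ** 0.5


def mean_median_gap(values):
    """(mean - median) in standard-deviation units; near 0 for symmetric data."""
    return (mean(values) - median(values)) / stdev(values)


symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical, evenly spread
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 50]     # hypothetical, one large outlier
print(qq_correlation(symmetric), mean_median_gap(symmetric))
print(qq_correlation(skewed), mean_median_gap(skewed))
```

The single outlier in the second data set both bends the Q-Q line and pulls the mean away from the median, which is the pattern the visual inspection is meant to catch.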

The Shapiro-Wilk test is appropriate for the UARG dataset because of its small sample
size (N = 22). The CEM (CEM A) data were found to be normally distributed. Both UARG
CEM A and CEM B were transformed using the natural log, and the univariate analyses were
rerun to determine if the transformation normalized their distributions. This transformation
was unsuccessful. Because CEM A and CEM B do not appear to come from the same
distribution, it is not appropriate to compute the difference between them. However, for
consistency, statistical summaries of these differences were reported.

There was no justification for testing for normality of the differences (CEM-AM).
Differences between normally distributed variables are also normally distributed, and all
variables analyzed are assumed to be normally distributed.
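
For reference, the standard result behind this statement can be written out as follows. It holds when the two series are jointly normal; the symbols X (CEM), Y (AM), and ρ (their correlation) are introduced here for illustration and do not appear in the memorandum.

```latex
X \sim N(\mu_X, \sigma_X^2), \quad
Y \sim N(\mu_Y, \sigma_Y^2), \quad
(X, Y)\ \text{jointly normal with correlation } \rho
\;\Longrightarrow\;
X - Y \sim N\!\left(\mu_X - \mu_Y,\; \sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X \sigma_Y\right)
```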
4.    Autoregression analysis using SAS AUTOREG procedure.

The AUTOREG procedure is not available on our SAS contract. However, autocorrelation
of CEMs and AMs can be assessed by a number of other methods:

      a)     Pearson correlation of CEM (or AM) values and the first-order lag of those
             values (i.e., ρ between CEM_t and CEM_{t-1}).
      b)     Regression of the CEM (or AM) values on the first-order lag of those values
             and observing the slope (β).
      c)     Time series regression of CEM (or AM) over time and computing the
             Durbin-Watson statistic (D) for first-order autocorrelation. This test is
             available in the AUTOREG procedure. As D approaches zero, the probability of
             significant autocorrelation increases. A table of critical values for D can be
             found in most statistics texts. For large samples (N > 100), the critical value
             is usually around D = 1.5 at α = 0.05.
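
The three checks in items (a) through (c) can be sketched as follows. This is an illustrative Python version using only the standard library, not the SAS code used for the analysis; the function names are hypothetical, and the input is a simulated first-order autoregressive series rather than monitor data.

```python
import random


def ols(xs, ys):
    """Least-squares fit y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def lag1_correlation(series):
    """Method (a): Pearson correlation between x[t] and x[t-1]."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mp, mc = sum(prev) / n, sum(curr) / n
    spp = sum((p - mp) ** 2 for p in prev)
    scc = sum((c - mc) ** 2 for c in curr)
    spc = sum((p - mp) * (c - mc) for p, c in zip(prev, curr))
    return spc / (spp * scc) ** 0.5


def lag1_slope(series):
    """Method (b): slope from regressing x[t] on x[t-1]."""
    return ols(series[:-1], series[1:])[1]


def durbin_watson_over_time(series):
    """Method (c): Durbin-Watson D for residuals of a regression on the time index."""
    t = list(range(len(series)))
    intercept, slope = ols(t, series)
    resid = [y - (intercept + slope * ti) for ti, y in zip(t, series)]
    num = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, len(resid)))
    return num / sum(e * e for e in resid)


# Simulated AR(1) series (coefficient 0.9) standing in for hourly monitor data.
rng = random.Random(1992)
series = [650.0]
for _ in range(499):
    series.append(650.0 + 0.9 * (series[-1] - 650.0) + rng.gauss(0.0, 10.0))

r1 = lag1_correlation(series)
beta = lag1_slope(series)
d = durbin_watson_over_time(series)
print(round(r1, 3), round(beta, 3), round(d, 3))
```

For a long series, D is approximately 2(1 - r), where r is the first-order autocorrelation of the regression residuals, so strong positive autocorrelation drives D toward zero and values near 2 indicate none.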

Table 2 summarizes the results of the above tests for CEM and AM data. Because of the high
autocorrelation that existed in nearly all CEM and AM data sets, differences between
CEM and AM were computed from the residuals of the regressions of these values on their
Lag1 values (part b, above). Residuals from such analyses are independent; therefore, differences
between the residuals of the CEM and the residuals of the AM are corrected for the
autocorrelation of the CEM and AM data. As a check, we ran a time series analysis of
these residual differences and then computed the Durbin-Watson autocorrelation statistic
(given in Table 2). All residual differences were uncorrelated.
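
The residual-difference step can be illustrated with a short sketch. It is a simplified Python rendering under stated assumptions, not the SAS analysis: each series is regressed on its own first-order lag, the residuals are differenced, and a Durbin-Watson statistic is computed on the mean-centered result (the memorandum's check used a time series regression instead). All names and the two simulated series are hypothetical.

```python
import random


def ols(xs, ys):
    """Least-squares fit y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def lag1_residuals(series):
    """Residuals from regressing x[t] on x[t-1] (method b)."""
    prev, curr = series[:-1], series[1:]
    intercept, slope = ols(prev, curr)
    return [c - (intercept + slope * p) for p, c in zip(prev, curr)]


def durbin_watson(values):
    """Durbin-Watson D of a series after centering it on its mean."""
    m = sum(values) / len(values)
    e = [v - m for v in values]
    num = sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
    return num / sum(x * x for x in e)


def simulate_ar1(rng, n, level, phi, sigma):
    """Simulated AR(1) series, used only as demonstration input."""
    out = [level]
    for _ in range(n - 1):
        out.append(level + phi * (out[-1] - level) + rng.gauss(0.0, sigma))
    return out


rng = random.Random(75)
cem = simulate_ar1(rng, 500, 650.0, 0.9, 10.0)
am = simulate_ar1(rng, 500, 645.0, 0.9, 12.0)

raw_diff = [c - a for c, a in zip(cem, am)]
resid_diff = [c - a for c, a in zip(lag1_residuals(cem), lag1_residuals(am))]
d_raw = durbin_watson(raw_diff)
d_resid = durbin_watson(resid_diff)
print(round(d_raw, 3), round(d_resid, 3))
```

In this simulation the raw differences inherit the autocorrelation of the two series (D well below 2), while the differences of the lag-1 residuals give D close to 2.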

-------
Table 1.  Summary of univariate analysis results.

                                              Standard     Normal
Data Source               N (1)       Mean    Deviation    Statistic (2)   Normal

UARG
   CEM A                     22     484.00        36.77    W = 0.9356      Yes
   CEM B (3)                 22     468.64        46.13    W = 0.8825      No

Chesapeake
   CEM A                   1617     649.81        95.08    W = 0.9707      Yes
   CEM B (3)               1342     641.92       114.91    W = 0.8915      Yes
   CEM C (3)               1560     593.36       152.13    W = 0.7617      Yes
   CEM D (3)               1448     642.13        89.04    W = 0.9548      Yes
   CEM E (3)               1521     644.38        89.00    W = 0.9606      Yes

Homer City
   Unit 1 CEM               497       2.42         0.19    W = 0.9716      Yes
   Unit 1 CSA (3)           572       2.53         0.26    W = 0.9944      Yes
   Unit 3 CEM               496       1.49         0.15    W = 0.9583      Yes
   Unit 3 CSA (3)           578       1.44         0.12    W = 0.9534      Yes

Niagara Mohawk
   CEM                     6801       0.61         0.08    D = 0.0637      Yes
   OSA (3)                   62       0.73         0.04    W = 0.9390      Yes

Northern States
   CEM                    16081       1.28         0.22    D = 0.1644      Yes
   CSA (3)                  667       1.46         0.14    W = 0.9785      Yes

(1)  Homer City alternative monitoring (AM) measured daily, Niagara Mohawk AM measured
     weekly, Northern States AM measured daily; all others measured hourly.

(2)  W = Shapiro-Wilk test, range: 0 ≤ W ≤ 1
     D = Kolmogorov test, range: 0 ≤ D ≤ 1
        As W approaches 0 (D approaches 1) the probability of rejecting N(μ, σ²)
        increases.

(3)  Alternative monitor

-------
Table 2. Summary of autoregression analysis results.

                                                              Durbin-Watson
                          Pearson (1)   Regression (2)    Autocorrelation (3)      D
Data Source               Correlation   Coefficient (β)                      Statistic (4)

UARG
   CEM A                    0.6841         0.6358              0.268            1.385
   CEM B (5)                0.6554         0.6659              0.023            1.925
   CEM A - CEM B (6)                                          -0.376            2.694

Chesapeake
   CEM A                    0.9347         0.9233              0.885            0.229
   CEM B (5)                0.9449         0.9401              0.892            0.214
   CEM C (5)                0.8631         0.8668              0.838            0.316
   CEM D (5)                0.9739         0.9759              0.960            0.071
   CEM E (5)                0.9099         0.9095              0.884            0.224
   CEM A - CEM B (6)                                          -0.391            2.783
   CEM A - CEM C (6)                                          -0.218            2.436
   CEM A - CEM D (6)                                          -0.304            2.608
   CEM A - CEM E (6)                                          -0.400            2.789

Homer City Unit 1
   CEM                      0.7431         0.7348              0.735            0.529
   CSA (5)                  0.7376         0.7348              0.737            0.524
   CEM - CSA (6)                                              -0.108            2.212

Homer City Unit 3
   CEM                      0.8507         0.8421              0.817            0.363
   CSA (5)                  0.7991         0.7998              0.792            0.412
   CEM - CSA (6)                                              -0.155            2.305

Niagara Mohawk
   CEM                      0.7834         0.7782              0.760            0.479
   OSA (5)                  0.9964         0.9964              0.996            0.007
   CEM - OSA (6)                                              -0.217            2.434

-------
Table 2 continued

                                                              Durbin-Watson
                          Pearson (1)   Regression (2)    Autocorrelation (3)      D
Data Source               Correlation   Coefficient (β)                      Statistic (4)

Northern States
   CEM                      0.8129         0.8088              0.754            0.492
   CSA (5)                  0.9754         0.9757              0.975            0.050
   CEM - CSA (6)                                              -0.091            2.181

(1)  Pearson correlation of value (CEM or AM) to its Lag1 value

(2)  Simple regression of original value (CEM or AM) on its Lag1 value

(3)  First-order autocorrelation from time-series regression

(4)  Durbin-Watson statistic (for N > 100, critical D = 1.5 at α = 0.05)

(5)  Alternative monitor

(6)  Differences computed from residuals in order to remove autocorrelation
     within CEMS and AMS

-------
Variable=CEM A
                             UARG attachment E data  15:07 Tuesday, May 12,  1992
                                                                               8

                              UNIVARIATE PROCEDURE
                Stem Leaf                     #
                  56 3                        1
                  54 35                       2
                  52 5                        1
                  50 1                        1
                  48 991366                   6
                  46 8959                     4
                  44 24937                    5
                  42 29                       2
                     	+	+	+	+
                 Multiply Stem.Leaf by 10**+1
                                             Boxplot
                                 Normal Probability Plot
               570+                                            * +++++
                                                        *  *+++++
                                                      *+++++
                                                 +++*+
                                           ++*****
                                      ++****
                                *+*+** *
               430+      * ++*+++
                       -2        -1         0        +1        +2

-------
Variable=CEM B
                             UARG attachment E data  15:07 Tuesday,  May 12,  199
                                                                              1

                              UNIVARIATE PROCEDURE
                Stem Leaf                     #
                  52 250                      3
                  50 19067                    5
                  48 955                      3
                  46 20                       2
                  44 6                        1
                  42 464                      3
                  40 35001                    5
                     	+	+	+	+
                 Multiply Stem.Leaf by 10**+1
       Boxplot
                                 Normal Probability Plot
               530+                    ** **+*+
                                    **++++
                                 **+++
               470+          +++*+
                         +++* **
                    *++*+  *
               410+    *++*
                       -2        -1         0        +1        +2

-------
[Plot not legible in the source scan; surviving axis label: (ppm)]

-------
[Plot not legible in the source scan]

-------
[Plot not legible in the source scan; surviving axis label: (ppm)]

-------
                             UARG attachment E data
                                                   17:27 Wednesday, May 13, 19
Model: MODEL1
Dependent Variable: CEM_A

                              Analysis of Variance

                         Sum of           Mean
Source          DF      Squares         Square      F Value       Prob>F
Model            1  19302.18052    19302.18052       42.451       0.0001
Error           20   9093.81948      454.69097
C Total         21  28396.00000

    Root MSE      21.32348     R-square       0.6797
    Dep Mean     484.00000     Adj R-sq       0.6637
    C.V.           4.40568

                              Parameter Estimates

                  Parameter       Standard    T for H0:
Variable  DF       Estimate          Error  Parameter=0    Prob > |T|
INTERCEP   1     176.008727    47.48895471        3.706        0.0014
CEM_B      1       0.657207     0.10086893        6.515        0.0001

-------
                                Chesapeake Data
                              UNIVARIATE PROCEDURE
                                9:16 Thursday, May 14, 1992
                                                          2
Variable=CEM A
                         Histogram                        #
    925+*                                                 3
       .******                                           57
       .********                                         79
       .***********                                     109
       .******************                              173
       .*****************************                   281
       .*********************************************   441
       .****************************                    275
       .**************                                  137
    475+******                                           53
       .*                                                 6
       .*                                                 2
                                                   Boxplot
                                                      0
                                                      0
                                                   *—+—*
                                                      0
                                                      0
     25+*

        * may represent up to 10 counts
                                 Normal Probability Plot
               925+                                                  *
                                                              ********
                                                          *****+++
                                                      *****+
                                                  *****
                                             +*****
                                       ********
                                ********
                          *******+
               475+********++
                   *+++
                   *
                25+*
                    +----+----+----+----+----+----+----+----+----+----+
                        -2        -1         0        +1        +2

-------
                                Chesapeake Data


                              UNIVARIATE PROCEDURE
9:16 Thursday, May 14,  1992
                          4
Variable=CEM B
                         Histogram
    925+*
       .*****
       .**********
       .**************
       .**************************
       .********************************
       .*********************************************
       .***********************************
       .****************
    475+*******
       .*
       .**
       .*
     25+**
        ----+----+----+----+----+----+
        * may represent up to 7 counts
     #
     2
    35
    65
    96
   180
   224
   314
   242
   107
    45
     3
    14
     1

     1

     1

    12
                   Boxplot
                      0

1
*- — I


1
(-— *

                      0
                      0

                      0

                      *
                                 Normal Probability Plot
               925+                                                 ++*
                                                               *******
                                                          ******
                                                      *****
                                                  *****
                                             ******
                                       *******
                                 *******+
                            ******+++
               475+     *****+++
                       +*+++
                   ++****
                     *
                25+**
                       -2        -1         0        +1        +2

-------
                                Chesapeake Data
                              UNIVARIATE PROCEDURE
                          9:16 Thursday, May 14, 1992
                                                    6
Variable=CEM C
                          Histogram                        #
    875+***                                               22
       .*******                                           54
       .*********                                         65
       .************                                      92
       .******************************                   238
       .************************************************ 380
       .******************************************       330
       .***********************                          180
       .**************                                   109
       .****                                              25
     25+*********                                         65
        ----+----+----+----+----+----+
        * may represent up to 8 counts
                                        Boxplot
                                           0
                                           0
                                        *-----*
                                 Normal Probability Plot
               875+                                  ++++ ****
                                                  +++******
                                               ++*****
                                            ++****
                                        +******
                                   *******
                             *******+
                         *****++++
                    ****** +++
                   **
                25+** *******
                       -2        -1         0        +1        +2

-------
  Chesapeake Data


UNIVARIATE PROCEDURE
                                                     9:16 Thursday, May 14, 1992
                                                                               8
Variable=CEM D
                         Histogram                        #
     925+*                                                1
        .*****                                           41
        .*******                                         58
        .**********                                      86
        .***************                                134
     675+*****************************                  256
        .********************************************   388
        .*********************************              295
        .****************                               142
        .*****                                           45
     425+*                                                2
                                          Boxplot
                                             0
                                             0
                                             0
                                          *—+—*
         * may represent up to 9 counts
                                 Normal Probability Plot
               925+                                                  *
                                                               *******
                                                           *****++++
                                                       * * * * *++
                                                   +****+
               675+                          ++******
                                        ********
                                 ********
                          ********
                   ********++
               425+*+++
                       -2        -1         0        +1        +2

-------
                                Chesapeake Data      9:16 Thursday, May 14, 1992
                                                                              10

                              UNIVARIATE PROCEDURE

Variable=CEM_E

                          Histogram                        #             Boxplot
     925+*                                                 5                0
        .*****                                           38                0
        .*******                                         61                0
        .**********                                      86
        .***************                                128
     675+**************************************         336             +-----+
        .********************************************   394             *--+--*
        .*******************************                279             +-----+
        .****************                               136
        .******                                          54
     425+*                                                4
         ---- + ---- + ---- + ---- + ---- + ---- + ---- + ---- + ----
         * may represent up to 9 counts


                                 Normal Probability Plot
               925+                                                  *
                                                               *******
                                                           *****++++
                                                        ****++
                                                      *
               675+                          +*******
                                       ********
                                 *******
                          ********
                   ********+
               425+*+++
                       -2        -1         0        +1        +2

-------
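[Plot not legible in the source scan; surviving axis label: (ppm)]

-------
[Plot not legible in the source scan; surviving axis label: (ppm)]

-------
[Plot not legible in the source scan]

-------
[Plot not legible in the source scan; surviving axis label: (ppm)]

-------
[Plot not legible in the source scan; surviving axis label: (ppm)]

-------
[Plot not legible in the source scan; surviving axis label: (ppm)]

-------
[Plot not legible in the source scan; surviving axis label: (ppm)]

-------
[Plot not legible in the source scan; surviving axis label: (ppm)]

-------
[Plot not legible in the source scan; surviving axis label: (ppm)]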

-------
                             Homer City daily data
                           30
  9:16 Thursday, May 14,  1992
                                     UNIT=1
                              UNIVARIATE PROCEDURE
Variable=SO2CEM
                         Histogram                          #
         2.85+***                                           9
             .********                                     23
             .*****************                            50
             .*********************************            98
             .**********************************          101
             .*******************************              91
             .**********************                       65
             .************                                 36
             .****                                         11
             .**                                            6
             .**                                            5
         1.75+*                                             2
              ----+----+----+----+----+----+----
              * may represent up to 3 counts
Boxplot
*--+--*
   0
   0
                                 Normal Probability Plot
              2.85+                                             +*****
                                                           ******
                                                      ******
                                                *******
                                          ******+
                                     ******+
                                ******
                            *****
                       ++***
                   +++***
                   **  *
              1.75+*

                       -2         -1          0         +1        +2

-------
                             Homer City daily data
                                                     9:16 Thursday, May 14, 199
                                     UNIT=1
                              UNIVARIATE PROCEDURE
Variable=AACS
                         Histogram                                 #
   3.35+*                                                          1
       .*                                                          2
   3.15+***                                                        5
       .******                                                    11
   2.95+**********                                                20
       .************************                                  48
   2.75+******************************                            60
       .*******************************************               85
   2.55+*****************************************                 81
       .*********************************************             90
   2.35+*********************************                         65
       .***********************                                   46
   2.15+******************                                        36
       .******                                                    12
   1.95+*****                                                      9
       .
   1.75+
       .
   1.55+
       .
   1.35+*
        ----+----+----+----+----+----+----+----+----+
        * may represent up to 2 counts
                                        Boxplot
                                           0
                                           0
                                        *--+--*
                                 Normal Probability Plot
              3.35+                                                  *
                  |
              3.15+                                               ****
                  |                                            *****
              2.95+                                        ****
                  |                                    ******
              2.75+                                ****
                  |                            *****
              2.55+                        ****
                  |                     *****
              2.35+                *****
                  |             *****
              2.15+       ******
                  |     ****+
              1.95+*+**+
                  |
              1.75+
                  |
              1.55+
                  |
              1.35+*
                       -2        -1         0        +1        +2

-------
[Plot not legible in the source scan]

-------
[Plot not legible in the source scan]

-------
[Plot not legible in the source scan]

-------
[Plot not legible in the source scan]

-------
[Plot not legible in the source scan]

-------
[Plot not legible in the source scan]

-------
                                 Niagara Mohawk
                              UNIVARIATE PROCEDURE
                                                 10:50 Wednesday, May 13, 1992
                                                                           142
Variable=SO2CEM
                        Histogram                         #
0.925+*                                                   1
     .*                                                  10
     .**                                                 65
     .******                                            244
     .**************                                    574
     .******************************                   1229
0.625+******************************************       1753
     .***********************************************  1971
     .***************                                   627
     .****                                              167
     .***                                               115
     .**                                                 43
0.325+*                                                   2

      * may represent up to 42 counts
                                                                         Boxplot
                                                                            0
                                                                            0
                                                                            0
                                                                         +-----+
                                                                         *--+--*
                                                                         +-----+
                                                                            0
                                                                            0
                                                                            0
                                 Normal Probability Plot
             0.925+                                                  *
                                                                     *
                                                                   ***
                                                             *******+
                                                       *******+
                                                 *******
             0.625+                       ********
                                 **********
                           *******++
                       +****++
                   ******
                   *
             0.325+*
                       -2        -1         0        +1        +2

-------
                           Niagara Mohawk OSA by week
                                                     7:20 Thursday, May 14, 199
Variable=SO2OSA
                              UNIVARIATE PROCEDURE
                Stem Leaf                     #
                  80 006                      3
                  79 09                       2
                  78 19                       2
                  77 01256                    5
                  76 025                      3
                  75 004                      3
                  74 56667                    5
                  73 234577                   6
                  72 277                      3
                  71 0344579                  7
                  70 1259                     4
                  69 1225579                  7
                  68 4489                     4
                  67 3356799                  7
                  66 8                        1
                     	+_.	+	+	+
                 Multiply Stem.Leaf by 10**-2
                                   Boxplot
                                   *—+—*
             0.805+
             0.735+
        Normal Probability Plot
                                    * *++
                                   *++
                               ***++
                            ****+
                           *+++
                         **+
                      ***
                    ***
                  +**
               +***
            ++***
          +****
       V***
* ** ***
             0.665+  *
                       -2        -1         0        +1        +2

-------
                                      SAS      10:50 Wednesday, May 13, 1992 12

                              UNIVARIATE PROCEDURE
Variable=SO2CEM

[Histogram and boxplot of SO2CEM, vertical axis 0.25 to 9.25.  Bar layout not
recoverable from the scan; bin counts as scanned: 1, 3, 7, 3, 2, 1, 3, 13, 613,
15005, 238, 191.]
        * may represent up to 313 counts
              9.25+
                                 Normal Probability Plot
                                                                     *

                                                                     *
              4.75+
                                                                     *
                                                                     *
                                                                     *
                                                                     *
                                                                     *
                                                       ++++++*********
                         *************************************
                   ++*****++++++
              0.25+***
                   +----+----+----+----+----+----+----+----+----+----+
                       -2         -1         0        +1        +2

-------
                      Northern States Power CSA daily data                    45
                                                     9:16 Thursday,  May 14,  1992

                              UNIVARIATE PROCEDURE
Variable=SO2CSA

                          Histogram                             #   Boxplot
   2.15+*                                                       1      *
       .*                                                       1      *
       .*                                                       1      0
       .**                                                      8      0
       .*****                                                  21      0
       .************                                           60      |
       .*****************************                         141   +-----+
       .***********************************************       235   *--+--*
       .****************************                          140   +-----+
       .********                                               40      |
       .***                                                    14      0
   1.05+*                                                       5      0
        * may represent up to 5 counts
              2.15+
                                 Normal Probability Plot
                                                                     *
                                                                     *
                                                                 *****
                                                             *****+++
                                                      +*******
                                               +********
                                       **********
                               *********
                         ******++
                   +******
              1.05+*
                   +----+----+----+----+----+----+----+----+----+----+
                       -2        -1         0        +1        +2

-------
Appendix B

-------
 EFFECTS OF AUTOCORRELATION ON STATISTICAL ANALYSES
 D.  A.  DICKEY
 Prepared July 28,  1992  for the Cadmus  Group
 INTRODUCTION

      In  this paper, we  review the  concept of autocorrelation, explain how
 to look  for  it,  and explain how to adjust for  it  in standard statistical
 formulas.  Formula  (7)  shows  the effect  on the variance  of a sample mean
 and formula  (10)  shows  the effect  on the estimate of  individual variance.
 The square root  of a variance is called  a standard deviation and  is needed
 to decide if data points or means are unusual.  For normally distributed
 data, a point more than 1.96 standard deviations from the mean will occur
 by chance only 5% of the time and hence is considered unusual.

      Likewise,  a sample  mean more than  1.96 standard deviations  from a
 hypothesized long run mean casts doubt on that long run  mean.  Here, of
 course,  standard deviation refers  to the standard deviation of a  sample
 mean.  When  the  standard  deviation of the mean is estimated from  the data,
 it is referred to as the  standard  error  of the mean.  We will consider a
 sample mean  more than 2 standard errors  from a hypothesized long  term mean
 as significant evidence against that long term mean but  the normal or t
 tables could be  used to provide a  slightly more accurate number than 2,
 depending on the sample size.
1. AUTOCORRELATION

     We use statistics to deal with variation.  For example, a certain type
of automobile may get on average 28 MPG  (miles per gallon), but individual
mileages will vary around this mean, some particular cars doing better and
some worse.  If automobile types are to be compared by sampling, this
variability must somehow be accounted for.

     Most statistical texts concern independent data.  For example, if I
take a random sample of automobiles, the fact that car 7 is over the mean
MPG does not lead me to expect anything in particular about the MPG of car
8 or car 6.  When one deviation from the mean tells us nothing about any
other, the data are said to be independent.  To look at an example where
this independence obviously would not hold, consider measurements of flow
rate in a stream taken every hour.  If the stream is flowing much faster
than average now, we would expect it to be flowing faster than average one
hour from now, that is, the stream is high now and an hour is not enough
time to clear the excess water from the stream.

     Pollutants in a stream, in the air, etc. may also exhibit this failure
of independence.  When data taken over time fail to be independent, we say
they are autocorrelated.  The most common type of autocorrelation is
positive, that is, positive deviations from the mean tend to be followed by
positive and negative deviations by negative.  In order to adjust standard
statistical formulas to deal with autocorrelation, it becomes necessary to
pin down the nature of the autocorrelation more precisely.  This is the
role of time series modeling.

     In 1976, a book Time Series Analysis: Forecasting and Control by G. E.
P. Box and G. M. Jenkins (Holden-Day publishers) popularized time series
modeling.  The authors stressed models called AutoRegressive Integrated

-------
     Moving Averages,  or ARIMA models.   A subset of these models,  autoregressive
     or AR models,  is  discussed below.   This subset forms a relatively simple
     yet powerful class of models.

          The autoregressive model  of order 1,  AR(1),  is written as

                    Y(t)  = M + r( Y(t-1)  - M )  + e(t)     , t=1,2,3,...       (1)

     where Y(t) is the value of the data at time t, for example the flow rate of
     a river at hour t.  M is the process long term mean and r is a number
     strictly between -1 and 1.  We interpret r as a proportion when r>0.
     Finally e(t)  is an unanticipated error or "shock" to the system as it is
     sometimes called.  This e(t) series is assumed to be an independent
     sequence with mean 0 and constant variance.

          Model (1) expresses the deviation of Y(t) from the mean M as a
     proportion r of the previous deviation plus an unanticipated shock e(t)
     and hence is quite realistic for many economic and physical situations.
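
     As a rough illustration of model (1), the sketch below generates a
     series in Python rather than SAS.  The function name, its arguments, and
     the choice to pass the first value explicitly are assumptions made for
     this example only.

```python
import random


def simulate_ar1(n, mean, r, shock_sd, start, seed=0):
    """Generate n values from Y(t) = M + r*(Y(t-1) - M) + e(t).

    `start` is used as Y(1); this starting rule is an assumption of the sketch.
    """
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    rng = random.Random(seed)
    series = [start]
    for _ in range(n - 1):
        shock = rng.gauss(0.0, shock_sd)  # independent shock e(t), mean 0
        series.append(mean + r * (series[-1] - mean) + shock)
    return series


# With the shocks switched off, a deviation of 10 shrinks by the factor r each step.
no_shocks = simulate_ar1(n=4, mean=100.0, r=0.7, shock_sd=0.0, start=110.0)
print([round(v, 2) for v in no_shocks])

# With shocks, each value is pulled toward the previous deviation and then perturbed.
noisy = simulate_ar1(n=10, mean=100.0, r=0.7, shock_sd=10.0, start=100.0, seed=1)
print([round(v, 1) for v in noisy])
```

     The noise-free run makes the "proportion r of the previous deviation"
     reading of the model visible: each deviation is 0.7 times the one before.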

          The AR(1) model can be extended to a general AR(k)  model in which Y(t)
     depends on k previous values as

     Y(t)  = M + P1 ( Y(t-1)  - M ) + P2  ( Y(t-2) - M )  + ...
                           + Pk ( Y(t-k)  - M )  + e(t)     , t=1,2,3,...       (2)

     where P1, P2,  ..., Pk are numbers called autoregressive coefficients and
     the previous values Y(t-1), Y(t-2),  etc. are referred to as lags of Y.


     2.  REGRESSION ESTIMATES

          In section 1, the class of ARIMA models popularized by Box and Jenkins
     was introduced and one model,  the autoregressive order 1 or AR(1) was
     singled out.   In  this and the  next section, we look at how we can tell if
     AR(1)  is appropriate for our data.  The main tools here will be least
     squares regression and the autocorrelation function.

          Least squares regression  is a topic covered in standard statistical
     textbooks.   The application of regression to time series is covered in
     detail in chapter 8 of the book Introduction to Statistical Time Series by
     Wayne Fuller (Wiley 1976).  In an example on pages 341 and 342, Fuller shows
     how to determine the necessary number of lags in a model by running a
     regression on many lags and then using the standard test statistics, t and
     F, produced by most regression programs to decide how many lags can be
     omitted.  If we can omit all but Y(t-1) from our model, then the AR(1) is
     appropriate.
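
     The simplest version of this regression, with a single lag, can be
     sketched in Python as follows.  It only fits Y(t) on Y(t-1); Fuller's
     procedure of fitting many lags and testing them with t and F statistics
     is not reproduced here, and the function name is hypothetical.

```python
def fit_ar1_by_regression(series):
    """Least squares regression of Y(t) on Y(t-1); returns (r_hat, mean_hat)."""
    x = series[:-1]   # lagged values Y(t-1)
    y = series[1:]    # current values Y(t)
    m = len(x)
    x_bar = sum(x) / m
    y_bar = sum(y) / m
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx                    # estimate of r
    intercept = y_bar - slope * x_bar    # estimates M*(1 - r) under model (1)
    return slope, intercept / (1 - slope)


# Noise-free check: deviations from 100 that shrink by exactly 0.7 each step.
demo = [100 + 10 * 0.7 ** t for t in range(20)]
r_hat, mean_hat = fit_ar1_by_regression(demo)
print(round(r_hat, 4), round(mean_hat, 2))
```

     Rewriting model (1) as Y(t) = M(1-r) + r Y(t-1) + e(t) shows why the
     slope estimates r and the intercept divided by (1 - slope) estimates M.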

          Alternatively,  we could look at the partial autocorrelation function
     which is computed by most time series packages,  such as PROC ARIMA in the
     SAS computer package (SAS is the registered trademark of SAS Institute,
     Cary, N. Carolina).   The jth partial autocorrelation coefficient is
     essentially the lag j coefficient in the multiple regression of Y(t) on
     Y(t-1),  ...,  Y(t-j)  as explained in The SAS System for Forecasting Time
     Series by J. C. Brocklebank and D. A. Dickey (SAS Institute publishers).
     Only the lag 1 partial autocorrelation would estimate a nonzero value for
     an AR(1).

-------
   3.  AUTOCORRELATION FUNCTION

       Another way to determine if an AR(1) is appropriate is to look at the
   autocorrelation function R(j).  R(j) is the correlation between Y(t) and
   Y(t-j) where j is called the lag number.  This function is produced by
   most time series computer programs, for example PROC ARIMA and PROC AUTOREG
   in the SAS computer package.

       Specifically, R(j) = G(j)/G(0) where G(j) is called the autocovariance
   function and is defined as the covariance between Y(t) and Y(t-j).  Letting
   the variance of e be denoted V(e), we find from Fuller (page 37 equation
   2.3.5) that the covariance at lag j for an AR(1) model is

            Autocovariance =  G(j) = r**j V(e)/(1-r**2)                  (3)

  where ** denotes exponentiation and * denotes multiplication, e.g. 3*5=15,
  3**2 = 3*3=9, 2**3 = 2*2*2=8. Now the variance of Y is G(0), that is, the
  variance of Y is, for an AR(1) model,

            variance of Y =  G(0) = V(e)/(1-r**2)

  so we see that the variance of the shocks, V(e), can be quite different
  from the variance, G(0), of the data if r is near 1.

       Using R(j) = G(j)/G(0) and equation (3) it is easily seen that

             R(j) = r**j                                            (4)

  for the AR(1) model.  It is important to note that these formulas are only
  appropriate for autoregressive order 1 models, not for moving average
  models, general ARIMA models, or general autoregressive order k models.
  Equation (4) shows that the autocorrelation function of an AR(1) decays
  exponentially.
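
       Formulas (3) and (4) are short enough to evaluate directly.  The
  Python sketch below does so and also computes a sample autocorrelation
  function; the function names are hypothetical, and dividing every lag's sum
  by n is an assumed convention rather than something stated in this paper.

```python
def ar1_autocovariance(j, r, shock_var):
    """Formula (3): G(j) = r**j * V(e) / (1 - r**2)."""
    return r ** j * shock_var / (1 - r ** 2)


def ar1_autocorrelation(j, r):
    """Formula (4): R(j) = r**j."""
    return r ** j


def sample_autocorrelation(series, max_lag):
    """Sample R(0..max_lag), dividing every lag's sum by n (assumed convention)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev) / n
    out = []
    for j in range(max_lag + 1):
        cj = sum(dev[t] * dev[t - j] for t in range(j, n)) / n
        out.append(cj / c0)
    return out


# r = .7 with shock variance V(e) = 100.
g0 = ar1_autocovariance(0, 0.7, 100.0)
print(round(g0, 2))
print([round(ar1_autocorrelation(j, 0.7), 3) for j in range(4)])
print(sample_autocorrelation([1, 2, 3, 4, 5], 1))
```

  With r = .7 the variance of the data, G(0), comes out at nearly twice the
  shock variance V(e), which is the gap between the two variances noted above.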
  4. EXAMPLE

       A dataset of 60 observations using model (1) with r=.7 and M=100 is
  generated in SAS and analyzed with PROC ARIMA.  Here is the program and
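  part of the output:

                               PROGRAM

/* First value: drawn with the stationary standard deviation of the AR(1) */
data epa; y=100 + 10*normal(1827651)/sqrt(1-.7**2); output;
/* Remaining values follow model (1) with r=.7, M=100, shock s.d. 10 */
 do i=2 to 60; y=100 + .7*(y-100) + 10*normal(1827631); output;
end;

/* Print autocorrelations and partial autocorrelations out to lag 10 */
proc arima; identify var=y nlag=10;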

                               OUTPUT

                     Name of variable = Y.
                     Mean of working series = 91.87989
                     Standard deviation     = 9.979986
                     Number of observations = 60

       Notice that the autocorrelations die off in approximately an
  exponential manner at least for the first few lags.  The dots represent two
  standard errors so lines of asterisks extending beyond the two standard

-------
   errors indicate statistically significant (nonzero) autocorrelations.  We
   are saying that if the autocorrelation were truly 0,  an estimated
   autocorrelation more than 2 standard errors from 0 would be unusual and
   hence we reject the idea of 0 autocorrelation based on our estimate. Here,
   of course, standard error refers to a standard error appropriate for
   autocorrelation estimates.
                               Autocorrelations

   Lag   Covariance   Correlation   -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
     0    99.600        1.00000     |                    |********************|
     1    49.884765     0.50085     |                    |**********          |
     2    25.278038     0.25380     |                    |*****               |
     3     7.336610     0.07366     |                    |*                   |
     4     7.443015     0.07473     |                    |*                   |
     5    11.986170     0.12034     |                    |**                  |
     6    -1.278828    -0.01284     |                    |                    |
     7     1.552965     0.01559     |                    |                    |
     8     0.862817     0.00866     |                    |                    |
     9    -6.552627    -0.06579     |                   *|                    |
    10   -11.546358    -0.11593     |                  **|                    |

   In the printed plot "." marks two standard errors; those markers are not
   recoverable from the scan and are omitted here.

        The partial autocorrelations show one nonzero lag value,  .50085,  and
   the rest are insignificant,  being within the two standard error bounds:
                           Partial Autocorrelations

   Lag   Correlation   -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
     1     0.50085     |                    |**********          |
     2     0.00393     |                    |                    |
     3    -0.07331     |                   *|                    |
     4     0.08643     |                    |**                  |
     5     0.08939     |                    |**                  |
     6    -0.16725     |                 ***|                    |
     7     0.08753     |                    |**                  |
     8     0.01601     |                    |                    |
     9    -0.14884     |                 ***|                    |
    10    -0.05415     |                   *|                    |

        The estimated autocorrelations and partial autocorrelations seem to be
   in line with what would be expected for an autoregressive order 1 series.
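
        The link between the two listings can be checked numerically.  One
   standard way to turn autocorrelations into partial autocorrelations is the
   Durbin-Levinson recursion; whether PROC ARIMA uses exactly this recursion
   is an assumption here, and the function name is hypothetical.  The input
   values are the printed lag 1 to lag 10 autocorrelations.

```python
def partial_autocorrelations(rho):
    """Durbin-Levinson recursion.

    rho[k-1] is the lag k autocorrelation; returns the partial
    autocorrelations for lags 1..len(rho).
    """
    pacf = []
    phi_prev = []          # coefficients of the order (k-1) autoregression
    for k in range(1, len(rho) + 1):
        num = rho[k - 1] - sum(phi_prev[j] * rho[k - 2 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j] for j in range(k - 1))
        phi_kk = num / den
        phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


printed_acf = [0.50085, 0.25380, 0.07366, 0.07473, 0.12034,
               -0.01284, 0.01559, 0.00866, -0.06579, -0.11593]
for lag, value in enumerate(partial_autocorrelations(printed_acf), start=1):
    print(lag, round(value, 5))
```

   The first few values can be compared with the Partial Autocorrelations
   listing; small differences in the last digit are expected because the
   inputs are rounded to five decimals.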
   5.  VARIANCE OF THE MEAN,  INDEPENDENT SAMPLES

        This section deals with sample means.  Returning to the example of MPG
   in cars, suppose a particular brand of car has mean 28 MPG (for the entire
   fleet of all such cars ever to be produced). If I take a random sample of
   10 cars from the production line and measure MPG, I might get a sample
   average 25.0 MPG.  Another sample of 10 might have a sample average 27.2
   MPG and another 28.3.  Is this much variation in means of samples of 10 cars
   reasonable?  It depends on the individual car-to-car variation, V, in MPG.

-------
 If I know the variance V of MPG from car to car,  I can compute the variance
 among sample means from samples of size 10 to see if 25.0,  27.2,  and 28.3
 are reasonable numbers.   The formula when the data are independent is
                   variance of means = V/n                                (5)

 where V is the individual car-to-car variance for these independent cars,
 and n is the number of cars in each sample.  This formula is very well
 known; for example, see Snedecor and Cochran's Statistical Methods, eighth
 edition, page 43 (Iowa State University Press, publisher).  For a time
 series, of course, formula (5) becomes

               variance of means = G(0)/n                                 (5a)
      If the estimated variance in MPG is 14.4, then the estimated variance
 associated with a mean of 10 cars would be 14.4/10 = 1.44 and the
 corresponding standard error of the mean (square root of this variance) is
 1.2.  The sample means 27.2 and 28.3 are then within 2 standard errors of
 the fleet average 28 MPG, while 25.0 is 3/1.2 = 2.5 standard errors below
 it.  If, instead of 1.2, the standard error were 0.5, then the sample mean
 25 would be far more unusual since it would now be 3/0.5 = 6 standard errors
 away from the fleet average.  Clearly, the decision of whether a sample mean is
 statistically significantly far  from any  stated value depends on having the
 correct standard error  available.  Recall that the  cutoff value  of 2
 standard errors can be  refined by referring to tables of  the t
 distribution.
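
      The arithmetic of this example can be laid out in a few lines of
 Python.  The variable names are illustrative; the numbers are the ones used
 in the text, and the cutoff of 2 standard errors is the rough rule adopted
 in the introduction.

```python
import math

fleet_mean = 28.0            # long run mean MPG from the text
individual_variance = 14.4   # estimated car-to-car variance from the text
n = 10                       # cars per sample

# Formula (5) gives the variance of a sample mean; its square root is the
# standard error of the mean.
std_error = math.sqrt(individual_variance / n)

for sample_mean in (25.0, 27.2, 28.3):
    z = (sample_mean - fleet_mean) / std_error
    flag = "beyond 2 standard errors" if abs(z) > 2 else "within 2 standard errors"
    print(f"{sample_mean:5.1f}  {z:6.2f}  {flag}")
```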


 6.  EFFECT OF AUTOCORRELATION ON THE VARIANCE  OF THE  MEAN.

      For autocorrelated data, the variance  of  a sample  mean  is no longer
 given by formula (5).   A  formula giving an  approximation  to  the  variance is
 on page 194, (6.3.17) of  Box and Jenkins.   This same  formula in  a different
 form  is given in Corollary 6.1.1.2, page  232 of Fuller.   The formula, while
 approximate, holds for a large subset of ARIMA models including the AR(1)
 we are discussing here.

      We can do better than an approximation if we restrict ourselves to
 the AR(1)  case. In particular, the Fuller text page 232 line 10  shows that
 the variance of a sample mean of n consecutive values of  a time  series  Y(t)
 is exactly given by
    [n G(0) + 2(n-1) G(1) + 2(n-2) G(2) + 2(n-3) G(3) + ...
                                            + 2 G(n-1)]/(n*n)            (6)

where G(j) is the autocovariance of the time series in question.  In the
AR(1) case, G(j) is given by (3) and is seen to be G(j) = G(0) r**j, so we
can plug G(j) = G(0) r**j into expression (6) and do some algebraic
simplification to get the desired variance.  Note that formulas (3) and (6)
are all we really need.  One could write a computer program to evaluate (3)
and use it in (6) for every case; however, algebraic reduction will give us
a nice formula, (7), for the AR(1) case that will be easy to use.
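
     The "computer program" route mentioned above might look like the
following Python sketch, which sums expression (6) term by term.  The
function names are hypothetical, and the AR(1) autocovariance
G(j) = G(0) r**j is supplied as a small helper.

```python
def variance_of_mean_from_autocovariances(n, autocov):
    """Expression (6); autocov(j) must return G(j) for j = 0..n-1."""
    total = n * autocov(0)
    for j in range(1, n):
        total += 2 * (n - j) * autocov(j)
    return total / (n * n)


def ar1_autocov(g0, r):
    """Return the function j -> G(0) * r**j for an AR(1) series."""
    return lambda j: g0 * r ** j


# Independent data (r = 0): the sum collapses to G(0)/n.
print(variance_of_mean_from_autocovariances(10, ar1_autocov(14.4, 0.0)))

# Autocorrelated data: n = 50, r = .8, G(0) = 128.
print(round(variance_of_mean_from_autocovariances(50, ar1_autocov(128.0, 0.8)), 2))
```

Because the helper accepts any autocovariance function, the same routine
could be reused for models other than the AR(1) if their G(j) were known.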

     I now show the algebra just to provide a technical reference. Let

        S =  [n + 2(n-1) r + 2(n-2) r**2 +  ... + 2 r**(n-1)]

so that expression (6) is S*G(0)/(n*n).  Notice that, for independent data,
r=0 so S=n and expression (6) reduces to expression (5a). All we need to do

-------
  is algebraically reduce S and then calculate S*G(0)/(n*n).  Now write S and
  then multiply S by r, getting

  S   =  [n + 2(n-1) r + 2(n-2) r**2 + ... + 2 r**(n-1)]
  rS  =  [        n r + 2(n-1) r**2 + 2(n-2) r**3 + ... + 2 r**n]

  and subtracting,

  (1-r)S = n + (n-2)r - 2[r**2 + r**3 + r**4 + ... + r**n]

         = n + n r - 2[r + r**2 + r**3 + r**4 + ... + r**n]

         = n(1+r) - 2r[1 + r + r**2 + r**3 + ... + r**(n-1)]

         = n(1+r) - 2rD

  where D = [1 + r + r**2 + r**3 + ... + r**(n-1)].  Computing D and rD we
  have

  D   = [1 + r + r**2 + r**3 + ... + r**(n-1)]
  rD  = [    r + r**2 + r**3 + ... + r**(n-1) + r**n]

  and subtracting

  (1-r)D = 1-r**n   so D = (1-r**n)/(1-r).

  Now that we have D and (1-r)S = n(1+r) - 2rD we solve for S as

  S = n(1+r)/(1-r) - 2r[ (1-r**n)/(1-r) ]/(1-r)

    = n(1+r)/(1-r) - 2r(1-r**n)/(1-r)**2.

 Remember that expression (6), the variance of the mean in the AR(1) case,
 is S G(0)/(n*n).  Thus expression (6) becomes, for the AR(1) case,

   ( G(0)/n ) [ (1+r)/(1-r) - 2(r/n)(1-r**n)/(1-r)**2 ]              (7)

 This is our target  formula.  Notice  that  if  the data are independent,  r=0
 and the expression  in square brackets  becomes  1.   Since  G(0)  is the
 variance of Y this  gives the well  known formula (5a)  for the  variance  of
 the mean of n independent  observations.   The expression  in square brackets
 is thus seen to be an inflator for the usual variance formula.  This
 expression does not approach 1 when n gets large (for fixed r it approaches
 (1+r)/(1-r)) and hence is an important adjustment even in very large samples.

      As an example, with 50 observations, autocorrelation r=.8 (first  order
 autocorrelation to  be exact), and  process variance G(0)  = 128,  the usual
 formula for the variance of the mean in independent data would give
 G(0)/n=128/50 =2.56 and the standard  error  would be 1.6.   This is
 incorrect, however, since formula (7) shows that 2.56 should be multiplied by

       [(1+r)/(1-r) - 2(r/n)(1-r**n)/(1-r)**2] =
       (1.8)/.2 - 2(.016)(1-.8**50)/(.04) =
       9 - 0.032(1-0.0000143)/(.04) = 8.2

 giving the proper variance  20.99  and  standard  error 4.58.   Notice that this
 is quite an adjustment.
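
      The same example can be reproduced with a short Python sketch of
 formula (7).  The function names are hypothetical; the last lines compare
 the closed form with a term-by-term evaluation of expression (6) as a check
 on the algebra.

```python
import math


def ar1_variance_inflator(n, r):
    """Bracketed factor in formula (7)."""
    return (1 + r) / (1 - r) - 2 * (r / n) * (1 - r ** n) / (1 - r) ** 2


def ar1_variance_of_mean(n, g0, r):
    """Formula (7): variance of the mean of n consecutive AR(1) values."""
    return (g0 / n) * ar1_variance_inflator(n, r)


n, r, g0 = 50, 0.8, 128.0
naive = g0 / n                              # usual formula for independent data
adjusted = ar1_variance_of_mean(n, g0, r)   # formula (7)
print(round(naive, 2), round(math.sqrt(naive), 2))
print(round(ar1_variance_inflator(n, r), 2))
print(round(adjusted, 2), round(math.sqrt(adjusted), 2))

# Cross-check against the term-by-term sum in expression (6).
direct = (n * g0 + sum(2 * (n - j) * g0 * r ** j for j in range(1, n))) / (n * n)
print(abs(direct - adjusted) < 1e-9)
```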

-------
7. EFFECT ON THE ESTIMATE OF INDIVIDUAL VARIANCE

     In the above example, the variance of the data, G(0), was assumed to
be the known value 128.  Of course, the true variance is never known and it
must be estimated from the data, using the sample variance.  The usual
formula for the sample variance is

S**2 = [(Y(1) - Ybar)**2 + (Y(2) - Ybar)**2 + ... + (Y(n) - Ybar)**2]/(n-1)

     = [Y(1)**2 + Y(2)**2 + ... + Y(n)**2 - n Ybar**2]/(n-1)           (8)

where Ybar = [ Y(1) + Y(2) + ... + Y(n) ]/n is the sample mean.  For the
numbers 10, 12, 13, 7, 8 we get Ybar = 10 and

S**2 = [0 + 4 + 9 + 9 + 4]/4 = [100+144+169+49+64-2500/5]/4
     = 26/4 = 6.5.
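
     Both forms of the sample variance formula can be checked on these five
numbers with a few lines of Python.  The function names are illustrative.

```python
def sample_variance_deviations(values):
    """First form of (8): squared deviations from the sample mean over n-1."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def sample_variance_shortcut(values):
    """Second form of (8): sum of squares minus n times the squared mean."""
    n = len(values)
    mean = sum(values) / n
    return (sum(v * v for v in values) - n * mean ** 2) / (n - 1)


data = [10, 12, 13, 7, 8]
print(sample_variance_deviations(data), sample_variance_shortcut(data))
```
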
Why is n-1 instead of n used in the denominator of S**2?  To answer this,
use lower case letters and let y(t) = Y(t)-M where, as before, M is the
theoretical mean.  The first part of expression (8) shows that y(t) can be
substituted for Y(t) and we get

(n-1)S**2 = [(y(1) - ybar)**2 + (y(2) - ybar)**2 + ... + (y(n) - ybar)**2]

          = [y(1)**2 + y(2)**2 + ... + y(n)**2 - n ybar**2]              (9)

where ybar = Ybar-M is the mean of the y(t) values and Ybar is the sample
mean of the Y(t) values (y(t)-ybar = Y(t)-M - Ybar+M).

     Now the last part of expression (9) is the numerator of S**2 and if we
take its expected value (the mean value in repeated sampling) assuming
independent observations, we get

    G(0) + G(0) + G(0) + ... + G(0) - n (G(0)/n) = (n-1)G(0)           (10)

because the expected value of each y(t)**2 is by definition the variance
of Y (this being G(0)) and the last term is by definition n times the
variance of the sample mean. This shows that division by n-1 is required
to get an unbiased estimator, that is, one whose expected value is the
quantity to be estimated. Notice that if the data are from an AR(1)
process, the n(G(0)/n) in formula (10) would need to be replaced by

    n (G(0)/n) [ (1+r)/(1-r) - 2(r/n)(1-r**n)/(1-r)**2 ]

so that the numerator of S**2 would now estimate

    G(0)[ n - (1+r)/(1-r) + 2(r/n)(1-r**n)/(1-r)**2 ]

and S**2 would estimate this divided by (n-1), namely

    G(0)[ 1 + [1/(n-1)][-2r/(1-r)] + 2(r/n)(1-r**n)/(1-r)**2/(n-1) ]

so that if we divided S**2 by

-------
  [ 1 - 2r/[(n-1)(1-r)] + 2(r/n)(1-r**n)/(1-r)**2/(n-1) ]          (10)

 we would have an unbiased estimate of G(0).  In the example from section 6,
 the sample variance estimates [(50)(128) - (50)(20.99)]/49 = .853 G(0); in
 other words we would multiply the sample variance by 1/0.853 = 1.172 to get
 an unbiased estimate.
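
      A short Python sketch of the bias factor reproduces these numbers for
 n = 50 and r = .8.  The function name is hypothetical.

```python
def ar1_sample_variance_bias(n, r):
    """Bias factor: the multiple of G(0) that the sample variance estimates."""
    tail = 2 * (r / n) * (1 - r ** n) / (1 - r) ** 2 / (n - 1)
    return 1 - 2 * r / ((n - 1) * (1 - r)) + tail


factor = ar1_sample_variance_bias(50, 0.8)
print(round(factor, 3), round(1 / factor, 3))

# For independent data (r = 0) the factor is exactly 1: no correction needed.
print(ar1_sample_variance_bias(50, 0.0))
```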

      Expression  (10) approaches 1 as n gets large and  hence becomes less
 important as the sample size increases although for fixed n, the value of r
 also plays an important role.


 8. SUMMARY

      In this paper two formulas were developed for dealing with
 autocorrelation.  Autocorrelation is assumed to arise  from a first order
 autoregressive process with autocorrelation parameter  r and process
 variance G(0).

   ( G(0)/n ) [ (1+r)/(1-r) - 2(r/n)(1-r**n)/(1-r)**2 ]             (7)

   [ 1 - 2r/[(n-1)(1-r)] + 2(r/n)(1-r**n)/(1-r)**2/(n-1) ]          (10)

 Formula  (7) shows a multiplicative adjustment to the usual formula, G(0)/n,
 for the variance of a sample mean.  For r>0 we conclude that
 autocorrelation increases the variability of sample means around the long
 run mean.  The sample variance estimates a multiple of the true variance of
 individuals.  Formula (10) gives that multiple. If r=0, the multiple is 1
 so the sample variance is unbiased.  If r>0, the multiple is less than 1
 and we say the sample variance is biased downward.

      Since the sample variance is defined in terms of deviations from the
 sample mean, the interpretation is that the points vary less around the
 sample mean under autocorrelation than they do for independent data.
 Dividing the sample variance by expression (10) provides an unbiased
 estimator. As n gets large, expression (10) approaches 1 for any fixed r
 and hence is less important for large samples.

      The formulas involve r which is an unknown, but estimable, quantity.
 Plugging in an estimated r obviously produces an approximate adjustment.
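
      Putting the two adjustments together with a plug-in estimate of r
 might look like the Python sketch below.  Estimating r by the lag 1 sample
 autocorrelation is one common choice and is an assumption here, as are the
 function names; the toy series is chosen only so that the arithmetic is easy
 to follow (its lag 1 sample autocorrelation works out to 0.7) and is not
 real monitoring data.

```python
import math


def lag1_autocorrelation(series):
    """Lag 1 sample autocorrelation, used as the plug-in estimate of r."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    return sum(dev[t] * dev[t - 1] for t in range(1, n)) / sum(d * d for d in dev)


def adjusted_mean_standard_error(series):
    """Return (r_hat, adjusted variance estimate, standard error of the mean)."""
    n = len(series)
    mean = sum(series) / n
    s2 = sum((v - mean) ** 2 for v in series) / (n - 1)   # usual sample variance
    r = lag1_autocorrelation(series)                       # plug-in estimate of r
    tail = 2 * (r / n) * (1 - r ** n) / (1 - r) ** 2
    bias = 1 - 2 * r / ((n - 1) * (1 - r)) + tail / (n - 1)   # expression (10)
    if bias <= 0:
        raise ValueError("bias factor is not positive; series too short for this r")
    g0_hat = s2 / bias                                     # bias-corrected variance
    inflator = (1 + r) / (1 - r) - tail                    # bracket in formula (7)
    return r, g0_hat, math.sqrt(g0_hat / n * inflator)


toy = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
r_hat, g0_hat, se = adjusted_mean_standard_error(toy)
print(round(r_hat, 3), round(g0_hat, 2), round(se, 2))
```

 Because r is itself estimated, the resulting standard error is only
 approximate, as noted above.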


 REFERENCES

 Box, G. E. P. and G. M. Jenkins (1976).  Time Series Analysis: Forecasting
 and Control.  Holden-Day.

 Brocklebank, J. C. and D. A. Dickey (1986).  The SAS System for Forecasting
 Time Series.  SAS Institute, Cary, N.C.

 Fuller, Wayne (1976).  Introduction to Statistical Time Series.  Wiley.

 Snedecor, G. W. and W. G. Cochran (1989).  Statistical Methods, eighth
 edition.  Iowa State University Press.
 SYMBOLS

-------
M       theoretical, or long term, mean
V       theoretical variance
G(0)    theoretical variance of time series
V(e)    theoretical variance of shocks in time series

G(j)    covariance at lag j
R(j)    autocorrelation at lag j.    R(j) = G(j)/G(0)
AR(1)   autoregressive order 1 model
r       lag 1 autocorrelation in an AR(1)

n       sample size, number of observations

Ybar    sample mean (Y with an overbar)
S**2    standard formula for sample variance

-------
Appendix C

-------
Virginia Power Co., Chesapeake Energy Center, Unit #4
Alternative Monitoring System Study 	 Subsets Summary

              Setno        Subset     Start   Start    End     End           n
OBS  LABEL (in dataset) (in tables)    Date    Time    Date    Time  (before lagging)
  1  a-c       20           1       10992      7    20892      7         683
  2  a-c       21           2       20892      8    30992      8         682
  3  a-c       23           3       10992     12    20892     12         683
  4  a-c       24           4       20892     13    30992     13         682
  5  a-c       26           5       10992     17    20892     17         683
  6  a-c       27           6       20892     18    30992     18         682
  7  a-c       29           7       10992     22    20892     22         683
  8  a-c       30           8       20892     23    30992     23         682
  9  a-c       32           9       11092      3    20992      3         683
 10  a-c       33          10       20992      4    31092      4         682
 11  a-c       35          11       11092      8    20992      8         683
 12  a-c       36          12       20992      9    31092      9         682
 13  a-d       40           1       10992      7    20892      7         653
 14  a-d       41           2       20892      8    30992      8         670
 15  a-d       43           3       10992     12    20892     12         653
 16  a-d       44           4       20892     13    30992     13         670
 17  a-d       46           5       10992     17    20892     17         653
 18  a-d       47           6       20892     18    30992     18         670
 19  a-d       49           7       10992     22    20892     22         653
 20  a-d       50           8       20892     23    30992     23         670
 21  a-d       52           9       11092      3    20992      3         653
 22  a-d       53          10       20992      4    31092      4         670
 23  a-d       55          11       11092      8    20992      8         653
 24  a-d       56          12       20992      9    31092      9         670
 25  a-e       60           1       10992      7    20892      7         672
 26  a-e       61           2       20892      8    30992      8         663
 27  a-e       63           3       10992     12    20892     12         672
 28  a-e       64           4       20892     13    30992     13         663
 29  a-e       66           5       10992     17    20892     17         672
 30  a-e       67           6       20892     18    30992     18         663
 31  a-e       69           7       10992     22    20892     22         672
 32  a-e       70           8       20892     23    30992     23         663
 33  a-e       72           9       11092      3    20992      3         672
 34  a-e       73          10       20992      4    31092      4         663
 35  a-e       75          11       11092      8    20992      8         672
 36  a-e       76          12       20992      9    31092      9         663
 37  c-d      140           1       10992      7    20892      7         661
 38  c-d      141           2       20892      8    30992      8         676
 39  c-d      143           3       10992     12    20892     12         661
 40  c-d      144           4       20892     13    30992     13         676
 41  c-d      146           5       10992     17    20892     17         661
 42  c-d      147           6       20892     18    30992     18         676
 43  c-d      149           7       10992     22    20892     22         661
 44  c-d      150           8       20892     23    30992     23         676
 45  c-d      152           9       11092      3    20992      3         661
 46  c-d      153          10       20992      4    31092      4         676
 47  c-d      155          11       11092      8    20992      8         661
 48  c-d      156          12       20992      9    31092      9         676
 49  c-e      160           1       10992      7    20892      7         707
 50  c-e      161           2       20892      8    30992      8         701
 51  c-e      163           3       10992     12    20892     12         707
 52  c-e      164           4       20892     13    30992     13         701

-------
Virginia Power Co., Chesapeake Energy Center, Unit #4
Alternative Monitoring System Study 	 Subsets Summary

              Setno        Subset     Start   Start    End     End           n
OBS  LABEL (in dataset) (in tables)    Date    Time    Date    Time  (before lagging)
 53  c-e      166           5       10992     17    20892     17         707
 54  c-e      167           6       20892     18    30992     18         701
 55  c-e      169           7       10992     22    20892     22         707
 56  c-e      170           8       20892     23    30992     23         701
 57  c-e      172           9       11092      3    20992      3         707
 58  c-e      173          10       20992      4    31092      4         701
 59  c-e      175          11       11092      8    20992      8         707
 60  c-e      176          12       20992      9    31092      9         701
 61  d-e      180           1       10992      7    20892      7         647
 62  d-e      181           2       20892      8    30992      8         658
 63  d-e      183           3       10992     12    20892     12         647
 64  d-e      184           4       20892     13    30992     13         658
 65  d-e      186           5       10992     17    20892     17         647
 66  d-e      187           6       20892     18    30992     18         658
 67  d-e      189           7       10992     22    20892     22         647
 68  d-e      190           8       20892     23    30992     23         658
 69  d-e      192           9       11092      3    20992      3         647
 70  d-e      193          10       20992      4    31092      4         658
 71  d-e      195          11       11092      8    20992      8         647
 72  d-e      196          12       20992      9    31092      9         658

-------
Pennsylvania Electric Co., Homer City Unit #1
Alternative Monitoring System Study 	 Subsets Summary

                       Start      End            n
OBS   Unit   Setno     Date       Date    (before lagging)
  1     1       6      53185      62985          30
  2     1      10      92885     102785          28
  3     1      21      82486      92286          28
  4     1      24     112286     122186          30

-------
Pennsylvania Electric Co., Homer City Unit #3
Alternative Monitoring System Study 	 Subsets Summary

                       Start      End            n
OBS   Unit   Setno     Date       Date    (before lagging)
  1     3       3      30285      33185          28
  2     3       5      50185      53085          30
  3     3       7      63085      72985          28
  4     3       8      73085      82885          28
  5     3      16      32786      42586          30
  6     3      17      42686      52586          30
  7     3      20      72586      82386          29
  8     3      21      82486      92286          29

-------
Northern States Power Co., Sherburne County Unit #3
Alternative Monitoring System Study 	 Subsets Summary

               Start    Start     End      End          n
OBS   Setno    Date     Time      Date     Time  (before lagging)
  1      1     10189      1      13089      24         715
  2      4     40189      1      43089      24         672
  3      6     53189      1      62989      24         720
  4      7     63089      1      72989      24         720
  5      8     73089      1      82889      24         680
  6      9     82989      1      92789      24         720
  7     10     92889      1     102789      24         720
  8     11    102889      1     112689      24         696
  9     12    112789      1     122689      24         720
 10     13    122789      1      12590      24         720
 11     14     12690      1      22490      24         720
 12     15     22590      1      32690      24         720
 13     16     32790      1      42590      24         698
 14     18     52690      1      62490      24         720
 15     21     82490      1      92290      24         704
 16     22     92390      1     102290      24         720
 17     23    102390      1     112190      24         720

-------
Niagara Mohawk, Oswego Unit #6
Alternative Monitoring System Study 	 Subsets Summary

               Start    Start     End      End          n
OBS   Setno    Date     Time      Date     Time  (before lagging)
  1      2     13190      0      30190      23         690
  2      5     50190      0      53090      --         671

-------

-------