Guideline Series (OAQPS No. 1.2-014) Guidelines for the Evaluation of Air Quality Trends


         GUIDELINE SERIES
                    OAQPS  NO.  1.2-014
           GUIDELINES FOR THE EVALUATION
               OF AIR QUALITY TRENDS
U.S. ENVIRONMENTAL PROTECTION AGENCY
   Office  of Air Quality Planning and Standards

      Research Triangle Park, North Carolina

-------
                                 450R74004

                                    FEBRUARY 1974
             Guideline Series
             OAQPS No. 1.2-014
      GUIDELINE  FOR THE EVALUATION OF
             AIR QUALITY TRENDS
    U.S.  ENVIRONMENTAL PROTECTION AGENCY
OFFICE OF AIR QUALITY PLANNING AND STANDARDS
    MONITORING AND DATA ANALYSIS DIVISION
RESEARCH TRIANGLE PARK, NORTH CAROLINA
27711

-------
                    TABLE OF CONTENTS

                                                        Page


PREFACE                                                   i


1.  INTRODUCTION                                          1

    1.1.  Purpose                                         1
    1.2.  Usefulness                                      1
    1.3.  Limitations                                     1


2.  DATA REQUIREMENTS AND SELECTION                       3

    2.1.  Minimum Requirements                            3
    2.2.  Form of the Data                                3
    2.3.  Data Selection for Trends Analysis              4
3.  CONTRIBUTING FACTORS TO TRENDS                        5


4.  STATISTICAL METHODOLOGY                               6

    4.1.  General Discussion                              6
    4.2.  Statistical Parameters                          7
    4.3.  Time Periods                                    8
    4.4.  Specific Methods                                9
          4.4.1.  Graphical Analysis                      9
          4.4.2.  Correlation Techniques                 10
                  4.4.2.1.  Daniel's Test for Trend      12

                  4.4.2.2.  Parametric Correlation       14
                            Technique
          4.4.3.  Regression Techniques                  15

                  4.4.3.1.  Simple Linear Model          15
                  4.4.3.2.  Exponential Model            16

          4.4.4.  Test for Trend in Proportion Of
                  Observation Above A Standard           16

5.  ASSESSING REGIONAL TRENDS                            18


6.  INTERPRETATION OF TRENDS                             19


REFERENCES                                               25

-------
                LIST OF FIGURES AND TABLES
FIGURE 1  THREE YEAR RUNNING AVERAGES OF TOTAL         12
          SUSPENDED PARTICULATE - New Haven, Conn.

FIGURE 2  THREE YEAR RUNNING AVERAGES OF TOTAL         12
          SUSPENDED PARTICULATE - Tucson, Ariz.

TABLE 1   SUMMARY OF APPLICATION OF STATISTICAL        11
          PROCEDURES FOR CLASSIFYING TRENDS

TABLE 2   QUANTILES OF THE SPEARMAN TEST STATISTIC     22

TABLE 3   NORMAL DISTRIBUTION                          23

TABLE 4   PERCENTILES OF THE t DISTRIBUTION            24

TABLE 5   CHI-SQUARE DISTRIBUTION                      25

-------
                           PREFACE
     The Monitoring and Data Analysis Division of the Office
of Air Quality Planning and Standards has prepared this re-
port entitled "Evaluation of Air Quality Trends" for use by
the Regional Offices of the Environmental Protection Agency.
The purpose of this report is to provide guidance information
on current air quality trend evaluation techniques.  Adherence
to the guidance presented in the report will, hopefully, en-
sure mutually compatible ambient air quality trend evaluation
by all States and Regions and will also facilitate trend in-
terpretation.  Further, any risks involved in policy decisions
concerning National Ambient Air Quality Standards should be
minimized.  This report will serve on an interim basis until
more specific and detailed guidance on this subject is
presented.

-------
                 TRENDS ANALYSIS GUIDELINE
1.  INTRODUCTION
1.1.  Purpose
     The purpose of this guideline is to outline procedures
that can be employed by the air pollution data analyst
to evaluate trends in air quality.  Trends will be generally
considered as the broad long-term movement in the overall time
sequence of historical air quality measurements.  It will be
examined in two ways.  First will be in the form of a trend
line or curve over time.  Second will be a statistical
categorization of the general direction of the movement over
time, i.e., upward, downward, or no change.  Associated with
the second approach can be estimates of the rate of change, of
deterioration or improvement in the air quality.  Most trend
analysis can be performed upon aggregate measures of air
quality estimates such as averages.  For some pollutants,
however, the behavior of the short-term air quality, such as
maximum 1-hour concentrations, is important.  The behavior
of short-term air quality estimates does not necessarily lend
itself to the same kind of statistical treatment, and as
such, is treated using different techniques.
1.2.  Usefulness
    Evaluation of the long-term trend in a sequence of air
quality parameters such as annual means is important in order
to assess the relative effectiveness of control strategies and
to determine the impact of emission growth or reduction
on the air quality over time.
1.3.  Limitations
    The evaluation of air quality trends is largely a subjec-
tive procedure.  Various statistical techniques are available
to facilitate the evaluation, but insight and auxiliary knowledge

-------
are often necessary for the final determination.  The
methods of trends analysis presented in this guideline are
primarily descriptive.  They are designed to consider the
trend in a single pollutant over time.   They will be useful as
a data reduction tool which will transform a collection of
air quality measurements or summary statistics over time
into a simpler form which can then be more easily interpreted.
In this manner the complex problem of analyzing the long-term
relationships among many different air pollutants monitored
at various monitoring locations in a given area or Air Quality
Control Region (AQCR) can be examined.
    The classification of a long-term trend into a single
category such as up or down is subject to certain constraints
and assumptions.  These include the time frame of interest,
assumptions about the seasonal behavior of the pollutant and
the type of statistical variability inherent in the measure-
ments.  The individual techniques presented depend on such
considerations in varying degrees.
    The techniques that are discussed are retrospective in
nature, that is to say they describe the historical record
of air quality measurements.  No attempt will be made here
to forecast or predict future air quality from previously
experienced air quality.  Such techniques do exist, but for
successful application, they should not be based on historical
air quality measurements alone.  A diffusion model or modified
rollback procedure [1] may be used for air quality projections,
thereby
accounting for emissions, regulations, growth and meteorology.

-------
2.  DATA REQUIREMENTS AND SELECTION
2.1.  Minimum Requirements
    In order to analyze the trend in a pollutant, a time
sequence of measurements or summary statistics over several
years is required.  Because of seasonal fluctuations, a
trend should not be determined from one year's worth of measure-
ments.  The data can be in any aggregate form of air quality
measurements (hourly, weekly, monthly, quarterly, or annual
estimates).  Ideally, the time series should not have any
significant gaps in the continuity of the series although
some gaps can be tolerated.  Temporal balance is essential.
For example, a single quarterly estimate may be omitted from
the sequence of many quarterly estimates if there is not a
pronounced seasonality.  Missing data can introduce bias in the
determination of the trend.  This can be minimized with the
availability of some prior knowledge of the data or some
auxiliary information on meteorology and emissions.  In any
event such omissions should be clearly indicated.  If the
data satisfy the validity criteria as outlined in the
Guideline for the  Evaluation of Air Quality Data,  there
should not be any  problem.  Appropriate procedures useful
for analyzing trends when entire annual estimates are not
available will be  discussed in Section 4 on statistical
techniques.
2.2.  Form of the  Data
    An appropriate analysis can be based on several  forms  of
the data.  A preliminary analysis may be performed on the
data at hand, usually available in air quality  publications.
For example, evaluation of the trend can be based on summary
statistics such as annual/quarterly averages  and percentiles.
If a more detailed analysis is desired, the original raw data
may have to be utilized.  There is generally a trade-off in the
choice of aggregate measure: the larger the interval for the
aggregate, the more precision and stability is contained in
the estimate, but the fewer time-sequenced estimates are
available for the

-------
trend analysis.  It is sometimes advantageous to deal with
certain estimates  (such as daily or annual) in order to
remove predictable factors such as diurnal and seasonal
variation, respectively.
2.3.  Data Selection for Trends Analyses
    When considering the analysis of the time series of
measurements for a pollutant, certain precautions should be
observed.  In order to minimize the introduction of bias into
the evaluation, the data should be a product of the same
analytical (chemical and instrumental) methodology at the
same sampling site location for the entire time period under
consideration.  A common instance of this problem is a minor
instrument modification or movement.  If one is willing to
relax this rule in order to create the only possible complete
record of data for the analysis, then one must be willing to
accept the possibility of creating an apparent trend when none
exists, or indicating no trend when one does exist.
    Any change in the overall trend, especially an abrupt
shift or alteration coincident with the modification, must
be considered suspect.  For completeness and maximum accuracy,
all modifications to the placement or type of monitoring
equipment should be investigated, recorded, and considered in
the trend evaluation.  The possibility of bias can be disregarded
when the air monitoring specialist ensures that there is no
discontinuity in the data.  However, it must be kept
in mind that often the reason for instrument change was that
something was wrong with the previous instrument or methodology.

-------
3.  CONTRIBUTING FACTORS TO TRENDS
    There are both determinant and random factors which
affect the trend of air quality measurements.  The determinant
factors include emissions, meteorological variables, and other
factors having a predictable influence.  Random factors are pri-
marily sampling and analysis errors, transient meteorological
phenomena and random fluctuations in emissions.  With appropriate
auxiliary information on the environs of a monitoring site,
the reliability of apparent trends can be appraised.  For
example, environmental change such as urban renewal in the
vicinity of the sampling site can create the impression of
an apparent trend caused by area wide deterioration.  More-
over, in light of such localized change, the representativeness
of that particular sampling site and its corresponding trend
for an entire city or AQCR would have to be questioned.
    An unusually cold winter may cause an annual average to
be unusually high, possibly contributing to an apparent trend
in the preliminary trend evaluation.  In this case, auxiliary
meteorological information on degree days, chill factor, etc.,
together with the original raw data would be necessary to con-
firm the suspicion.

-------
4.  STATISTICAL METHODOLOGY
    4.1.  General Discussion
    Statistical techniques are desirable for an objective
description and classification of the trend.  They are
necessary to sort out the real change in air quality that is
distinguishable from the inherent random variability in air
quality measurements.  Although the statistical techniques
are objective in the sense that they are reproducible and
anyone applying them correctly will come up with the same
result, they are nevertheless subject to error.  These are
the standard type I and type II errors of hypothesis testing
discussed in texts on statistics [3,4].
    Statistical techniques can be descriptive or inferential.
Descriptive statistics provide estimates of unknown parameters
such as means, variances and rates of change.  These are based
on a set of empirical data drawn from the entire population
of possible values.  Inferences can be made about the popu-
lation from which the data were sampled by judging the statis-
tical significance.  In general, this involves making certain
assumptions about the population, such as the distribu-
tion being log-normal.  Then the value of the calculated test
statistic, derived from the sample of data, is compared to a
specific quantile of the assumed distribution of the test
statistic.  These quantiles are usually available in tabulated
form.  This quantile defines the significance or α level and
specifies a critical value or pair of critical values of the
test statistic.
    The statistical significance can be utilized in more than
one way.  The traditional or classical approach is to pre-
select the α level and its corresponding quantile.  Then a test
of hypothesis is performed such as testing if no trend has
occurred.  If the test statistic falls in the predetermined
critical region defined by the extremal values of the test
statistic, the hypothesis is rejected.  By implication, if
the value of the statistic does not fall in the critical
region,  the hypothesis is accepted.  For example, in trend
analysis, the assumed underlying distribution of the test

-------
statistic of the air quality parameters corresponds to that
of a random variable without trend, that is, the null hypothesis
is there is not any trend.  A rejection of the hypothesis
is interpreted as the existence of a trend.  Then the trend
is or is not significant at the particular α level.  Any
other possible information in the test statistic is then usually
ignored.
    A second utilization of the statistical significance does
not involve a preselected α level, per se.  The statistical signi-
ficance of the test statistic is defined by the significance
level associated with the tabulated value equal to or just
below the calculated statistic.  Then the resultant signi-
ficance level can be compared to a preselected level to classify
the test parameter but in addition, can be used to judge the
relative strength of the result compared with other cases.
    Conventionally, preselected levels of 0.10, 0.05 or
0.01 are utilized.  These would usually correspond to quantiles
of 0.95, 0.975 and 0.995 respectively of a two sided statistical
test for both upward or downward trends.  The smaller the α or
significance level, the less likely a trend would be declared
erroneously.  One might then say the trend is highly signi-
ficant.  Levels like 0.10 would be used as preliminary in-
dicators of trend whereas smaller levels would be used to more
rigorously test for significant trends at an individual site.
The likelihood of correctly accepting a pattern as a non-
trend is determined by the power of the individual statistical
techniques.
4.2.  Statistical Parameters
    Air quality data may have different meanings when reviewed
by different aggregate measures, although they frequently vary
together.  For example, a sequence of the average of maximum
daily 8-hour or 1-hour concentrations can depict trends that
are similar in direction but perhaps not similar in magnitude.

-------
Nevertheless, it is useful to examine the trend in various
aggregate measures, especially those relating to the air
quality standards.  In this manner the progress with respect to
achieving each standard can be assessed.  Such useful parameters
are annual means and percent of observations exceeding
the short-term standards.  Once the parameter is selected,
a statistical test is not always necessary because the trend
may be obvious, but may be convenient for documentation
purposes.
4.3.  Time Periods
    The time frame of the data under consideration can seriously
affect the classification of the overall trend.  For instance,
if concentrations decreased sharply in the 4 years from 1960
to 1963 but remained level in the 8 years from 1964 to 1971,
then the 12-year trend from 1960-1971 would probably be down-
ward, whereas the 8-year trend from 1964-1971 would be classified
as no change.  On the other hand, if concentrations experienced
an increase from 1964-1971, its trend would be classified up-
ward, whereas the overall trend from 1960-1971 might still
be classified downward.  Therefore, it can be seen that the
classification of trend is clearly dependent on the time frame
under consideration.
    The time frame for evaluation should be selected in an
objective manner.  Usually the availability of data is the
determining factor, but the interval can be preselected
based on knowledge of the temporal pattern of emissions.
It is desirable to perform the trend evaluation over several
different time intervals in order to obtain a more complete
description of the overall pattern and to avoid the afore-
mentioned problems.  In the first Annual Trends Report,
long-term trends were considered during the periods 1960-
1967, 1964-1971, 1960-1971, and 1968-1971.  It was not un-
common for the trend determined by evaluating the data in
one time period to differ from the trend in another time
period at a single location.

-------
4.4.  Specific Methods
    There are some very sophisticated methods providing time
series analysis of air quality data.  These have been presented
in some recent publications [6,7].  Although the methods can provide
much information, they can be difficult to use and generally
require assistance of a computer.  Some of the simpler ap-
proaches utilized in the Federal trends reports will be
presented in this guideline.  The techniques are oriented towards
examining the concentration or frequency of occurrence of air
quality measurements.

4.4.1.  Graphical Analysis
    When performing a trend analysis, it is extremely desirable
to look at the data in graphic form.  Usually plotting
quarterly or annual statistics over time will be sufficient
to depict the basic temporal pattern.  At this point the
determination of the trend may be intuitively clear.  In
order to facilitate the interpretation of the overall pattern,
it can be helpful to determine an objective trend line for the
data.  This can be simply obtained by calculating a moving
average of the observations.  This will provide a smoother
and simpler representation of the original data.  For quarterly
estimates, a moving annual average consisting of four quarterly
estimates will eliminate the seasonal fluctuations and remove
much of the random variation as well.  When considering annual
estimates over several years, a three-year moving average will
smooth out much of the year-to-year variation.  In specific
instances other averaging schemes may be considered.  The se-
lection of the appropriate moving average is subject to personal
judgement.  When employing the moving average, estimates of
the trend line at the beginning and end of the data time series
are usually omitted.
    Other curve smoothing techniques such as the Whittaker-
Henderson smoothing formula have previously been employed in
the analysis of air quality trends, but they can be more
difficult to apply since they generally require the use of a
computer.
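
    The moving average described above can be sketched in a few
lines of code.  The following is a minimal illustration only, not a
procedure taken from the Federal trends reports: the function name
and the choice of Python are assumptions, and the data are the Tucson
annual geometric means for 1964-1971 tabulated in Example 2a.  Each
average is assigned to the middle year of its group, and the end
points without a full group are omitted.

```python
def centered_moving_average(years, values, window=3):
    """Return (middle_year, average) pairs for an odd-length moving window.

    Hypothetical helper for illustration. Points at the beginning and end
    of the series that lack a full window are omitted.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    if len(years) != len(values):
        raise ValueError("years and values must have the same length")
    half = window // 2
    smoothed = []
    for k in range(half, len(values) - half):
        group = values[k - half:k + half + 1]
        smoothed.append((years[k], sum(group) / window))
    return smoothed


# Tucson annual geometric mean TSP, 1964-1971 (values from Example 2a).
years = list(range(1964, 1972))
tsp = [128, 118, 80, 89, 70, 78, 96, 88]

trend_line = centered_moving_average(years, tsp)
for year, average in trend_line:
    print(year, round(average, 1))
```

    An even window, such as the four-quarter moving annual average,
needs a convention for placing each average between two periods and
is not handled by this sketch.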

-------
Example 1a:  Figure 1 depicts a trend line for annual geo-
metric mean Total Suspended Particulate (TSP) monitored at the
NASN site in New Haven, Connecticut from 1960-1971.  The
curve was obtained by computing a 3-year moving average of the
annual estimates and plotting each point at the middle year
of each 3-year group.  It characterizes the trend as reversing
direction during the 12-year period.
Example 1b:  Figure 2 depicts an analogous trend line for TSP
at the Tucson, Arizona NASN site.  In this instance, the
trend line depicts a long-term downward trend which has
stabilized in the latter years.

    The trend lines thus formed provide a nice descriptive
tool for the evaluation of the overall trend.  Since subjective
bias may creep into the interpretation of the trend, objective
techniques are desirable to classify the overall pattern and
quantify the amount of change.  The following constitutes a
variety of statistical techniques which have been useful for
this purpose.  Several techniques may be appropriate to analyze
a given set of data.  It may be desirable to employ more than
one since occasionally they can produce different conclusions
due to some of the different assumptions on which they are
based.  It is not uncommon, however, for several sets of
assumptions to seem equally reasonable.  It is at this point
that subjective judgement of the auxiliary information contributes
to assessing the various formal results.  For convenience,
Table 1 summarizes the typical usage of the particular statistical
procedures.  In each case, the procedure assumes at least that
the observations or the air quality parameters could have
occurred with equal likelihood.
4.4.2.  Correlation Techniques
    These techniques consider the statistical significance of
the correlation of pollutant observations or summary statistics
with the sequence in which they were observed.  Since the time
interval between observations is not considered, missing ob-
servations can be ignored.

-------
     TABLE 1  SUMMARY OF APPLICATION OF STATISTICAL PROCEDURES FOR CLASSIFYING TRENDS

     Type of Analysis            Form of Data                    Technique

     Trend in short-term         Specific quantiles or maximum   Spearman Correlation
     air quality                 value per year or of a given
                                 quarter/season

                                 Percent observations greater    Chi-Square
                                 than specific concentration
                                 between two time periods

     Trend in long-term          Annual averages or average      Spearman Correlation
     air quality (averages)      level for specific quarter/             or
                                 season over several years       Parametric Correlation*

     Estimation of rate of       Same as above                   Regression*
     change in long-term
     air quality

     *Additional assumption required that observations are normally (log-normally) distributed.

-------
[FIGURE 1 - THREE YEAR RUNNING AVERAGES OF TOTAL SUSPENDED PARTICULATE,
NEW HAVEN, CONNECTICUT.  Horizontal axis: year, 1960 to 1971.  Plotted
series: annual geometric mean and three year average.]

[FIGURE 2 - THREE YEAR RUNNING AVERAGES OF TOTAL SUSPENDED PARTICULATE,
TUCSON, ARIZONA.  Plotted series: annual geometric mean and three year
average.]

-------
    They can be applied to any set of aggregate measures of
pollutant values,  subject to the assumption that they are
equally likely and independent.  Therefore, if seasonality is
suspected,  annual  estimates should be used or individual
seasonal estimates should be considered  separately.
    Two types of these procedures are presented.  The first
is  nonparametric,  meaning no further assumptions are necessary.
It  examines for  a  consistently  changing  series.  The second
is  parametric, requiring the additional  assumption, frequently
encountered, that  the data or their logarithms are normally
distributed.  It is sensitive to a constant absolute or
percentage  change.
4.4.2.1.  Daniel's Test for Trend using the Spearman Rank
          Correlation
    In order to utilize this procedure, at least four observa-
tions should be available.  Given observations $X_1, \ldots, X_n$
and their corresponding relative ranks $R(X_1), \ldots, R(X_n)$, the
test statistic is the Spearman Rank Correlation Coefficient:

    $$\rho = 1 - \frac{6T}{n(n^2-1)}$$

where $T = \sum_{i=1}^{n} [R(X_i) - i]^2$, that is, the summed squares
of the differences between each value's rank and its sequential
order, $i$, in the series of $n$ observations.  The absolute
value of $\rho$ is compared with a critical value $w_p$ in Table 2,
if $n < 30$, and with $w_p = X_p/\sqrt{n-1}$, if $n \ge 30$, where $X_p$ is the
$p$ quantile of a standard normal random variable obtained from
Table 3.  If $|\rho| > w_p$ then a trend is declared significant
at the $\alpha = 2(1-p)$ significance level.  A positive value of $\rho$ in-
dicates an upward trend while a negative value of $\rho$ in-
dicates a downward trend.  It can be noted that the estimate
of the Spearman rank correlation coefficient $\rho$ is merely the
usual product moment correlation of the ranks of the ob-
servations with the order in which the observations were
taken.

-------
 Example 2a:  Application of Daniel's Test for Trend on Tucson
 TSP Data 1964-1971.  The following table provides the annual
 geometric means, their relative ranks and the index over
 time.

      $X_i$    $R(X_i)$    $i$
      128         8         1
      118         7         2
       80         3         3
       89         5         4
       70         1         5
       78         2         6
       96         6         7
       88         4         8

If ties had occurred,  the  ranks  can  be  determined  by averaging
the ranks among  the  tied observations,  or  preferably utilizing
the data estimates to  the  next available place,  even if  it  is
not a significant digit.
    $T = \sum_i (R(X_i) - i)^2$
      $= (8-1)^2 + (7-2)^2 + (3-3)^2 + (5-4)^2 + (1-5)^2 + (2-6)^2$
        $+ (6-7)^2 + (4-8)^2$
      $= 49 + 25 + 0 + 1 + 16 + 16 + 1 + 16$
      $= 124$

    $\rho = 1 - \frac{6T}{n(n^2-1)} = 1 - \frac{6(124)}{8(63)} = -0.476$
    The .90 quantile of the Spearman test statistic is .5000
for n=8.  Since $|\rho| = 0.476$ is below this value, the apparent
downward trend would not be accepted even at the $\alpha = 0.20$
significance level.
    Using the entire twelve year record of data, $\rho = -0.769$.
Its absolute value is greater than the .995 quantile for n=12.
Therefore, the 12-year trend can clearly be classified downward.
    The Spearman coefficient on the data from 1968-1971 is
$\rho = +0.80$.  This pattern is upward but is only significant at
the 0.20 level.  The technique is not very powerful at such
small sample sizes.
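
    The arithmetic of Daniel's test can be checked with a short
program.  The sketch below is illustrative only: the function names
are assumptions, ties are given the average of the tied ranks, and
the critical value (.5000, the .90 quantile for n=8 quoted above)
must still be looked up in Table 2 by the analyst.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1  # mean of positions pos+1 .. end+1
        for k in order[pos:end + 1]:
            ranks[k] = shared_rank
        pos = end + 1
    return ranks


def spearman_trend(values):
    """Return (T, rho) for observations listed in time order."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are required")
    ranks = average_ranks(values)
    t_sum = sum((rank - i) ** 2 for i, rank in enumerate(ranks, start=1))
    rho = 1 - 6 * t_sum / (n * (n ** 2 - 1))
    return t_sum, rho


# Tucson annual geometric mean TSP, 1964-1971 (values from Example 2a).
tsp = [128, 118, 80, 89, 70, 78, 96, 88]

t_sum, rho = spearman_trend(tsp)
critical_value = 0.5000  # .90 quantile for n=8, as quoted from Table 2
print("T =", t_sum, " rho =", round(rho, 3))
print("significant at 0.20 level:", abs(rho) > critical_value)

# Last four years only (1968-1971).
_, rho_recent = spearman_trend(tsp[4:])
print("rho 1968-1971 =", round(rho_recent, 2))
```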

-------
     The above technique is primarily useful for classifying the
 temporal pattern as upward or downward and indicating the con-
 sistency of the pattern by the statistical significance level.
 4.4.2.2.  Parametric Correlation Technique
     Let $X_i$, $i = 1, \ldots, n$, be a sequence of observations or their
 logarithms.  The test statistic is

     $$T = \frac{\sqrt{n-2}\,\sqrt{c}\,\beta}{\sqrt{\sigma^2 - c\beta^2}}$$

 where

     $$c = \frac{1}{12}(n^2-1), \qquad
       \beta = \frac{1}{nc}\sum_{i=1}^{n}\left(i - \frac{n+1}{2}\right)X_i, \qquad
       \sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$$

 $T$ is compared to the $p$ quantile of Student's t statistic with
 $(n-2)$ degrees of freedom provided in Table 4.  If $|T| > t$ then
 the trend is declared significant at the $\alpha = 2(1-p)$ significance
 level.  A positive value of $T$ indicates an upward trend,
 while a negative value of $T$ indicates a downward trend.
 Example 2b:  Application of Parametric Correlation Technique
 to Tucson, Arizona TSP Data 1964-1971.

     $$T = \frac{\sqrt{n-2}\,\sqrt{c}\,\beta}{\sqrt{\sigma^2 - c\beta^2}}$$

     $$c = \frac{1}{12}(n^2-1) = \frac{1}{12}(64-1) = 5.25$$

     $$\beta = \frac{1}{nc}\sum_i \left(i - \frac{n+1}{2}\right)X_i
            = \frac{1}{8(5.25)}\{-3.5\ln(128) - 2.5\ln(118) - 1.5\ln(80)
              - 0.5\ln(89) + 0.5\ln(70) + 1.5\ln(78)
              + 2.5\ln(96) + 3.5\ln(88)\}
            = -1.985/42 = -0.047$$

     $$\sigma^2 = \frac{1}{8}\sum_i (X_i - \bar{X})^2
               = \frac{1}{8}\sum_i X_i^2 - \left(\frac{1}{8}\sum_i X_i\right)^2 = 0.03715$$

 then

     $$T = \frac{\sqrt{6}\,\sqrt{5.25}\,(-0.047)}{\sqrt{0.037 - 5.25(0.002)}} = -1.65$$
-------

 This value lies between the .90 and .95 quantiles of the
 Student's t statistic.  Therefore, the trend can be considered
 significant at the 0.20 level but not the 0.10 level.
      Both the Spearman and the parametric correlation techniques
 failed to detect a trend during 1964-1971 because of the year-to-
 year variability in the annual estimates.
     Considering the entire 12-year period, the value of the
 test statistic T=-3.810.  This is significant at the .01 level,
 and the trend can therefore be classified downward.  The value
 of the corresponding test statistic for the 4-year interval
 1968-1971 is T=+2.15.  This is only significant at the 0.20
 level.  Note the similarity between these results and  those
 obtained by using the simpler non-parametric analogue.
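
      The parametric correlation statistic can be reproduced with
 the sketch below, which is offered as an illustration under stated
 assumptions rather than as a prescribed implementation.  The function
 name is hypothetical, natural logarithms of the Tucson annual
 geometric means are used as in Example 2b, and the quantiles of
 Student's t (1.440 and 1.943 for 6 degrees of freedom) are entered by
 hand in place of a lookup in Table 4.

```python
from math import log, sqrt


def parametric_trend_statistic(values):
    """Return T = sqrt(n-2) * sqrt(c) * beta / sqrt(sigma2 - c * beta**2).

    `values` are observations (or their logarithms) in time order.
    A perfectly linear series makes the denominator zero and is not handled.
    """
    n = len(values)
    if n < 3:
        raise ValueError("at least three observations are required")
    c = (n ** 2 - 1) / 12
    mean = sum(values) / n
    beta = sum((i - (n + 1) / 2) * x for i, x in enumerate(values, start=1)) / (n * c)
    sigma2 = sum((x - mean) ** 2 for x in values) / n
    return sqrt(n - 2) * sqrt(c) * beta / sqrt(sigma2 - c * beta ** 2)


# Tucson annual geometric mean TSP, 1964-1971 (values from Example 2a).
tsp = [128, 118, 80, 89, 70, 78, 96, 88]
log_tsp = [log(x) for x in tsp]

t_8yr = parametric_trend_statistic(log_tsp)      # 1964-1971
t_4yr = parametric_trend_statistic(log_tsp[4:])  # 1968-1971

# Student's t quantiles for 6 degrees of freedom (.90 and .95).
t_90, t_95 = 1.440, 1.943
print("T (1964-1971) =", round(t_8yr, 2))
print("significant at 0.20:", abs(t_8yr) > t_90)
print("significant at 0.10:", abs(t_8yr) > t_95)
print("T (1968-1971) =", round(t_4yr, 2))
```

      Carrying full precision gives a value for 1964-1971 slightly
 larger in magnitude than the -1.65 of Example 2b, which was computed
 from rounded intermediate quantities.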
 4.4.3.  Regression Techniques
     For this technique,  the temporal distance between  observa-
 tions is considered.  Its primary application is to produce
 estimates of the constant rate of absolute or percent  change
 (growth or decay)  over time.
 4.4.3.1.  Simple Linear Model
     To estimate a constant absolute change, $b$, corresponding to
 the model $X = a + bT$,

     $$\hat{b} = \frac{\sum (T_i - \bar{T}) X_i}{\sum (T_i - \bar{T})^2}$$

 where pollutant concentration $X_i$ exists at time $T_i$.
 The estimate of $a$ is $\hat{a} = \bar{X} - \hat{b}\bar{T}$.
 The statistical significance of the estimate of $b$ as compared with
 an assumed value $b_0$ can be tested by computing

     $$B = \frac{\hat{b} - b_0}{\sqrt{s^2 / \sum (T_i - \bar{T})^2}}$$

 where

     $$s^2 = \frac{\sum (X_i - \bar{X})^2 - \left[\sum (T_i - \bar{T}) X_i\right]^2 / \sum (T_i - \bar{T})^2}{n-2}$$

 and comparing $B$ with the Student's t statistic, $t$, at the $p$ quantile,
 with $n-2$ degrees of freedom.  If $|B| > t$ then the rate of change
 is significantly different from $b_0$ at the $\alpha = 2(1-p)$ significance level.

-------

In a similar manner, a confidence interval can be created
about the estimate $\hat{b}$.  The interval is defined as

    $$\hat{b} \pm t \sqrt{s^2 / \sum (T_i - \bar{T})^2}$$

This interval contains the "true" rate of change, $b$, with
probability $1-\alpha$.
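
The simple linear model can be illustrated with the following
sketch.  It is an assumption-laden example, not the computation used
for the Federal trends reports: the function name and returned keys
are hypothetical, the Tucson annual geometric means of Example 2a are
used because the New Haven values are not tabulated in this guideline,
and the .95 quantile of Student's t for 6 degrees of freedom (1.943)
is entered by hand in place of a lookup in Table 4.

```python
from math import sqrt


def linear_trend(times, values, b0=0.0):
    """Least-squares fit of X = a + b*T with the test statistic B for b vs b0."""
    n = len(times)
    if n < 3 or n != len(values):
        raise ValueError("need at least three paired observations")
    t_bar = sum(times) / n
    x_bar = sum(values) / n
    stt = sum((t - t_bar) ** 2 for t in times)
    stx = sum((t - t_bar) * x for t, x in zip(times, values))
    sxx = sum((x - x_bar) ** 2 for x in values)
    b = stx / stt
    a = x_bar - b * t_bar
    s2 = (sxx - stx ** 2 / stt) / (n - 2)  # residual variance
    std_err = sqrt(s2 / stt)               # standard error of b
    return {"a": a, "b": b, "s2": s2, "std_err": std_err, "B": (b - b0) / std_err}


# Tucson annual geometric mean TSP, 1964-1971 (values from Example 2a).
years = list(range(1964, 1972))
tsp = [128, 118, 80, 89, 70, 78, 96, 88]

fit = linear_trend(years, tsp)
t_95 = 1.943  # .95 quantile of Student's t, 6 degrees of freedom
half_width = t_95 * fit["std_err"]

print("rate of change per year:", round(fit["b"], 2))
print("B =", round(fit["B"], 2), " significant at 0.10:", abs(fit["B"]) > t_95)
print("90% interval:", round(fit["b"] - half_width, 2), "to", round(fit["b"] + half_width, 2))
```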

4.4.3.2. Exponential Model
To estimate the percent rate of change, $r$, corresponding
to the model $X = a r^T$, calculate and test the significance of
$\log(r)$ by substituting $\log(X_i)$ for $X_i$ in the formulae of the
previous section.  The rate of change is usually presented as a
change of $(r-1) \times 100\%$ per unit of time.
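
The substitution of logarithms can be sketched as follows.  This is
a minimal illustration in which the function name is hypothetical,
natural logarithms are assumed (any base gives the same $r$ provided
the antilogarithm uses the same base), and the Tucson annual geometric
means of Example 2a stand in for data not tabulated in this guideline.

```python
from math import exp, log


def percent_rate_of_change(times, values):
    """Fit X = a * r**T by least squares on log(X); return (r - 1) * 100."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    logs = [log(x) for x in values]
    t_bar = sum(times) / n
    stt = sum((t - t_bar) ** 2 for t in times)
    slope = sum((t - t_bar) * y for t, y in zip(times, logs)) / stt  # estimate of log(r)
    r = exp(slope)
    return (r - 1) * 100


# Tucson annual geometric mean TSP, 1964-1971 (values from Example 2a).
years = list(range(1964, 1972))
tsp = [128, 118, 80, 89, 70, 78, 96, 88]

pct = percent_rate_of_change(years, tsp)
print("percent change per year:", round(pct, 2))
```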
Example 3:  The above regression techniques are applied to the
TSP data for New Haven, Connecticut.
The estimates of absolute and percentage rates of change
are presented for the time intervals 1960-1971, 1964-1971, and
1968-1971.
Rates of Change
Absolute Percent
0.26
-2.24
+ 7.00
1960-1971
1964-1971
1968-1971
+0.27
-3.46
+ 9.26
This again demonstrates that the choice of time interval can
play an important role in the determination of an estimated
rate of change.
4.4.4. Test for Trend in Proportion of Observations Above
A Standard (Chi Square Analysis)

This technique is useful to test for a change in the
extreme value or short-term statistics. It compares the
percent of observations above a given threshold concentration,
such as a 1-hour standard, between two time periods. It is
desirable to consider independent observations.  Therefore,
for hourly data one should consider at most one observation
per day, e.g., the maximum observation per day or the observation
of a particular hour.  In general, observations
derived by intermittent sampling can be considered independent.

                       No. Obs <= Standard   No. Obs > Standard   Total
      TIME PERIOD I             a                     b            n_1
      TIME PERIOD II            c                     d            n_2
      Total                    a+c                   b+d            N

 The test should only be used if there are at least five
 observations in each of the four cells.
 Let p_1 = b/n_1 be the proportion of observations in time period I
 that are above the standard.  Similarly, p_2 = d/n_2 for time
 period II.
      One can test (i) for any change between the two time
 periods, disregarding whether it is an increase or a decrease,
 i.e., p_1 \neq p_2, or (ii) for a specific direction of change between the
 two time periods, say p_1 > p_2.
      The test statistic T is defined as:

          T = \frac{N (ad - bc)^2}{n_1 n_2 (a+c)(b+d)}

Consider a change to have occurred if T exceeds the Chi Square
statistic of Table 5 at the 1-\alpha quantile with 1 degree of
freedom.
      If only one direction of change is of interest, for example,
has there been improvement, then consider improvement to have
occurred if the observed change is in that direction and T exceeds
the Chi Square statistic at the (1-2\alpha) quantile.  In either case
the significance level is approximately \alpha.
Example 4:  The following table represents the number of days
whose maximum 1-hour oxidant concentration exceeded the 1-hour
standard at a particular location during 1964-1967 and 1968-1971.

                   <= standard    > standard
    1964-1967          662           154        n_1 = 816
    1968-1971          714           111        n_2 = 825
    TOTALS            1376           265        N = 1641

    T = \frac{N (ad - bc)^2}{n_1 n_2 (a+c)(b+d)}

      = \frac{(1641) [(662)(111) - (154)(714)]^2}{(816)(825)(1376)(265)}

      = 8.9

At the level of  significance  of  .05,  the calculated value  of  T
exceeds the tabulated  statistic  at .90 quantile = 2.706.  It can
therefore be concluded that short-term  oxidant  levels  have
significantly decreased  in  recent years at  the  particular
site.
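
The arithmetic of Example 4 can be checked with a short Python sketch. The function name is hypothetical; the cell counts are those of the example, and the critical value 2.706 is the .90 quantile from Table 5. The one-sided conclusion also requires that the proportion above the standard actually fell, which the sketch checks explicitly.

```python
def proportion_change_statistic(a, b, c, d):
    """T = N (ad - bc)^2 / (n1 n2 (a + c)(b + d)) for a 2 x 2 table.

    a, b: observations at or below / above the standard in period I
    c, d: observations at or below / above the standard in period II
    """
    if min(a, b, c, d) < 5:
        raise ValueError("each of the four cells should hold at least five observations")
    n1, n2 = a + b, c + d
    N = n1 + n2
    return N * (a * d - b * c) ** 2 / (n1 * n2 * (a + c) * (b + d))

# Cell counts from Example 4.
a, b = 662, 154   # 1964-1967
c, d = 714, 111   # 1968-1971

T = proportion_change_statistic(a, b, c, d)
p1 = b / (a + b)  # proportion above the standard, period I
p2 = d / (c + d)  # proportion above the standard, period II

CHI2_90 = 2.706   # Table 5, .90 quantile (one-sided test at alpha = .05)
improved = p1 > p2 and T > CHI2_90
print(f"T = {T:.1f}, p1 = {p1:.3f}, p2 = {p2:.3f}, improved = {improved}")
```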
5.  ASSESSING REGIONAL TRENDS
    Trends can be discussed in terms  of measurements at  a
single location  over time or  collectively at a  group of  sites
over time for the purpose of  assessing  national or regional
trends.  The collective  analysis has  been employed in  the
Federal Trends Reports (5).
    In general,  the assessment can be done  in two ways.  The
recommended approach is  to  determine  the trend  at each indi-
vidual site and  then summarize the results  for  the group of
sites.  The trend at each individual  site can be classified
as upward, downward, or  no  change.  This may be done over  a
variety of time periods, but  it is important to separately
consider the same time interval for each site.  The summary
would then be in the form of  the number of  upward trends,
downward trends and no change.  In order to consider trends
at various concentration levels, the  analysis may be considered
for separate groups of sites with typically high or low  concen-
trations.
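
The recommended site-by-site summary amounts to a simple tally. The sketch below assumes each site has already been classified over the same time interval; the site identifiers and classifications are hypothetical.

```python
from collections import Counter

ALLOWED = ("upward", "downward", "no change")

def summarize_trends(site_trends):
    """Count how many sites fall in each trend classification."""
    for site, label in site_trends.items():
        if label not in ALLOWED:
            raise ValueError(f"unknown classification for {site}: {label!r}")
    counts = Counter(site_trends.values())
    return {label: counts.get(label, 0) for label in ALLOWED}

# Hypothetical classifications, all for the same time interval.
site_trends = {
    "site_A": "downward",
    "site_B": "downward",
    "site_C": "no change",
    "site_D": "upward",
    "site_E": "downward",
}

summary = summarize_trends(site_trends)
print(summary)
```

The same function can be applied separately to groups of sites with typically high or low concentrations.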

    An alternate approach is to consider a composite index
of the data from all the sites at each point in time, such as
a composite average over time.  This composite form of the
data lends itself to a convenient graphical summary; however,
there are some limitations.  In general, the composite can be
dominated by a few individual sites.  Suppose, for example, that the
group of sites is diverse and spans a wide range of concentration
levels.  Then a composite average can be dominated by the
behavior of the sites with the highest concentration levels,
thereby hiding the behavior at the sites with the lower concen-
tration levels.  Also, the rate of change of the composite
may not represent the typical rate of change of the individual
sites.  A zero rate of change may in fact be a product of an
equal number of increasing and decreasing patterns.  Another
source of error may be the non-independence or misrepresen-
tation of the sampling stations.  For example, within a
particular AQCR the vast majority of the sampling locations can
be concentrated within a single urban area, while the remaining
sites are distributed throughout the remainder of the region.
An equal weighting of the sampling information within the AQCR
may actually favor certain well-monitored districts and, as
such, misrepresent the entire AQCR.  Moreover, many of the
sites in that single urban area may provide equivalent or, in a
sense, redundant information in terms of trends or concentration
levels.  A logical solution may be to form a weighted average
of the sites within the AQCR according to spatial location or
by combining information within separate homogeneous groupings
such as business, industrial, residential, etc.
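
The effect of unequal site coverage on a composite average can be illustrated with a small Python sketch. The groupings, site values, and function names below are hypothetical, and averaging the group means is only one possible weighting, chosen here for simplicity.

```python
def site_average(groups):
    """Composite with equal weight per site."""
    values = [v for sites in groups.values() for v in sites]
    return sum(values) / len(values)

def group_balanced_average(groups):
    """Composite with equal weight per grouping (mean of group means)."""
    means = [sum(sites) / len(sites) for sites in groups.values()]
    return sum(means) / len(means)

# Hypothetical annual averages; the business district is heavily monitored.
groups = {
    "business": [100.0, 102.0, 98.0, 100.0],
    "industrial": [70.0],
    "residential": [40.0],
}

per_site = site_average(groups)
per_group = group_balanced_average(groups)
print(f"equal weight per site:  {per_site:.1f}")
print(f"equal weight per group: {per_group:.1f}")
```

In this made-up case the four business-district sites pull the per-site composite well above the group-balanced value, which is the kind of distortion the text warns about.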
    It can be seen that it is imperative to investigate and
consider the pollutant behavior and relative circumstances at
individual sites in the evaluation of regional trends.
6.  INTERPRETATION OF TRENDS
    The classification of the trend of an air pollutant is
a description of its historical behavior.  This can be done
by means of a fitted curve, an estimate of the rate of change
or a qualitative description such as upward or downward.  This
classification is only a starting point.  The reality of the
so-called trend and the possible explanations depend on many
factors, each of which must enter into the final analysis.
    First of all, steps should be taken to ensure that the
trend is not a product of changes in instrumentation, methodology,
site location, etc.  If any of these has changed, experience
may indicate the relative effect of each factor.
    Secondly, if the historical data record is only a partial
sampling of the entire time period studied, perhaps derived
by intermittent sampling, then the implication of the
apparent trend must be considered.  That is, is the historical
record representative of the true air quality history, or was
it influenced by unrepresentative transient phenomena?  This
evaluation may involve the investigation of the reality and
representativeness of the extreme measurements which are
causing the apparent change in air quality.
    Thirdly, the representativeness of the trend at a particular
site for a larger area must be considered.  A site located
in the central business district of an urban area may not be
representative of the entire city or of its AQCR.  This
qualification applies to the sites of the National
Air Surveillance Network.
    Fourthly, the trend is merely a representation of past air
quality.  Without accompanying data on meteorology and
emission patterns, the trend should not be extrapolated to
predict future concentration levels or continued direction of
change.

TABLE 2   QUANTILES OF THE SPEARMAN TEST STATISTIC(a)

     n    p = .900    .950     .975     .990     .995     .999
     4     .8000     .8000
     5     .7000     .8000    .9000    .9000
     6     .6000     .7714    .8286    .8857    .9429
     7     .5357     .6786    .7450    .8571    .8929    .9643
     8     .5000     .6190    .7143    .8095    .8571    .9286
     9     .4667     .5833    .6833    .7667    .8167    .9000
    10     .4424     .5515    .6364    .7333    .7818    .8667
    11     .4182     .5273    .6091    .7000    .7455    .8364
    12     .3986     .4965    .5804    .6713    .7273    .8182
    13     .3791     .4780    .5549    .6429    .6978    .7912
    14     .3626     .4593    .5341    .6220    .6747    .7670
    15     .3500     .4429    .5179    .6000    .6536    .7404
    16     .3382     .4265    .5000    .5824    .6324    .7265
    17     .3260     .4118    .4853    .5637    .6152    .7083
    18     .3148     .3994    .4716    .5480    .5975    .6904
    19     .3070     .3895    .4579    .5333    .5825    .6737
    20     .2977     .3789    .4451    .5203    .5684    .6586
    21     .2909     .3688    .4351    .5078    .5545    .6455
    22     .2829     .3597    .4241    .4963    .5426    .6318
    23     .2767     .3518    .4150    .4852    .5306    .6186
    24     .2704     .3435    .4061    .4748    .5200    .6070
    25     .2646     .3362    .3977    .4654    .5100    .5962
    26     .2588     .3299    .3894    .4564    .5002    .5856
    27     .2540     .3236    .3822    .4481    .4915    .5757
    28     .2490     .3175    .3749    .4401    .4828    .5660
    29     .2443     .3113    .3685    .4320    .4744    .5567
    30     .2400     .3059    .3620    .4251    .4665    .5479

    For n greater than 30 the approximate quantiles of rho may be obtained from

        w_p \approx \frac{x_p}{\sqrt{n - 1}}

    where x_p is the p quantile of a standard normal random variable obtained from
    Table 1 of the source (the normal quantiles appear as Table 3 of this guideline).

    SOURCE.  Adapted from Glasser and Winter (1961), with corrections.
    (a) The entries in this table are selected quantiles w_p of the Spearman rank
    correlation coefficient rho when used as a test statistic.  The lower quantiles
    may be obtained from the equation

        w_p = -w_{1-p}

    The critical region corresponds to values of rho smaller than (or greater than)
    but not including the appropriate quantile.  Note that the median of rho is 0.
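
For n greater than 30, the approximation in the note to Table 2 can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and it uses the Python standard library's `statistics.NormalDist` in place of a lookup in the normal table.

```python
import math
from statistics import NormalDist

def spearman_quantile_large_n(p, n):
    """Approximate p quantile of Spearman's rho: w_p = x_p / sqrt(n - 1)."""
    if n <= 30:
        raise ValueError("for n of 30 or fewer, read the quantile from Table 2")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    x_p = NormalDist().inv_cdf(p)   # p quantile of the standard normal
    return x_p / math.sqrt(n - 1)

upper = spearman_quantile_large_n(0.95, 50)
lower = spearman_quantile_large_n(0.05, 50)   # equals -upper by symmetry
print(f"n = 50: w_.95 = {upper:.4f}, w_.05 = {lower:.4f}")
```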

TABLE 3   NORMAL DISTRIBUTION(a)

       w_p       p      |     w_p       p     |     w_p       p
    -3.7190    .0001    |   -.4677     .32    |    .5244     .70
    -3.2905    .0005    |   -.4399     .33    |    .5534     .71
    -3.0902    .001     |   -.4125     .34    |    .5828     .72
    -2.5758    .005     |   -.3853     .35    |    .6128     .73
    -2.3263    .01      |   -.3585     .36    |    .6433     .74
    -2.1701    .015     |   -.3319     .37    |    .6745     .75
    -2.0537    .02      |   -.3055     .38    |    .7063     .76
    -1.9600    .025     |   -.2793     .39    |    .7388     .77
    -1.8808    .03      |   -.2533     .40    |    .7722     .78
    -1.7507    .04      |   -.2275     .41    |    .8064     .79
    -1.6449    .05      |   -.2019     .42    |    .8416     .80
    -1.5548    .06      |   -.1764     .43    |    .8779     .81
    -1.4758    .07      |   -.1510     .44    |    .9154     .82
    -1.4395    .075     |   -.1257     .45    |    .9542     .83
    -1.4051    .08      |   -.1004     .46    |    .9945     .84
    -1.3408    .09      |   -.0753     .47    |   1.0364     .85
    -1.2816    .10      |   -.0502     .48    |   1.0803     .86
    -1.2265    .11      |   -.0251     .49    |   1.1264     .87
    -1.1750    .12      |    .0000     .50    |   1.1750     .88
    -1.1264    .13      |    .0251     .51    |   1.2265     .89
    -1.0803    .14      |    .0502     .52    |   1.2816     .90
    -1.0364    .15      |    .0753     .53    |   1.3408     .91
     -.9945    .16      |    .1004     .54    |   1.4051     .92
     -.9542    .17      |    .1257     .55    |   1.4395     .925
     -.9154    .18      |    .1510     .56    |   1.4758     .93
     -.8779    .19      |    .1764     .57    |   1.5548     .94
     -.8416    .20      |    .2019     .58    |   1.6449     .95
     -.8064    .21      |    .2275     .59    |   1.7507     .96
     -.7722    .22      |    .2533     .60    |   1.8808     .97
     -.7388    .23      |    .2793     .61    |   1.9600     .975
     -.7063    .24      |    .3055     .62    |   2.0537     .98
     -.6745    .25      |    .3319     .63    |   2.1701     .985
     -.6433    .26      |    .3585     .64    |   2.3263     .99
     -.6128    .27      |    .3853     .65    |   2.5758     .995
     -.5828    .28      |    .4125     .66    |   3.0902     .999
     -.5534    .29      |    .4399     .67    |   3.2905     .9995
     -.5244    .30      |    .4677     .68    |   3.7190     .9999
     -.4959    .31      |    .4959     .69    |

SOURCE.  Abridged from Tables 3 and 4, pp. 111-112, Pearson and Hartley (1962).
(a) The entries in this table are quantiles w_p of the standard normal random
variable W, selected so that P(W <= w_p) = p and P(W > w_p) = 1 - p.

TABLE 4   PERCENTILES OF THE t DISTRIBUTION

     df    t.60    t.70    t.80    t.90    t.95    t.975    t.99    t.995
      1    .325    .727   1.376   3.078   6.314   12.706  31.821   63.657
      2    .289    .617   1.061   1.886   2.920    4.303   6.965    9.925
      3    .277    .584    .978   1.638   2.353    3.182   4.541    5.841
      4    .271    .569    .941   1.533   2.132    2.776   3.747    4.604
      5    .267    .559    .920   1.476   2.015    2.571   3.365    4.032
      6    .265    .553    .906   1.440   1.943    2.447   3.143    3.707
      7    .263    .549    .896   1.415   1.895    2.365   2.998    3.499
      8    .262    .546    .889   1.397   1.860    2.306   2.896    3.355
      9    .261    .543    .883   1.383   1.833    2.262   2.821    3.250
     10    .260    .542    .879   1.372   1.812    2.228   2.764    3.169
     11    .260    .540    .876   1.363   1.796    2.201   2.718    3.106
     12    .259    .539    .873   1.356   1.782    2.179   2.681    3.055
     13    .259    .538    .870   1.350   1.771    2.160   2.650    3.012
     14    .258    .537    .868   1.345   1.761    2.145   2.624    2.977
     15    .258    .536    .866   1.341   1.753    2.131   2.602    2.947
     16    .258    .535    .865   1.337   1.746    2.120   2.583    2.921
     17    .257    .534    .863   1.333   1.740    2.110   2.567    2.898
     18    .257    .534    .862   1.330   1.734    2.101   2.552    2.878
     19    .257    .533    .861   1.328   1.729    2.093   2.539    2.861
     20    .257    .533    .860   1.325   1.725    2.086   2.528    2.845
     21    .257    .532    .859   1.323   1.721    2.080   2.518    2.831
     22    .256    .532    .858   1.321   1.717    2.074   2.508    2.819
     23    .256    .532    .858   1.319   1.714    2.069   2.500    2.807
     24    .256    .531    .857   1.318   1.711    2.064   2.492    2.797
     25    .256    .531    .856   1.316   1.708    2.060   2.485    2.787
     26    .256    .531    .856   1.315   1.706    2.056   2.479    2.779
     27    .256    .531    .855   1.314   1.703    2.052   2.473    2.771
     28    .256    .530    .855   1.313   1.701    2.048   2.467    2.763
     29    .256    .530    .854   1.311   1.699    2.045   2.462    2.756
     30    .256    .530    .854   1.310   1.697    2.042   2.457    2.750
     40    .255    .529    .851   1.303   1.684    2.021   2.423    2.704
     60    .254    .527    .848   1.296   1.671    2.000   2.390    2.660
    120    .254    .526    .845   1.289   1.658    1.980   2.358    2.617
    inf    .253    .524    .842   1.282   1.645    1.960   2.326    2.576

Adapted by permission from Introduction to Statistical Analysis (2d ed.) by W. J.
Dixon and F. J. Massey, Jr., copyright 1957, McGraw-Hill Book Company, Inc.
Entries originally from Table III of Statistical Tables by R. A. Fisher
and F. Yates, 1938, Oliver and Boyd, Ltd., London.

TABLE 5  QUANTILES OF A CHI SQUARE RANDOM VARIABLE WITH
                      ONE DEGREE OF FREEDOM

              Quantile, p      Value
                 .750           1.323
                 .900           2.706
                 .950           3.841
                 .975           5.024
                 .990           6.635
                 .995           7.879
                 .999          10.830

                          REFERENCES

1.  Federal Register, 36, No. 158, page 15490, August 14, 1971.

2.  "Guidelines for the Evaluation of Air Quality Data," U.S.
    Environmental Protection Agency, Office of Air Quality
    Planning and Standards, Research Triangle Park, N.C.,
    OAQPS No. 1.2-014, January 1974.

3.  Conover, W. J., "Practical Non-Parametric Statistics,"
    John Wiley & Sons, Inc., N.Y., 1971.

4.  Torrie, J. H. and Steel, R. G., "Principles and Procedures
    of Statistics," McGraw-Hill Publishing Co., Inc., N.Y., 1960.

5.  "National Air Monitoring Program:  Air Quality and Emission
    Trends - Annual Report, Volume I," U.S. Environmental
    Protection Agency, Office of Air Quality Planning and
    Standards, Research Triangle Park, N.C.

6.  Merz, P. H., Painter, L. J., Ryason, P. R., "Aerometric
    Data Analysis - Time Series Analysis and Forecast and
    Atmospheric Smog Diagram".

7.  Tiao, G. C., Box, G. E. P., and Hamming, W. J., "Analysis
    of Los Angeles Photochemical Smog Data:  A Statistical
    Overview," Technical Report #331, Department of Statistics,
    University of Wisconsin, Madison.
