MIDWEST RESEARCH INSTITUTE
                     METHODOLOGIES  FOR  DETERMINING  TRENDS IN
                              WATER  QUALITY  DATA

                                       by

                                 Karin M. Bauer
                               William D.  Glauz
                                 Jairus D.  Flora
                                  FINAL  REPORT
                                  July 3,  1984

                EPA Contract No.  68-02-3938, Assignment  No.  29
                          MRI Project No. 8205-5(29)
                                 Prepared for

                Industrial Environmental Research  Laboratories
                     U.S. Environmental Protection Agency
                 Research Triangle Park, North Carolina  27711

                           Attn:  Ms. Susan Svirsky
                             Task Manager (WH-553)

MIDWEST RESEARCH INSTITUTE 425 VOLKER BOULEVARD, KANSAS CITY/MISSOURI 64110 • 816 753-7600

-------
                                  PREFACE
     Section 305(b)  of the U.S.  Clean Water  Act  requires that the States
report biennially on  the  quality of their navigable waters.  To assist the
States in  the  preparation of  these reports,  the Environmental Protection
Agency issues  guidance information.  This  report on water quality trend
determination was prepared by  Midwest Research Institute for use by EPA as
a portion  of the  guidance information for helping the  States  in preparing
their 1984  reports.   The  authors are indebted to the several EPA reviewers
who made many  thoughtful  and  worthwhile suggestions to improve the earlier
draft.

-------
                             TABLE OF CONTENTS
I.         Introduction 	    1

II.        Purpose	    4

III.       Important Considerations  	    6

               A.   What Is A  Trend?	    6
               B.   What Is A  Change?	    6
               C.   Seasonal Effects  	    7
               D.   What Comprises  An  Observed Series? 	    7
               E.   How Much Data?	   10
               F.   How Often?	   11
               G.   Hypothesis Testing	   12
               H.   Estimation	   14
               I.   The Normal Distribution	   15
               J.   Parametric or Distribution-Free?  	   20
               K.   Plotting	   22
               L.   Organization of this  Document	   23

IV.        Parametric Procedures	   27

               A.   Methods to Deseasonalize  Data	   27
               B.   Regression Analysis	   28
               C.   Student's T-Test  	   32
               D.   Trend and  Change	   38
               E.   Time Series  Analysis	   39

V.         Distribution-Free Methods	   41

               A.   Runs Tests  for  Randomness	   41
               B.   Kendall's Tau Test	   45
               C.   The Wilcoxon Rank  Sum Test (Step  Trend)	   50
               D.   Seasonal Kendall's  Test  for Trend	   57
               E.   Aligned Rank Sum Test for Seasonal Data
                      (Step Trend)	   64
               F.   Trend and  Change	   69

VI.        Special Problems	   71

               A.   Missing Data	   71
               B.   Outlying Observations	   72
               C.   Test for Normality	   73
               D.   Detection  Limits  	   75
               E.   Flow Adjustments	   76

Bibliography  	   78

Appendix	   82

-------
                               List of Figures
Figure                              Title                             Page
 1        Composite Series	      8
 2        Sample Trend Line  	      9
 3        Two Normal Distributions  of  a  Variable X	     16
 4        Standard Normal Distribution	     17
 5        Decision Tree	     24
 6        Sample Linear  Regression  Line  	     29
 7        Monthly Concentrations of Total Phosphorus	     58
 8        Plot of Percentage Violations  Versus  Time 	     61
 9        Example of Plot on  Probability Paper	     74

-------
                               List of Tables

Table                               Title                             Page

 1        Upper Tail Probabilities for the Standard Normal
            Distribution	   19

 2        Two-Tailed Significance  Levels  of  Student's T  	   33

 3        The 5% (Roman Type) and 1% (Boldface Type) Points
            for the Distribution  of F	   37

 4        r-Tables Showing 5%  Levels  for Runs Test	   42

 5        Upper Tail Probabilities for the Null Distribution
            of Kendall's  K Statistic	   48

 6        Upper Tail Probabilities for Wilcoxon's Rank Sum W
            Statistic	   53

-------
                         SECTION I.   INTRODUCTION
     Water quality  reports  prepared by the States  and  jurisdictions under
Section 305(b) of the  Clean Water Act contain information on a wide variety
of parameters.  Because  the reports cover a relatively short 2-year period,
it is  often  difficult  to determine with  reasonable certainty whether water
quality  has  improved,  remained  constant,  or degraded  from  one reporting
period to the next.   In order to assess water quality trends, methods must be
used which differentiate a consistent trend in a given direction from natural
cyclic or seasonal changes or occasional excursions from the norm.

     Generally, water  quality  data are  affected  by seasonal  or  cyclic
effects,  an  episodic or regular  effect,  a long-term monotonic trend, and
random noise.  The  purpose  of this document is to provide guidance in the
use  of  statistical methods to separate out these effects  and to test  for
each of them.

     The statistical methods described in this document provide the means for
the analyst  to make this consistent trend  assessment by  using and testing
concepts  of  data  validity  and  significance.   These statistical techniques
will allow the analyst to detect  the presence of  long-term trends and to put
confidence intervals on the magnitude of trends or changes.

     Once a  specific statistical  approach is established, it can be used in
succeeding years.   As  a result,  the data-gathering process itself may  be
modified, thereby increasing the quality and value of future data.   Another
potential benefit is the discovery that historical data can have previously
unappreciated value; retrospective analyses then become possible.

-------
     In principle,  statistical  methods  can  be applied  to all the  data
collected during the  water quality monitoring process.  These data include
chemical  parameter  values for water,  sediment,  and tissue samples (e.g.,
DO,  pH,  nutrients, trace  metals,  specific  organic compounds);  physical
parameter values  (e.g.,  turbidity,  suspended solids, light penetration);
biological  parameter  values  (e.g.,  macroinvertebrate populations,  other
aquatic species);  pollutant discharge or  source  data; and derived values
which combine  several parameters  in  a composite evaluation of water quality
(e.g.,  trophic indices,  water quality  indices,  violation  statistics).
However, the water  quality analyst is advised to  apply  statistical methods
initially to   only  those  parameters  which  are of  greatest  interest or
priority  and  which offer  the  most  extensive  and  reliable data.   As the
analyst becomes  more  familiar with  statistical  methods,  he will  then  be
able to expand the scope of analysis  to most or all of the parameters.

     Suggestions for  the  scope of an initial  effort to  use statistical
methods in determining water quality trends are as follows:

     1.   The   analyst  should select for analysis  those chemical  water
quality parameters  which  are  of  greatest  importance to  the  State,  to
selected areas of the State, or to the designated use in question.

     2.   The  analyst  should consider  important  biological parameters  such
as macroinvertebrate populations  and  distributions.

     3.  A  minimum  of two or three  of the important parameters,  including
at least  one  from  each  of the categories  defined  in  (1) and (2),  should be
statistically  analyzed  by methods designed  to obtain valid  estimates  of
trends.

     4.  The analyst  will  probably find that the period of these data bases
will  need to  be  extended beyond the 2-year  period specifically  called for
in the 305(b)  reports,  so  parameters  should be  selected for  which
comparable  data were  collected  in earlier years.  Data from previous 305(b)
reporting periods could be used to augment the current period data.

-------
     These analyses  should be conducted  first for specific water  bodies
rather than statewide.   Further,  it may be necessary to initially limit an
analysis to one  part of a specific water  system,  with  later extension to
the remainder of the system.   The results of data analysis should tell the
analyst which is the appropriate and valid scope.

     As mentioned  earlier, the  analyst may  find it appropriate to  use a
statistical approach to  modify ongoing  data-gathering activities.   In
addition to  improving the quality  of  the water quality data  which are
collected, this  may reduce the cost of data  collection because data will
now be acquired more selectively.

     In summary,  there are two  basic perspectives  from which  the analyst
should view the  results  of the application of statistical methods to water
quality data:   the  importance  of the detected trends themselves, for what
they  say  about  both the  State's  water quality  and  its water  quality
programs;   and  the impact of the statistical  methods  and their results on
ongoing water quality data collection activities.

-------
                            SECTION II.   PURPOSE
     The purpose of  this  document is to provide guidance to the States and
Regions in analyzing  for  trends  in their water quality  data.   It is not a
statistical treatise.   Indeed,  some statisticians  may be concerned  at  the
looseness with which  terms  and  symbols seem to be  used;  the lack of mention
of all  the  ifs,  ands, and buts;  and the possibility that the reader may end
up indiscriminately  using one test when another  is  more  proper or more
powerful.

     Rather, this  document  is aimed at  persons with  little background in
statistics, or even  much  in algebra.   Most of the  methods  discussed require
only simple arithmetic and the use of standard tables.  As  such,  it is a "how
to" approach, with  little or no theory.   In that  respect,  it is  rather  like
the statistical  programs  available  now on most computers, which are designed
so that  a  nonstatistician can readily feed in data,  select the statistical
method to be used,  and then  read and interpret the  answers.   In fact, many of
the methods are  available as computer programs, and  examples  of these  are
given in this document.

     To be sure, there  are  distinct dangers in this approach,  just as there
are when nonstatisticians use the "canned" computer programs.   It is always
possible to  use  an  inappropriate or incorrect test.  This  document will
therefore warn the reader of  the  major limitations and potential pitfalls to
be avoided.  Ideally, of  course,  the analyst would always  select the "best"
test; however, the  "best" test  is likely to require extra  effort, access to
special computer programs,  and the  like.  Realizing this, the  analyst may be
tempted to give up and do nothing at all, or,  worse, make arbitrary judgments
about "trends" in his data, with  no justification.  It seems better,  in such
cases, to  have some  statistical  backup, even  if it is limited, before such
determinations are made.

-------
     Of course, if  the  analyst does have some background in statistics,  has
a  statistician  available to assist  or  as a consultant, or  has  access  to
statistical computer programs,  these advantages  should be applied.   In such
cases, the  material  in  this document should  assist  the analyst in making
even better use of his available resources.

     Finally,   for  the   reader  desiring  further  information,  all  the
descriptions of the  methods  and tests given  here include references to the
(more rigorous) statistical  literature.

-------
                   SECTION III.   IMPORTANT CONSIDERATIONS
     A.    What Is A Trend?

     The concept of "trend" is difficult to define.   Generally,  one thinks of
it as a  smooth,  long-term movement in an ordered series of measurements over
a "long" period  of time.   However,  the term "long" is rather arbitrary and
what  is  long for  one  purpose may be short for  another.   For example, a
systematic movement  in  climatic  conditions over a century would be regarded
as a trend for most purposes, but might be part of an oscillatory or cyclical
movement taking  place  over geological  periods of time.   In  speaking of a
trend, therefore,  one  must bear in  mind the length  of the time period to
which the  statement  refers.   For our purposes, we shall define a trend to be
that  aspect  of  a series of observations that exhibits a steady increase or
decrease over  the length  of time of  the observations, as opposed  to a
"change," described next.

     B.    What Is A Change?

     It  is important to distinguish  between the terms "trend" and "change";
they  are often  incorrectly used interchangeably in water  quality reports.
However, they are not equivalent terms  and should be carefully distinguished.
A "change"  is a sudden difference in water quality associated with a discrete
event.   For  example, suppose  for  a period  of time at a  given station,
concentrations  of some pollutants  were persistently  high.   Then  a new
treatment  facility  is  placed into  operation  and,  from  then  on,  the
concentrations of  these same pollutants are lower.   This  sudden improvement
in stream water  quality should be referred to as a change, not a  trend.  A
change is  sometimes  called a "step trend" and,  for  clarity,  the other is
called a "long-term trend."

-------
     C.    Seasonal Effects

     Observations taken  over time at  regular intervals  often  exhibit a
"seasonal" effect.  By this,  a  regular cycle is meant.  For monthly data a
seasonal effect  is a  regular change  over the year that more or  less repeats
itself in succeeding years.   This  might be associated with  different  flow
rates corresponding to the  seasonal  pattern of precipitation and melting in
the  watershed  above  a  particular observation  station.  Here,  seasonal
components will  be restricted to  refer to annual cycles, although  in other
applications with different  frequencies  of observation, cycles  could occur
quarterly, monthly or daily.

     D.    What Comprises  An Observed  Series?

     A series of observations can  be  broken into several parts or components,
as  illustrated  in Figure 1 below.   In this  figure,  each component of the
series has  been  generated separately.  The seasonal component is represented
by a sine wave plotted in part a.   A  discrete change occurred at month  18 and
is  shown  in part b,  along with  a  linear  trend  represented  by the straight
line.  Finally,  part c shows an irregular or random component.   These  parts
are  summed  to give the series as it would be observed  in part d of  Figure 1.
In practice, one observes  a series of data such as illustrated  in part d of
Figure 1.  The objective  of  trend  analysis is to  determine  whether such a
series has a significant long-term trend or a discrete change associated with
a particular event.
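As a purely illustrative sketch (the amplitudes, the change point at month 18, and the slope below are invented for illustration, not read off the figure), the four components of such a composite series can be generated and summed in a few lines:

```python
import math
import random

def composite_series(n_months=64, seed=1):
    """Build a series as the sum of four components like those of
    Figure 1: seasonal cycle + step change + linear trend + noise."""
    random.seed(seed)
    series = []
    for t in range(n_months):
        seasonal = 2.0 * math.sin(2 * math.pi * t / 12)  # 12-month cycle
        change = 1.5 if t >= 18 else 0.0                 # step change at month 18
        trend = 0.05 * t                                 # steady long-term trend
        noise = random.gauss(0, 0.5)                     # irregular component
        series.append(seasonal + change + trend + noise)
    return series

series = composite_series()
```

Trend analysis works in the opposite direction: given only the summed series, it tries to recover the trend and change components.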

     It is  often the  case that  a trend  is  not immediately obvious from a
series  of  "raw"  measurements.   Figure 2  depicts  a  series  of ammonia
concentration values  measured monthly  at a single  station.  Statistical
techniques can be used to obtain the  "deseasonalized" data (more will  be said
about this  later), and the trend  line for  the  deseasonalized data.  (This
particular  example was taken from  STORET).  The  horizontal axis represents
time (months) and the  vertical  axis  is the  ammonia concentration in mg/1.
Trend analysis techniques attempt to sort out the data displayed in Figure 2
into components  such as  those  illustrated  in  Figure  1  using statistical

-------
          [Figure 1 here:  four stacked panels sharing a MONTH axis
           (0 to 64); (a) Seasonal Component, (b) Trend Component With
           Change, (c) Random Component, (d) Water Quality Data.]

                       Figure 1.  Composite Series

-------
          [Figure 2 here:  ammonia concentration (0 to 0.720 mg/l)
           versus Month (0 to 72), showing observed data, deseasonalized
           data, and a linear regression of the deseasonalized data.]

                               Figure 2.  Sample trend line.

-------
techniques to test for the presence of a trend while adjusting for a seasonal
effect.   The  techniques discussed  as  trend analysis  provide methods to
determine whether the  data  show a trend, a change  (often  referred to as a
step trend),  its direction,  and  whether such a change  or trend is sta-
tistically significant—larger  than can be ascribed to random variation.
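One simple way to deseasonalize monthly data (offered here only as an illustration; the deseasonalized series of Figure 2 came from STORET, and its exact method is not described in this document) is to subtract each calendar month's long-run mean from the observations for that month:

```python
def deseasonalize(values):
    """Remove a monthly seasonal effect by subtracting each calendar
    month's mean.  `values` is a list of monthly observations starting
    at month 0; missing months are not handled in this sketch."""
    # group observations by calendar month (0..11)
    by_month = {m: [] for m in range(12)}
    for t, v in enumerate(values):
        by_month[t % 12].append(v)
    month_mean = {m: sum(v) / len(v) for m, v in by_month.items() if v}
    # residual = observation minus its month's mean
    return [v - month_mean[t % 12] for t, v in enumerate(values)]
```

A series that is nothing but a repeating annual cycle deseasonalizes to (essentially) zero, leaving any trend or change to stand out.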

     Most of the  methods  presented later in this document deal with trends,
although some tests  dealing with  changes are  also included.   We concentrate
on simple techniques.  More detailed analyses  of such data are referred to as
time series  analysis and can become quite complex.   These  more detailed
methods are discussed briefly in Section IV-E.

     E.    How Much Data?

     Three or  four  measurements  do not make  a  trend.   Imagine a  friend
flipping a coin  and  getting "heads" four times in a row.  We might consider
that to  be an  interesting  result, but  not one that would  prompt  us to
conclude that the coin had heads on both sides.   The amount of data is just
not convincing enough.  However,  if the friend continued  flipping  the coin
and obtained 10 heads in a row, we would then  seriously suspect that the coin
was indeed  unusual.   This  assumes, of course, that  we  can  rule out the
possibility that  the  friend is somehow manipulating the outcome of the coin
tosses and thus biasing the results.

     The amount  of  water  quality data needed  depends on the frequency  of
collection and  the  period  of seasonality  considered.  Theoretically, the
amount of data needed  can be  quantified using  the laws of  probability if the
specific requirements  of the  analysis are stated.   Such  detailed  sample  size
calculations are  quite specific  to each situation.   However, a general
discussion of the considerations can give some guidance.   Referring to part a
of  Figure 1,  one can  see  that a  short series  of  six to eight  monthly
observations would give a  rather misleading impression  of the series.   It
could show a  marked  increasing trend, a decreasing one,  or one of two types
of curvature.  Clearly, to identify a cyclical pattern, the number of

-------
observations must be large enough to cover two complete cycles and preferably
more.  Thus, with monthly data and a seasonal effect,  a  minimum of 24 to
30 months of data would be needed.   On the other hand,  if  data are aggregated
to an annual basis, one might  be able to  identify a trend based on as few as
5  or 6 years.   However,  such  an  identification would not be able  to
distinguish between a discrete change and a long-term trend,  nor whether the
observed trend might be part of a longer cycle.

     The more  data,  the  more  components of a  series  could be identified.
One  rule  of thumb is that there should be 10 observations  (some authors
prefer 20) for each component to be tested for or adjusted for.   In addition,
for cyclical effects, the series should cover at least two full  periods.   The
2 years of data reported in 305(b) would be just barely sufficient to apply a
technique that  includes seasonality.  It would be preferable  to include data
from previous reporting periods, provided that they are compatible.

     F.    How Often?

     The sampling scheme,  i.e., the frequency of data  collection,  and  the
number  of  sites and  parameters,  dictates to a  great  extent the type of
statistical procedures which should be  used to analyze the data.  Most water
quality characteristics (physical,  chemical,  biological)  are collected on a
monthly basis; others are obtained once a year.

     Examples of water quality characteristics  measured in  the EPA's basic
water monitoring program include:

-------
                                               Sampling frequency
                Characteristic                In rivers and streams
          Flow                                       monthly
          Temperature                                monthly
          Dissolved oxygen                           monthly
          pH                                         monthly
          Conductivity                               monthly
          Fecal coliform                             monthly
          Total Kjeldahl nitrogen                    monthly
          Nitrate plus nitrite                       monthly
          Total phosphorus                           monthly
          Chemical oxygen demand                     monthly
          Total suspended solids                     monthly
          Representative fish/shellfish              annually
            tissue analysis
The  important  point is  that  the trend  analysis  must consider  sampling
frequency when  determining the appropriate type of  trend analysis.   If a
cyclical component  is  suspected,  the  frequency must be a relatively small
fraction of the period in order to estimate the cycle.   In addition,  data for
at  least  two cycles would be needed.   On the other  hand,  if data are
collected at intervals  as  long as or longer than  a  possible cycle,  it is
important to ensure  that data collection times are at the same point in the
cycle.  Otherwise,  spurious trends  might appear or the variability  of  the
data might be  substantially  increased.   Care should be taken to ensure  that
data collection is  under the  same conditions each  time a sample is obtained.

     G.    Hypothesis Testing

     In  hypothesis  testing,  a statement  (hypothesis) to be  tested is
specified.   It  is generally  stated  with enough detail so that probabilities
of any  particular sample can be calculated  if  it is true.  The hypothesis to
be  tested  is referred  to  as   the  "null" hypothesis.  An example is the
hypothesis that a set  of data are  independent and identically distributed
according to the normal distribution with mean zero and variance

-------
one.  This would  correspond  to the hypothesis of no trend.   (For most water
quality parameters a  different mean and variance would be appropriate.)  A
second hypothesis—the  alternative—may be  specified.   This represents a
different situation  that one  wishes  to detect,  for  example,  a trend or
tendency for the concentration of a pollutant in water to decrease with time.

     To test a  hypothesis, one observes a sample of data and calculates the
probability of  the data  assuming  that the  hypothesis is true.   If the
calculated probability  is  reasonably  large,  then the data do not contradict
the null  hypothesis,  and it  is  not rejected.   On the other hand,  if the
observed data are  very  unlikely  under the null hypothesis, one  is faced with
concluding either  that  a very unlikely event has occurred,  or that the null
hypothesis is wrong.  Generally, if the observed  data  are less  likely than a
prespecified level, one agrees to conclude that the null hypothesis  is wrong
and to reject it.

     In testing a  hypothesis  one can make two types of errors.  First, one
can reject the  null hypothesis although it is true:  this is called  an error
of  the first  kind or a  Type  I error.   The probability of a Type I error is
often denoted by the letter α.  An example of a Type I error would be stating
that a trend existed when the data  were actually  random.  On the other hand,
one can erroneously accept the null hypothesis when in fact it is false; this
is called an error of the second kind or a Type II error.   The probability of
a Type II error is often denoted by the letter β.  An example  of a  Type II
error would  be the failure to detect  a real trend in water quality.   To
decide whether  to reject  a  null hypothesis,  probability levels usually
expressed as percentages are  used.  These probability  levels are also called
significance levels and are commonly chosen to be 1, 5, or 10%.   In addition,
confidence levels   are defined as 100% minus the  significance levels, e.g.,
99, 95, or 90%, respectively.

     Consider again the friend with the coin.  If the coin and the friend are
unbiased, there is an equal chance  that on any given toss the coin will come
up  heads or  tails.  The probability,  usually denoted by p, that it  will be
heads is thus  50% or 1/2. The probability of getting heads four times in a
row is:

-------
          (1/2)(1/2)(1/2)(1/2) = 1/16, or 6.25%.

That is, the odds are only 1 out of 16 that this  result would occur purely by
chance.   If the  friend  repeated the experiment 16 times, only once (on the
average) would he get heads 4 times in a row.

     If one agrees  to  reject the hypothesis that p  = 1/2 if he gets four
heads in a  row,  then the significance  level of  0.0625 is the mathematical
probability of failing to accept the null hypothesis  that the coin has both a
head  and  a  tail that  are equally  likely  to  come  up (based on  this
experiment).   An a-value  of 0.0625  could be  considered statistically
"significant."  If one decides to reject only if heads appeared 10 times in a
row, the probability of the friend getting heads 10 times  in a row is 1/2
multiplied by  itself 10 times, which  is  approximately  0.001, or 1 chance out
of 1,000.   Such a finding would usually be considered "highly"  significant,
indeed.
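The coin arithmetic above can be checked directly:

```python
def prob_heads_run(n):
    """Probability of n heads in a row with a fair coin: (1/2)^n."""
    return 0.5 ** n

p4 = prob_heads_run(4)    # 1/16, or 6.25%
p10 = prob_heads_run(10)  # 1/1024, approximately 0.001
```

A result at the 0.0625 level might be judged significant; one at roughly the 0.001 level would be judged highly significant.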

     H.    Estimation

     In addition to testing a hypothesis (of no trend against the alternative
that a trend exists for example), one may wish  to estimate the magnitude of a
trend.   Such  a  measure  might be a change  in concentration  per year.   Two
aspects of  statistical  estimation  need to be kept in  mind.  First, one may
obtain a point  estimate,  which  is a  statistic that  gives the "best" single
value for the  size  of  the trend.  However, this  by itself is of little use.
It  needs  to be  coupled with a  test to determine whether that  value is
significantly different  from  zero.   An additional  formulation is to give a
range or an interval estimate for the value in  question.  Such an interval  in
statistics is called a  confidence interval.   This  is an interval calculated
from the data  in such  a manner  that  it would  contain the true but unknown
value of  the  measure   in  a specified  proportion  of the samples.   The
proportion  of  the  samples that would yield an  interval that contains the

-------
measure is  called the confidence level.  As mentioned previously, it relates
to  the  significance level, α,  in  that it is 100% minus  the  significance
level.

     In general, one would want to estimate the magnitude of any trend and to
place confidence limits on the magnitude of the trend.   If the sample size is
small, and/or  the variance  is quite large, these confidence  limits will be
very  wide,  indicating  that the trend is not  well  estimated.   If  the
confidence  limits are quite close, one can be  confident  that the trend is
well estimated.
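As a sketch of interval estimation for a linear trend (this anticipates the regression method of Section IV-B; the large-sample value 1.96 is used in place of the exact Student's t quantile, so the interval is only approximate):

```python
import math

def trend_slope_ci(y, z=1.96):
    """Least-squares slope of y against time 1..n, with an approximate
    confidence interval (normal quantile z in place of Student's t).
    Returns (slope, lower limit, upper limit)."""
    n = len(y)
    x = list(range(1, n + 1))
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx                              # point estimate of trend
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)       # residual variance
    se = math.sqrt(s2 / sxx)                       # standard error of slope
    return slope, slope - z * se, slope + z * se
```

If the resulting interval excludes zero, the estimated trend is significantly different from zero at (approximately) the corresponding level; a wide interval signals that the trend is not well estimated.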

     I.    The Normal Distribution

     The results of performing  an  experiment such as flipping a coin or mea-
suring the  pH  of a  water sample can be tabulated as  it is repeated over and
over again.   If the frequency of each observed result is then plotted against
the result  itself,  the  graph is called the frequency distribution, or dis-
tribution for  short.  A very important type of  distribution  often  used in
conjunction with trend  analysis (and  many other types of  statistics)  is
called the normal distribution.

     The normal  distribution (also  called  the  Gaussian distribution) has
dominated statistical practice  and theory for centuries.   It  has  been  ex-
tensively and  accurately  tabulated,  which makes it convenient  to  use  when
applicable.   Many variables such as  heights of men, lengths of ears of corn,
and weights of fish are approximately normally  distributed.  In some cases
where the  underlying distribution is  not normal,  a  transformation of the
scale of measurement may  induce approximate normality.  Such  transformations
as the square  root  and  the  logarithm of the variable are commonly used.  The
normal distribution may then be applicable to the transformed values, even if
it is not applicable to the original data.

-------
     In many  practical  situations  the  variable of  interest is  not  the
measurement itself  (e.g., the weight of  a fish),  but the  average  of the
measurements (e.g., the  average  weight  of 100  fish  in  a survey).  Even if
the  distribution  of  the original  variable  is far  from  normal,  the
central-limit theorem states that  the distribution of sample averages tends
to become  normal  as  the sample size increases.  This is  perhaps  the  single
most important reason for the use of the normal distribution.

     The normal  distribution is  completely determined  by  two quantities:
its average or  mean  (μ) and its variance  (σ²).  The square root of σ², (σ),
is called  the  standard  deviation.   The  mean locates the  center of the  dis-
tribution  and  the standard deviation measures  the spread or variation  of
the  individual  measurements.   Figure 3 depicts two normal  distributions,
both with mean 0, but with different standard deviations.
            [Figure:  two bell-shaped curves centered at 0; the vertical
            axis shows relative frequency (0 to .4) and the horizontal
            axis the value of X (-3 to 3).]

            Figure 3.  Two normal distributions of a variable X.
                       Solid line:   μ = 0, σ = 1
                       Dotted line:  μ = 0, σ = 1.5
     The values  of  the mean and variance  of water  quality  parameters  could
be  practically  anything, depending upon the parameter,  when and where  it
was  measured,  etc.   In order to simplify  calculations it  is desirable  to
change (transform)  the data to obtain a mean  of  0  and a  variance of 1.   To
do so, compute a new variable
                                   16

-------
                                 Z = (X - μ)/σ

where X  is  the  original  variable with mean μ and variance σ².   Then Z has a
mean of  0 and a variance of 1.  The quantity Z is called the standard normal
deviate, and has what is known as the standard normal distribution, which is
extensively tabulated.   (From Z,  X can be computed as X = σZ + μ.)

     In statistical testing one  compares a computed value with values in a
table.   There are basically two types of tests—one-sided and two-sided.  In
the coin flipping example where we were interested only in the result involv-
ing all  heads,  one  would use a one-sided test.   However, if the whole argu-
ment were repeated  without  reference  to heads specifically, but only to the
fact that the same outcome was achieved four times or ten times in a row, the
odds would  all  change.  The chance of getting four  like results  in a row is
only (1/2)(1/2)(1/2) = 1/8,  because the first  flip is  immaterial;  it only
matters that the  last  three match the first one, whatever  it was.  In this
case one would  use  a two-sided test.  One often  refers to the one-sided  test
as  using one tail  (end)  of a distribution (see  Figure 4 below),  and the
two-sided test  as  looking  at both tails of the  distribution.  (By the way,
the term, two-sided,  really has  nothing to do with the two sides of a coin,
nor does a tail of the distribution refer to the tail side of a coin.)
            [Figure:  standard normal curve with the upper tail beyond a
            given value of Z shaded.]

                   Figure 4.  Standard normal distribution
                                (μ = 0, σ = 1)
                                   17

-------
     Table 1 below  shows  one way  of tabulating  the  standard  normal
distribution.   The following  is  an  explanation of how to use the table for
testing.   Each number in the table corresponds to the shaded area in Figure 4
for a particular, positive value  of Z.

     The water quality analyst can use the table either for a two-sided test
(e.g., is there  a  trend?) or for a one-sided test (e.g.,  is the trend in-
creasing?).   The two-sided  test  would normally be used but, for simplicity,
the one-sided test will  be explained first.

     Suppose a calculated Z from some test  is  1.53,  that  the alternative
hypothesis calls for  a  one-sided test, and we want to know if Z is signifi-
cant.  We then ask, what  is the  probability  of obtaining a Z greater than or
equal to  1.53?   Referring to  Table 1,  in the  left  hand column choose the line
at 1.5; across the  top,  choose the column at  0.03, since 1.53 =  1.5 + 0.03.
The  number at  the  intersection of the line  and  the  column is 0.0630.   The
probability of obtaining a Z greater than or equal to 1.53, purely by chance,
is therefore 0.0630.   If one had previously chosen a significance level of 5%
(α = 0.05),  this value  of Z would not be  significant because the obtained
value, 0.0630, is  greater than the chosen significance level of 0.05.   That
is, we could  not  say confidently that  the trend  being tested was real; the
data could well be just random numbers.

     To use the  table for a  two-sided test,  with  the same value of Z, we
first obtain the same probability from the table.   However, the probability
we have to  use is  that of  obtaining  a Z greater than or equal to 1.53, or
less than or equal  to -1.53.   This probability  is twice 0.0630, or 0.1260,
since the curve is symmetrical about Z = 0.
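The Table 1 lookup can also be reproduced numerically.  The following is a
minimal Python sketch for the Z = 1.53 example, using only the standard
library (the helper name upper_tail is ours, not from any package discussed
in this document):

```python
import math

def upper_tail(z):
    """P(Z >= z) for the standard normal distribution;
    equivalent to looking up z in Table 1."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = 1.53
p_one_sided = upper_tail(z)      # one-sided significance level, about 0.0630
p_two_sided = 2 * p_one_sided    # two-sided level, about 0.1260
```

Doubling the one-sided value gives the two-sided level because the curve is
symmetrical about zero.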

     In summary, for  the  calculated Z of 1.53, we determined the one-sided
significance level as being 0.063 and the two-sided  significance  level  as
being 0.126.  Table 1 can also be used the  other  way around.  For a given
significance level  α of 0.025, for example, we wish to determine a value, Zc,
                                   18

-------
                               TABLE 1
      UPPER TAIL* PROBABILITIES FOR THE STANDARD NORMAL  DISTRIBUTION
  Z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09

 .0   .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
 .1   .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
 .2   .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
 .3   .3821  .3783  .3745  .3707  .3669  .3632  .3594  .3557  .3520  .3483
 .4   .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
 .5   .3085  .3050  .3015  .2981  .2946  .2912  .2877  .2843  .2810  .2776
 .6   .2743  .2709  .2676  .2643  .2611  .2578  .2546  .2514  .2483  .2451
 .7   .2420  .2389  .2358  .2327  .2296  .2266  .2236  .2206  .2177  .2148
 .8   .2119  .2090  .2061  .2033  .2005  .1977  .1949  .1922  .1894  .1867
 .9   .1841  .1814  .1788  .1762  .1736  .1711  .1685  .1660  .1635  .1611
1.0   .1587  .1562  .1539  .1515  .1492  .1469  .1446  .1423  .1401  .1379
1.1   .1357  .1335  .1314  .1292  .1271  .1251  .1230  .1210  .1190  .1170
1.2   .1151  .1131  .1112  .1093  .1075  .1056  .1038  .1020  .1003  .0985
1.3   .0968  .0951  .0934  .0918  .0901  .0885  .0869  .0853  .0838  .0823
1.4   .0808  .0793  .0778  .0764  .0749  .0735  .0721  .0708  .0694  .0681
1.5   .0668  .0655  .0643  .0630  .0618  .0606  .0594  .0582  .0571  .0559
1.6   .0548  .0537  .0526  .0516  .0505  .0495  .0485  .0475  .0465  .0455
1.7   .0446  .0436  .0427  .0418  .0409  .0401  .0392  .0384  .0375  .0367
1.8   .0359  .0351  .0344  .0336  .0329  .0322  .0314  .0307  .0301  .0294
1.9   .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
2.0   .0228  .0222  .0217  .0212  .0207  .0202  .0197  .0192  .0188  .0183
2.1   .0179  .0174  .0170  .0166  .0162  .0158  .0154  .0150  .0146  .0143
2.2   .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110
2.3   .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
2.4   .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
2.5   .0062  .0060  .0059  .0057  .0055  .0054  .0052  .0051  .0049  .0048
2.6   .0047  .0045  .0044  .0043  .0041  .0040  .0039  .0038  .0037  .0036
2.7   .0035  .0034  .0033  .0032  .0031  .0030  .0029  .0028  .0027  .0026
2.8   .0026  .0025  .0024  .0023  .0023  .0022  .0021  .0021  .0020  .0019
2.9   .0019  .0018  .0018  .0017  .0016  .0016  .0015  .0015  .0014  .0014
3.0   .0013  .0013  .0013  .0012  .0012  .0011  .0011  .0011  .0010  .0010
3.1   .0010  .0009  .0009  .0009  .0008  .0008  .0008  .0008  .0007  .0007
3.2   .0007  .0007  .0006  .0006  .0006  .0006  .0006  .0005  .0005  .0005
3.3   .0005  .0005  .0005  .0004  .0004  .0004  .0004  .0004  .0004  .0003
3.4   .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0002

*For a two-tailed significance level of α, multiply probability in
 table by 2.
Source:  Hollander and Wolfe (1973) p.  258
                                  19

-------
such that a  value  of Z obtained from our data smaller than Zc would not be
significant while  a  value  of Z equal to or  greater than Zc would be sig-
nificant.  Zc is therefore called the  critical  value  associated with the
significance level α.   In  the case of α = 0.025 and a one-sided test,  the
critical value would be 1.96 (row at 1.9 and  column at 0.06).   As before, the
probability of Z being less than or equal to  -1.96 is also 0.025.   We can
thus  say that the probability of Z  being between  -1.96 and +1.96  is
1 - 2(0.025) = 0.95,  or 95% of the values of  Z  are between -1.96 and +1.96.
Thus, Zc = 1.96  is the critical value for α =  0.025 (one-sided) or α = 0.05
(two-sided).   If we  want  to determine,  directly, the critical  value for α =
0.025 (two-sided), we  look in Table 1 for 0.0125  = α/2  and find Zc = 2.24
(row at 2.2 and column at 0.04).
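Statistics software inverts the table directly; the critical values above can
be reproduced with Python's standard library (critical_value is our own
helper, a sketch rather than a standard function):

```python
from statistics import NormalDist

def critical_value(alpha, two_sided=False):
    """Standard normal critical value Zc such that the upper-tail
    probability beyond Zc is alpha (one-sided) or alpha/2 (two-sided)."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

zc_one = critical_value(0.025)                   # about 1.96
zc_two = critical_value(0.025, two_sided=True)   # about 2.24
```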

     J.    Parametric or Distribution-Free?

     Suppose there are two years of monthly water quality data and we want to
test whether the water quality in one year differed from that in the other,
based on  the level of a certain pollutant.  The  classical method to test for
differences in the concentration  of the pollutant in the two years would be
to  compute a two-sample t-test.  This test,  like  all  statistical tests, is
based on a number  of assumptions about the underlying distribution from which
the  data were  drawn.   For the two-sample t-test,  these assumptions  are:
(1) the  errors  are  independent,  (2)  the  variances  are  the same,  (3) the
distribution is normal, and (4) the null hypothesis is that the two means are
the  same.  Procedures that specify the form of the underlying
distribution (e.g.,  normal)  up  to  a few  parameters are  referred  to as
parametric procedures.  Many of these are based on the normal distribution or
on sampling distributions  (e.g., t-, F-distribution) derived from it.

     An  alternative  approach  is to base the test statistic on less detailed
assumptions  about  the  underlying distribution.   For example,  if the null
hypothesis merely  specifies that  the distribution of the water quality mea-
sure was  the  same  in the two years and was continuous, then a nonparametric
or  distribution-free test  can be used.   The most  widely used such test is
                                   20

-------
the Wilcoxon-Mann-Whitney two-sample rank test (see Section V-C).   Because it
does not depend on the assumed form (e.g., normal) of the underlying distri-
bution, inferences based on the Wilcoxon rank sum test may be more reliable.
On the other  hand,  a distribution-free test will  typically have lower power
to detect specific alternatives than will the appropriate parametric test if
all the assumptions  of the parametric test hold.  However, in most cases the
difference is negligible.

     In order for  some  of the commonly applied parametric statistics to be
valid, the data should  be approximately normally distributed, or capable of
being transformed so that they become so.   A check for normality is discussed
in Section VI.  Unfortunately, water  quality data are  often  far from
being  normally  distributed.   There are two  drawbacks to transforming the
data.  One is that it may be difficult to select a suitable transformation.
The second is that the transformation may induce a scale that is difficult to
interpret, or might change some of the other assumptions.

     If the classical parametric  statistical  methods are not valid for the
data,  we  must rely  on  the less  familiar  nonparametric methods.   These
procedures require fewer assumptions, so they have wider applicability.   Most
commercial statistical computer program packages such as SAS, BMDP, and SPSS
include some  nonparametric methods  for data analysis, although these often
are  restricted  to hypothesis  testing applications.  (They  may indicate
whether or  not  a  trend  exists,  but  not provide  estimates or confidence
intervals of its magnitude).

     In addition to  the  assumption  of normality (or  another specific dis-
tribution),  statistical procedures—particularly  parametric procedures—are
based  on  a  number of other assumptions.   The most  common  of  these  are
constant variance, linearity of  a trend,  and  independence  of errors.   The
distribution-free procedures are less sensitive to violations of the assump-
tion of equal variances.   Further, the distribution-free procedures generally
deal only with  a  monotonicity  requirement for a trend rather than a specif-
ically linear one, so they may be better for testing.  However, procedures
such as regression can  easily  accommodate various forms of  a trend such as
                                   21

-------
polynomials in estimating  the  magnitude  of the trend.   Both types of proce-
dures assume  independence  of  errors and can be sensitive to correlation of
errors.   Correlated errors  call  for more complex time series analysis that
specifically incorporates the error structure.

     Many of  the methods described  in this document are nonparametric.  How-
ever, some of  the  more common parametric tests are also presented.   If the
data appear to be  normally distributed or can be transformed to  be so, the
user may prefer these more widely known tests.

     K.   Plotting

     Before using  any  statistical  method to test for trends, a recommended
first step is  to  plot the data.  Each point should be plotted against time
using an  appropriate scale  (e.g.,  month or year).   Figure 2, presented
earlier, is an example of such a plot.

     In general, graphs  can provide highly effective illustrations of water
quality trends, allowing the analyst to obtain  a much greater feeling for the
data.  Seasonal fluctuations or sudden changes, for example, may become quite
visually evident,  thereby  supporting the analyst in his decision of which
sequence of statistical  tests  to use.  By the  same token, extreme values can
be  easily  identified  and  then  investigated.   Water  quality parameters,
expressed as raw concentrations,  logarithms, loadings, or a water quality
index, can easily  be plotted against  time using STORET.  SAS also features
a plotting  procedure, though a less sophisticated one; for  the  reader
familiar with  SAS  GRAPH,  a series  of computer graphics is  available  if the
appropriate hardware is on hand.

     Another area  where  plotting plays  an important and instructive role is
in testing  for  normality of data.  A relatively simple plotting procedure
is available  to the  user when the data base is not  too  large, since the
plotting is done by hand.   All  one needs is probability paper.   The use of
probability paper is illustrated with an example in Section VI.
                                   22

-------
     L.   Organization of this Document

     The normal-theory or parametric procedures are discussed in Section IV,
while  the  distribution-free procedures  are  grouped  into  Section V.   In
Section IV, after methods to deseasonalize  data are presented, a  test for
long-term trend is  discussed,  then a test for step trend (change), followed
by a test  for  both  types of  trend.  Finally, time series analysis  is briefly
discussed in the last subsection.

     Section V is organized in a similar fashion.   After discussing tests for
randomness  in general  terms,  distribution-free  methods  applicable   to
non-seasonal data are  presented first for long-term trends and step trends
(Subsection B  and C,  respectively).   Next,  tests for  long-term  trends and
step trends in the  case of seasonal  data are discussed in Subsections  D and
E,  respectively.   The  section closes  with a  brief  discussion of  the
difficulties of dealing with the data when both types of trends are present.

     Figure 5  below presents a decision  tree  indicating how to determine
which  procedures are  appropriate  based  on the characteristics of  the  data
available to  the analyst.  Thus, one can  follow this as a road map and refer
to the  sections where  the methods  to answer each question on the diagram are
discussed.

     The first step  is to  determine whether the data are on an annual  basis
or were more  frequently recorded.   If they are more  frequent  than annual,
then seasonality is  an important consideration.   In general, one  must test
for  seasonality  if  the data are  on  a  quarterly or  monthly basis.   If
seasonality is found,  it must be  removed before a test for change or  trend
can be done.  Tests for seasonality can be based on runs tests, turning point
tests, serial  correlations,  or other general tests for randomness.  Methods
for removing seasonality,  or for accounting for it in the analysis, are given
in Sections IV and V, where appropriate.
                                   23

-------
    [Decision tree:  Start with the water quality data.  For annual data,
    test whether the data are normal; if yes, use the parametric procedures
    (t-test for change, linear regression for trend), and if not, the
    distribution-free procedures (rank sum test for change, Kendall's tau
    test for trend).  For monthly or quarterly data, first test for
    seasonality and deseasonalize if it is present, then test for
    normality; if the data are normal, use the parametric procedures
    (t-test for change, linear regression for trend), and if not, the
    distribution-free procedures (aligned rank sum test for change,
    seasonal Kendall's test for trend).]

                          Figure 5 - Decision tree.

-------
-------
that there is  no  seasonality (as with annual data)  the  test procedure is
based  on  Kendall's  tau.   Application  of  this  test  is presented  in
Section V-B.   If seasonality is present,  then a modified  version of Kendall's
test can be used.   It is presented in Section V-D.

     A  number  of  special  problem areas  are covered  in Section VI.  These
include how to handle missing observations; what to do with observations that
appear  strange  (outliers);  how  to  test for  normality;  how to deal with
situations, sometimes fairly  common,  where the parameter value of  interest
could not be  measured because it was less  than  the  detection limit of the
measurement technique; and how to make corrections  to the data for variations
in flow (especially important with concentration parameters).

     The document  concludes  with a selected  bibliography.   The  papers and
texts  listed  are  not  all  referenced directly, but provide additional
information on one or more of the tests discussed here.  For general coverage
of statistical  testing, the texts of  Snedecor and Cochran (1980) and Johnson
and Leone  (1977) are  suggested;  the books by  Siegel  (1956),  and by  Hollander
and Wolfe (1973),  provide useful presentations of many nonparametric methods.

     Also  included  in the  bibliography  are  references  to  several  common
packages of statistical computer programs.   Perhaps  the most versatile and
universally available to  the  States is SAS.  The SAS software is interfaced
to STORET and  may  be used by STORET users.  Documentation of how to use SAS
within  STORET  is available  through the EPA.  Where possible, references are
made  in this  document  to  the  available  procedures  in  SAS, for readers
familiar with or having access to SAS.

     Finally,  it should be emphasized that the examples used in the following
sections are  hypothetical  examples.  Their purpose  is to demonstrate the
different statistical  techniques and  how  to arrive at  a test statistic
necessary to perform  a  specific test.  The reader should not be led to be-
lieve that one  year of monthly data, for example, is  sufficient to apply a
given test, because of the possibility of seasonal  fluctuations.
                                   26

-------
IV.  PARAMETRIC PROCEDURES

     A.  Methods to Deseasonalize Data

     If  the  data exhibit a seasonal  cycle--typically  an annual cycle for
monthly  or quarterly  data—then  this seasonal effect must be removed or ac-
counted  for before testing for a long-term trend or a step trend.   The method
proposed here  is simple  and  straightforward.  Assume  that the data are
monthly  values.  To  remove  the possible seasonal  effect, calculate the mean
of the  observations  taken in  different years, but  during the same months.
That is, calculate the mean of the measurements in January, then the mean for
February, and so on for each of the 12 months.

     After calculating the 12 monthly means, subtract the monthly mean from
each observation taken during that month.  These differences will then have
any seasonal effects  removed—the differences are thus deseasonalized data.
If data were taken  on  a quarterly  basis,  one would  calculate the four
quarterly means, then subtract the mean  for  the  first  quarter  from each of
the first quarter observations, subtract the mean for the second quarter from
each of  the second quarter observations and so on.  The resulting differences
can be  used  subsequently for  testing for a long-term trend or a step trend.
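The subtraction of seasonal means described above can be sketched in Python.
The helper below is hypothetical (it is not part of SAS or STORET) and takes
(season, value) pairs, where season is a month number 1-12 or a quarter
number 1-4:

```python
from collections import defaultdict

def deseasonalize(observations):
    """Subtract each season's mean from the observations taken
    in that season, returning the deseasonalized differences."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for season, value in observations:
        totals[season] += value
        counts[season] += 1
    means = {s: totals[s] / counts[s] for s in totals}
    return [value - means[season] for season, value in observations]
```

The returned differences carry no seasonal effect and can then be tested for
a long-term trend or a step trend.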

     In  the  parametric framework,  deseasonalizing can also  be accomplished
by using multiple  regression  with indicator  variables  for the months.  Using
multiple regression,  as   described  in Subsection D  below,  this method of
seasonal adjustment  can  be performed at  the  same  time  that a  regression  line
is fitted to the data.

     Many other approaches  to deseasonalize  data exist.  If the  seasonal
pattern  is regular  it may be  modeled with a  sine  or cosine function.  Moving
averages can be used, or differences (of order 12  for  monthly  data)  can be
used.    Time  series  models may include rather complicated methods for desea-
sonalizing the  data.  However, the method described above should  be adequate
for the  water quality data.   It has the advantage of being easy to understand
                                   27

-------
and apply, and of providing natural estimates of the monthly effects via the
monthly means.

     B.  Regression Analysis

     A parametric procedure commonly  used to test for trends is regression
analysis.   As stated  earlier,  however,  water quality data often do not meet
the underlying assumptions such as constant variance and normally distributed
error terms,  so  regression  analysis should be used with great caution.  An
analysis of the  residuals (i.e.,  differences between the  observed and the
regression-predicted  values of the water quality  parameter) is therefore
recommended.

     With this proviso  in mind,  the following example will  illustrate the
method.  Suppose an  annual  water  quality index were available for a 7-year
period.
     Year:      1977   1978   1979   1980   1981   1982   1983
     Year No.:     1      2      3      4      5      6      7
     WQI:         46     52     42     44     39     45     40
We can  use  the years themselves in the numerical calculations which follow,
but it  is much easier and totally equivalent to  just number them from 1 to  7
and use  the  numbers  instead.  Note that if a year is missing, then the other
years should  be numbered accordingly (i.e., the  corresponding number will be
missing also).

     First,  it  is  often  a  good idea to plot the data.   Figure 6 shows these
sample data as points on a plot of WQI  versus year,  along with two calculated
points described subsequently.

     The calculations with the data to determine the regression line, and to
test it  for  statistical  significance  (i.e.,  whether the slope and intercept
of the  calculated  line  are  statistically different from zero), can be done

                                   28

-------
     [Figure:  plot of Water Quality Index (vertical axis, 10 to 60)
     versus year (1977 to 1983), showing the seven data points and the
     fitted regression line through two calculated points.]

                Figure 6 - Sample Linear Regression  Line
                                   29

-------
manually.  However, the formulas are fairly lengthy and the computations are
quite tedious  and therefore  error-prone.   These calculations  are  almost
always  done  on  a computer  or hand  calculator.   Most scientific  hand
calculators,  in  fact,  have  the formulas built in, so we only have to enter
the data, pairs  of year  and WQI, into the calculator and then read out the
answers.
     The equation for the regression line is:

                                 Y = a + bX

where X is the year number (or year), Y is the WQI;  a is called the intercept
of the line,  and b is called the slope.   Upon entering all  the X and Y values
(the exact entry procedure  depends  upon the specific hand calculator being
used), we read out the values of a and b.

     Using the sample  data,  we get a = 49 and b = -1.25.  Thus, the fitted
regression line  is:

                              Y = 49 - 1.25X .
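The intercept and slope can, of course, also be computed directly; the
following is a short Python sketch of the usual least-squares formulas
(fit_line is our own helper, not a SAS procedure):

```python
def fit_line(xs, ys):
    """Least-squares fit of the line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx        # slope
    a = my - b * mx      # intercept
    return a, b

years = [1, 2, 3, 4, 5, 6, 7]
wqi = [46, 52, 42, 44, 39, 45, 40]
a, b = fit_line(years, wqi)    # a = 49.0, b = -1.25
```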

The regression line  can  be  drawn by plotting two arbitrary points from this
regression equation and  connecting  them with a straight line.   For example,
choosing X = 1,

                         Y = 49 - 1.25(1) = 47.75
and for X = 6,
                         Y = 49 - 1.25(6) = 41.5 .

These points are also  plotted  in Figure 6  (as small o's).  The straight line
through these two points  is  the calculated regression line.
                                   30

-------
     The fitted regression  equation  can be used to predict the value of the
water quality  index  Y  which corresponds to a given  value of X.  The dif-
ference between the  observed  value of Y and the predicted value of Y for a
given X is  called the  residual at this  point X.   In our example, we have

  Year No.:             1       2       3      4       5       6       7
  Observed WQI:       46      52      42      44      39      45      40
  Predicted WQI:       47.75   46.50   45.25   44.00   42.75   41.50   40.25
  Residual:            -1.75    5.50   -3.25   0      -3.75    3.50   -0.25

The  residual  measures  the discrepancy  between  the observed value and  the
value obtained from  the  regression line.   If the points were perfectly  lined
up,  each residual  would  be zero and we would have a perfect fit.   Note that
the  residuals  always add up to zero  (rounding  off aside).  It  is these re-
siduals that  should  be analyzed to test whether they are normally distrib-
uted.  The  normal distribution  assumption can  be graphically  checked by
either using  probability paper as  explained in  Section VI-C or  by performing
a test of  fit using a computer program such as the  UNIVARIATE  procedure in
SAS.
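The predicted values and residuals above follow directly from the fitted
equation; a brief Python sketch:

```python
wqi = [46, 52, 42, 44, 39, 45, 40]
predicted = [49 - 1.25 * x for x in range(1, 8)]     # fitted equation
residuals = [y - p for y, p in zip(wqi, predicted)]
# residuals: [-1.75, 5.5, -3.25, 0.0, -3.75, 3.5, -0.25]; their sum is zero
```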

     In this  example,  the WQI appears to  have  a decreasing trend.  But, we
must then ask, is the slope of the regression line statistically significant?
In  other words,  is the decrease in the water quality index, by an amount of
1.25 per year, significantly  different  from zero?  To test for  this, compute
the  ratio  of the  slope, b,  to  its  standard deviation and  test  using a
t-table.   This testing procedure is automatically done in SAS as shown in the
appendix.   When using  a  hand calculator, however, we use an equivalent test
via  r, the sample correlation coefficient.   The  Student's variable t with n-2
degrees of freedom is

                          t = r √(n-2) / √(1 - r²)
                                   31

-------
In the example  data,  r = -0.619 and n = 7, and we obtain t = -1.762.  Using
Table 2 below and  reading  at the intersection of the line headed by 5 (7-2
degrees of freedom) and the column headed 0.050 (two-sided significance level
of 5%), we  read a critical value of 2.571.  Since t = -1.762 falls between
-2.571 and  +2.571, we  conclude that the correlation coefficient,  r,  and
consequently the slope, b, are not statistically different from zero,  or that
the apparent downward trend is not statistically significant.
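The same statistic can be computed in Python (corr_t is a hypothetical
helper; note that carrying full precision gives t = -1.765 rather than the
-1.762 obtained from the rounded r):

```python
import math

def corr_t(xs, ys):
    """Sample correlation coefficient r and the Student's t
    statistic (n-2 degrees of freedom) for testing the slope."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return r, t

r, t = corr_t([1, 2, 3, 4, 5, 6, 7], [46, 52, 42, 44, 39, 45, 40])
# r is about -0.619; |t| < 2.571, so the trend is not significant at 5%
```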

     More details on regression analysis are available in textbooks by Draper
and Smith (1981),  first chapter, and Chatterjee and  Price  (1977)  who also
provide extended discussions on the analysis of residuals.

     Regression  analysis  can be  performed  with SAS using  one  of  several
procedures  with their  appropriate  options.   One SAS procedure would be
PROC REG which  features an extensive output (see SAS User's Guide:  Statis-
tics, p.  40).  An example with output is included in the appendix.

     C.  Student's T-Test

     Student's  t-test  is  probably the  most widely used  parametric test to
compare two sets of data to determine whether the populations from which they
come have means that differ significantly from each other.   This test would
commonly be  used to  determine if a  change has occurred, as  reflected in data
obtained before  and  after some event such as the coming on-line of a sewage
treatment plant.   Student's  t-test, being a parametric procedure,  makes as-
sumptions about  the  underlying  distribution of the population from which  the
data are a  sample.   The basic assumption required to formally  develop this
test procedure  is that  the  population be  normally  distributed; however,
moderate departures  from normality will not seriously  affect  the  results.
When using  a  t-test,  one should assure that this  basic assumption is not
violated.   Mathematical  transformations of the  data--e.g.,  log transform,
exponential  transform,  etc.--can often  be helpful  in order  to  arrive at a
normally distributed sample.   In comparing two samples,  an additional  require-
ment besides  normality is  that the two distributions have  equal variances.
                                   32

-------
                           TABLE 2

       TWO-TAILED*  SIGNIFICANCE  LEVELS  OF STUDENT'S T

 Degrees
   of           Probability of a Larger Value, Sign Ignored
 Freedom
  (df)    0.500  0.400  0.200  0.100  0.050  0.025  0.010  0.005  0.001

    1     1.000  1.376  3.078  6.314 12.706 25.452 63.657
    2     0.816  1.061  1.886  2.920  4.303  6.205  9.925 14.089 31.598
    3      .765   .978  1.638  2.353  3.182  4.176  5.841  7.453 12.941
    4      .741   .941  1.533  2.132  2.776  3.495  4.604  5.598  8.610
    5      .727   .920  1.476  2.015  2.571  3.163  4.032  4.773  6.859

    6      .718   .906  1.440  1.943  2.447  2.969  3.707  4.317  5.959
    7      .711   .896  1.415  1.895  2.365  2.841  3.499  4.029  5.405
    8      .706   .889  1.397  1.860  2.306  2.752  3.355  3.832  5.041
    9      .703   .883  1.383  1.833  2.262  2.685  3.250  3.690  4.781
   10      .700   .879  1.372  1.812  2.228  2.634  3.169  3.581  4.587

   11      .697   .876  1.363  1.796  2.201  2.593  3.106  3.497  4.437
   12      .695   .873  1.356  1.782  2.179  2.560  3.055  3.428  4.318
   13      .694   .870  1.350  1.771  2.160  2.533  3.012  3.372  4.221
   14      .692   .868  1.345  1.761  2.145  2.510  2.977  3.326  4.140
   15      .691   .866  1.341  1.753  2.131  2.490  2.947  3.286  4.073
   16      .690   .865  1.337  1.746  2.120  2.473  2.921  3.252  4.015
   17      .689   .863  1.333  1.740  2.110  2.458  2.898  3.222  3.965
   18      .688   .862  1.330  1.734  2.101  2.445  2.878  3.197  3.922
   19      .688   .861  1.328  1.729  2.093  2.433  2.861  3.174  3.883
   20      .687   .860  1.325  1.725  2.086  2.423  2.845  3.153  3.850

   21      .686   .859  1.323  1.721  2.080  2.414  2.831  3.135  3.819
   22      .686   .858  1.321  1.717  2.074  2.406  2.819  3.119  3.792
   23      .685   .858  1.319  1.714  2.069  2.398  2.807  3.104  3.767
   24      .685   .857  1.318  1.711  2.064  2.391  2.797  3.090  3.745
   25      .684   .856  1.316  1.708  2.060  2.385  2.787  3.078  3.725

   26      .684   .856  1.315  1.706  2.056  2.379  2.779  3.067  3.707
   27      .684   .855  1.314  1.703  2.052  2.373  2.771  3.056  3.690
   28      .683   .855  1.313  1.701  2.048  2.368  2.763  3.047  3.674
   29      .683   .854  1.311  1.699  2.045  2.364  2.756  3.038  3.659
   30      .683   .854  1.310  1.697  2.042  2.360  2.750  3.030  3.646

   35      .682   .852  1.306  1.690  2.030  2.342  2.724  2.996  3.591
   40      .681   .851  1.303  1.684  2.021  2.329  2.704  2.971  3.551
   45      .680   .850  1.301  1.680  2.014  2.319  2.690  2.952  3.520
   50      .680   .849  1.299  1.676  2.008  2.310  2.678  2.937  3.496
   55      .679   .849  1.297  1.673  2.004  2.304  2.669  2.925  3.476

   60      .679   .848  1.296  1.671  2.000  2.299  2.660  2.915  3.460
   70      .678   .847  1.294  1.667  1.994  2.290  2.648  2.899  3.435
   80      .678   .847  1.293  1.665  1.989  2.284  2.638  2.887  3.416
   90      .678   .846  1.291  1.662  1.986  2.279  2.631  2.878  3.402
  100      .677   .846  1.290  1.661  1.982  2.276  2.625  2.871  3.390
  120      .677   .845  1.289  1.658  1.980  2.270  2.617  2.860  3.373
    ∞     .6745  .8416 1.2816 1.6448 1.9600 2.2414 2.5758 2.8070 3.2905

*For a one-tailed significance level α, read in column 2α.
 df = n-1
Source:  Snedecor and Cochran (1980) p. 469

                               33

-------
This means that the  distributions  of the two populations are  identical  in

shape, although they may differ in  location (in  other words,  only their means

may differ).


Procedure


     The following is a  series of  18 concentration  measurements  for  total

chromium, 8  before and  10  after implementation  of  a pollution  control

measure.
                      "Before"                 "After"
                   Concentrations          Concentrations
                         99                      59
                        111                      99
                         74                      82
                        123                      51
                         71                      48
                         75                      39
                         59                      42
                         85                      42
                                                 47
                                                 50
     First it  is  necessary to calculate the mean and standard deviation of

each of  the  data  sets.   The mean (or  average)  is  just the sum of all the

values divided by the number of values, n.  Thus, for the "before" data, the

mean, mB, is:


                  mB = (99 + 111 + ... + 85)/8 = 87.1 µg/L


The  standard deviation  involves  using  the sum of the squares of each of the

values.  Specifically:

     SB = √{ [nB(99² + 111² + ... + 85²) - (99 + 111 + ... + 85)²] / [nB(nB - 1)] }
                                   34

-------
where nB is the number of "before" data points (8).  Carrying out all the
calculations gives:

                            SB = √481.8 = 22.0

Thus, we can compute:
     "Before"   mB = 87.1 µg/L;   SB = 22.0 µg/L;   SB² = 481.8;   nB = 8
     "After"    mA = 55.9 µg/L;   SA = 19.5 µg/L;   SA² = 380.1;   nA = 10.

The t-test for a step trend in this data set is simply a method of estimating
whether the means mB and mA of the two partitions ("before" and "after") of
the data set differ significantly at a chosen level of significance α.  The
test statistic t is:

                      t = (mB - mA) / (Sp √(1/nB + 1/nA))

where Sp, the pooled standard deviation, is computed as:

                Sp = √{ [(nB - 1)SB² + (nA - 1)SA²] / (nB + nA - 2) }

assuming SB² and SA² are not statistically different.  To test whether the "be-
fore" concentrations are on the average higher than the "after" concentrations
(i.e., one-sided test) at the α-level of significance, we compare the computed
t with tabulated values of the Student's t distribution at probability level
(1-α) with (nB + nA - 2) degrees of freedom.

          In the example,

          Sp² = [7(481.8) + 9(380.1)] / 16 = 424.6,  so that  Sp = 20.6

                                   35

-------
thus:

               t = (87.1 - 55.9) / (20.6 √(1/8 + 1/10)) = 3.18
The critical t at a = 0.05 (5%) with (8 + 10 - 2) = 16 degrees of freedom for
a one-sided test  is t  = 1.746  (see Table 2 above).  Since the computed t of
3.18 is  larger, one concludes that, indeed, the "before" concentrations were
higher on the  average  than the "after" concentrations,  or  that  there is a
significant decreasing step  trend  in  the data at  the  95% confidence level
when the new treatment facility went into operation.
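For readers who wish to verify these figures by machine, the whole calculation can be sketched in a few lines of Python (an illustration by the editor, not part of the report's SAS-based workflow; variable names are our own):

```python
import math

# Total chromium concentrations (µg/L) from the worked example.
before = [99, 111, 74, 123, 71, 75, 59, 85]
after = [59, 99, 82, 51, 48, 39, 42, 42, 47, 50]

def mean(x):
    return sum(x) / len(x)

def var(x):
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - 1)

nB, nA = len(before), len(after)
mB, mA = mean(before), mean(after)      # 87.1 and 55.9
s2B, s2A = var(before), var(after)      # 481.8 and 380.1

# Pooled variance, assuming the two population variances are equal.
s2p = ((nB - 1) * s2B + (nA - 1) * s2A) / (nB + nA - 2)

# Test statistic for the step trend.
t = (mB - mA) / (math.sqrt(s2p) * math.sqrt(1 / nB + 1 / nA))
print(f"t = {t:.2f}")  # about 3.19; the text, rounding intermediate values, reports 3.18
```

Carrying more decimal places than the hand calculation gives t ≈ 3.19 rather than 3.18; the conclusion is unchanged.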

     In the above example, it was assumed (without proof) that the variances,
SB² and SA², were not statistically different.  This assumption can (and
should) be checked by using an F-test.  Calculate:


                    F = SB²/SA² = 481.8/380.1 = 1.27


Note that due to tabulation restrictions, F is always computed as the larger
variance over the smaller.  That is, if SA² were greater than SB², then
F = SA²/SB².

     If  the population variances  are  the same, one would expect F =  1.  To
test whether the  calculated  F  is statistically greater than 1, an F table  is
used.  Using Table 3 below, which is applicable at the α = 0.05 level of sig-
nificance, look up the critical F value in the column headed by 7 (= nB - 1)
and the row labeled 9 (= nA - 1).  The critical F value in this example is 3.29.
If the computed F were greater than 3.29, then we would say with 95% confi-
dence that the two variances were different, and so the Student's t-test
could not be used.  In our case F is less than the critical value; we there-
fore cannot confidently say that they are different, and so we accept the
assumption of equal variances in the "before" and "after" groups.

     In  the case  of unequal  variances, when  the Student's  t-test cannot be
used,  only an  approximate  t-test such  as the  Behrens-Fisher test (Snedecor
and Cochran (1980) p.  97) can be used to compare the two means.
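The variance-ratio check can be sketched in the same way (an illustration; the sample variances and the tabulated critical value 3.29 are those of the example):

```python
# Variance-ratio check of the equal-variance assumption.
s2B, s2A = 481.8, 380.1
F = max(s2B, s2A) / min(s2B, s2A)   # always the larger variance over the smaller
F_critical = 3.29                   # 5% point of F for 7 and 9 degrees of freedom
print(f"F = {F:.2f}, equal variances plausible: {F < F_critical}")
```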

                                   36

-------
                                 TABLE 3
     THE 5% (ROMAN TYPE) AND 1% (BOLDFACE TYPE) POINTS FOR THE
                         DISTRIBUTION OF F

[Table 3:  critical values of F for α = 0.05 (roman type) and α = 0.01
(boldface type), tabulated by degrees of freedom in the numerator (n1,
across the top) and in the denominator (n2, down the side).  For example,
the 5% critical F for n1 = 7 and n2 = 9 is 3.29.]

        n1:  degrees of freedom in Numerator of F
        n2:  degrees of freedom in Denominator of F
        Source:  Snedecor and Cochran (1980) p. 480

-------
     A final note:  many  hand calculators will automatically compute means
and standard deviations, making the above calculations quite easy.   Also,  the
t-test can be performed using SAS.   The procedure PROC TTEST may be used (see
SAS User's Guide:   Statistics,  p.  217).   An example output is shown in the
appendix.

     D.   Trend and Change

     Over a  long period  of  time,  a  situation could arise where  both  a
long-term trend and a step trend due to the implementation of a new treatment
facility might be  present in the series  of data.   In such a case, the  two
methods described  above (regression analysis and t-test procedure), would be
combined to test for both types of trends.  Practically,  this is done by per-
forming a multiple  regression analysis where the dependent variable is the
water quality parameter of interest,  one independent variable  is  the  time
(e.g., month)  and the  second independent variable is a  "dummy" variable
taking the  value 0  before the known  change and  1 after  the change.   In
addition, if  a  seasonal effect is suspected (this  might  be detected when
plotting the data),  then  the data need  to be  deseasonalized by  introducing
12 indicator variables.  The first one would take the value 1 for January and
0 otherwise; the  second would be 1 for  February and 0 otherwise,  etc.  The
regression equation can be written as:

          Y = a1X1 + a2X2 + ... + a12X12 + bC + cT

where     ai is the effect of month i, i = 1, ..., 12, and Xi is the
            corresponding monthly indicator variable,
          b is the change effect (C is the 0/1 change variable), and
          c is the slope for the long-term trend (T is time).
Algebraically, the above equation is equivalent to

          Y - a1X1 - a2X2 - ... - a12X12 = bC + cT.
The left-hand  side  of this equation is just the deseasonalized data, while
the right-hand  side  is the contribution due to change and trend.  The  next
                                   38

-------
step is to estimate all 14 parameters (a1, ..., a12, b, and c) by the least
squares method and to test whether b and c (the coefficients for change and
trend, respectively) are significantly different from zero.

     A complete  example,  based  on  the data plotted  in Figure 1 of Sec-
tion III, is shown in the appendix using the multiple regression procedure of
SAS.
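As a sketch of the approach outside SAS, the following Python fragment fits the 14-parameter model by least squares.  The monthly series is synthetic (noise-free, with an assumed seasonal pattern, trend slope -0.2, and step change -8 after month 24); it is not the Figure 1 data:

```python
import numpy as np

# Hypothetical noise-free monthly series: a seasonal pattern, a downward
# long-term trend (slope -0.2), and a step change of -8 after month 24.
n = 48
T = np.arange(1, n + 1)                  # time (month index)
month = (T - 1) % 12                     # calendar month, 0..11
C = (T > 24).astype(float)               # change dummy: 0 before, 1 after
y = 50 + 5 * np.sin(2 * np.pi * month / 12) - 0.2 * T - 8 * C

# Design matrix: 12 monthly indicator variables (X1..X12), the change
# dummy C, and time T -- the 14 parameters a1..a12, b, c of the text.
X = np.zeros((n, 14))
X[np.arange(n), month] = 1.0
X[:, 12] = C
X[:, 13] = T
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

b, c = coef[12], coef[13]
print(f"change effect b = {b:.2f}, trend slope c = {c:.2f}")  # -8.00 and -0.20
```

With real data one would also examine the standard errors of b and c to test whether they differ significantly from zero.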

     E.  Time Series Analysis

     Time series  analysis is  a  set of parametric procedures  that can be
applied to  a  series of ordered observations of a quantitative measure, such
as  a  water  quality  indicator,  taken  at  regular points in time.  Although  not
essential,  it  is  common for the points to  be  equally spaced in time,  for
example,  monthly.  The objective in time series analysis is  to determine from
the set of data the pattern of change over time (e.g., time trends, sea-
sonality, cyclical variations, etc.).  The various measured patterns are
extracted one  by  one until the remaining variation  in  the  data is purely
random.  When this has been done, all the meaningful  information contained in
the original  data has been "captured,"  and  the random component that remains
is, by definition, worthless for forecasting.   In general,  the various trends
and patterns  can  then  be used to forecast  probable  future  behavior of the
series,  and confidence intervals can  be  computed for future projections.

     Various  analytical  methods  have  been  developed to decompose  a time
series into trend, seasonal, change, and irregular components.   These methods
are fairly complex,  nearly always require special computer programs,  and thus
are beyond  the scope of these  guidelines.   The  interested reader is  referred
to  standard textbooks on  time  series such as Box  and Jenkins (1970), Kendall
and Stuart  (1966),  and  Glass et  al.  (1975)  or to  the publications by Box  and
Tiao (1975) and Schlicht (1981).

     Time series analysis  is  not suitable  for use with many water quality
data bases because of missing data, values reported as below detection limits,
and changing  laboratory techniques  (see van Belle and Hughes (1982)).   Also,
                                   39

-------
time series methods generally require several years of monthly observations.
For these reasons,  the  methods  suggested here are generally more practical
for analyzing water quality data.
                                   40

-------
V.  DISTRIBUTION-FREE METHODS

     A.  Runs Tests for Randomness

     Given a series of measurements of a water quality parameter or indicator
derived therefrom, the  question  might be asked if the data vary in a random
manner or  if they  indicate a long-term trend or a seasonal or other periodic
fluctuation.  An easy  test  to  detect nonrandom components is the runs test
illustrated in the following  simple example.   For a  given  constituent,  an
average monthly water  quality  index (WQI) has been computed and categorized
as either bad (B) or good (G).   The figures are:
Month:  1   2   3   4   5   6   7   8   9   10   11   12
WQI:    G   G   G / B   B   B   B   B / G   G    G    G

In  statistical  terms,  the  above  question can  be formulated  as  a null
hypothesis:  the G's and B's occur in random order; versus an alternative:
the order of the G's and B's deviates from randomness.

     Each cluster of like observations is called a run.  Thus, there are
3 runs in the series of data above.  Let n1 = 5 be the number of B's, and
n2 = 7 the number of G's (n1 denotes the smaller of the two numbers).  If
there are too many or too few runs, then the assumption that the sample is
random is probably not correct.  If very few runs occur, a time trend or some
bunching due to lack of independence is suggested.  If a great many runs
occur, systematic short-period cyclical fluctuations seem to be influencing
the WQI data.  Tables have been developed for n1 and n2 up to 20 showing proba-
bility levels for the runs test, and can be found in Langley (1971).  An example
showing 5% probability levels is included in Table 4.  For n1 = 5 and n2 = 7
(see Table 4) we use the table as follows:  reject the null hypothesis that the
sample is a random ordering of G's and B's if the observed number of runs is
equal to or less than the smaller number in the table (i.e., 3), or is equal to
or greater than the larger number in the table (i.e., 11).  Since we have a
series with three runs, we conclude that the fluctuations are not random;

                                  41

-------
                     TABLE 4
     TABLES SHOWING 5% LEVELS FOR RUNS TEST

[Table 4:  for each pair (n1, n2), with n1 = 2 to 10 and n2 up to 20, the
table gives the smaller and larger critical numbers of runs at the 5%
level; the ordering is accepted as random only if the observed number of
runs falls between the two values.  A dash (-) means that a 5% level
cannot be reached for that pair.  For n1 = 5 and n2 = 7, the tabulated
critical values are 3 and 11.]

   Source:  Langley (1971) p. 325



                        42

-------
that is, there  is  only a small probability of obtaining three or fewer runs
(less than once  in 20 times) if the sample was actually random.   Should the
observed number  of runs  fall between the two values given in the table, then
we can  accept the  sample as being random.  If we come across a dash (-) in
the table, it means that a 5% probability  level  cannot be  reached in the
particular circumstances, regardless of the number of runs.

     If the runs test is used for data sets where n2 is greater than 20, then
rather than using these special types of tables, a different approach is used
where a Z-statistic is computed.  The formula is:

          Z = |r - (2n1n2/N + 1)| / √[2n1n2(2n1n2 - N) / (N²(N - 1))]

where N is simply (n1 + n2), and r is the number of runs (3 in the above
example).
     The notation, | |, means use the absolute value of what is within--that
is, make the resulting number positive.  For example, if n1 = 6, n2 = 24, and
there are 10 runs, then the numerator equals:

               |10 - (2 · 6 · 24/30 + 1)| = |10 - 10.6| = 0.6

The denominator in Z equals:

          √[2 · 6 · 24 (2 · 6 · 24 - 30) / (30² · 29)] = √2.85 = 1.69


Thus, Z = 0.6/1.69 = 0.36.


     Reference to Table 1 (page 19) shows that to reject the hypothesis at a
two-sided significance level of 0.05 requires a Z-value of 1.96 or greater.
Since we obtained Z = 0.36, which is less than the critical value of 1.96, we
cannot reject the null hypothesis of the sample being random.
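The large-sample calculation can be sketched as follows (an illustration in Python; the two-category sequence is constructed so that n1 = 6, n2 = 24, with 10 runs, as in the example):

```python
import math

def runs_z(seq):
    # Number of runs = number of places where the category changes, plus one.
    r = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    n1, n2 = sorted(seq.count(c) for c in set(seq))
    N = n1 + n2
    mu = 2 * n1 * n2 / N + 1
    sigma = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - N) / (N ** 2 * (N - 1)))
    return r, abs(r - mu) / sigma

# A sequence with n1 = 6 B's, n2 = 24 G's, and 10 runs, as in the text.
seq = "BB" + "GGGG" + "B" + "GGGGG" + "B" + "GGGGG" + "B" + "GGGGG" + "B" + "GGGGG"
r, z = runs_z(seq)
print(r, round(z, 2))   # 10 runs, Z = 0.36
```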

                                  43

-------
     Note:  It should be emphasized that this latter approach,  which involves
computing a Z-statistic  and  comparing it with the tabulated standard normal
deviates  in Table 1,  is  only to be used with larger data sets, where n2 ex-
ceeds 20.

     The  above example was  based on two kinds  of observations (good, bad)
only.  However, the  runs test applies to any kind of data that can be cate-
gorized  into two groups.   If the data are quantitative measurements, first
find the  median.  Then  assign a plus sign whenever the observation is above
the median, and a minus  sign whenever the observation is below the median.
Then proceed as above, using the plus/minus categories in the same way as the
good/bad categories  were used.

     If  a trend is  the  alternative  to  randomness  that  is of particular
importance, then a  runs  test appropriate to  this alternative  can be con-
structed  as  follows.  Count  as  a plus each  observation  that  exceeds the
preceding one;  count as  a  minus each observation  that  is less than the
preceding one.  Then use these plus/minus categories as before.  By changing
the definition of the two categories one can arrive at different runs tests
for different alternatives to randomness.

     Many types of  data, however, have more than two categories.   For exam-
ple, a water  quality index could be  partitioned  into  "very good," "good,"
"fair,"  "poor," and  "very poor"  categories.   The above mentioned  runs  test
for two  categories  has  been generalized to cases with  any number of cate-
gories.   The  appropriate test statistics are derived Z-statistics and can
therefore be tested by using the tabulated standard normal deviates.  Details
are given in Wallis and Roberts (1956), Chapter 18.

     The  runs  test  can  detect a wide variety of departures from randomness.
However,  if a  specific  type of departure  is  important,  a test designed to
detect that type of  departure will be more likely to detect it.   With water
quality  data taken  over  time there are three particular types  of departures
from randomness that are of interest.  One is  a  seasonal effect related to
                                  44

-------
different amounts of precipitation or weather changes over the year.  A more
interesting effect  is  a  long-term trend in which the water quality level  is
gradually improving (or  worsening)  over time.   Finally,  a third is a sudden
change in water  quality  associated  with a discrete event—opening of a new
factory  or  installation  of a  new  treatment  plant.  Of  these three,
seasonality is  generally a  nuisance.   That is,  one is  not specifically
interested  in  it but  rather must adjust for it before  the  other  effects
(long-term trend or step trend) can be  tested for.  Specific tests for these
types of trend are presented in the  following subsections.

     B.   Kendall's Tau Test

     Kendall's tau  is  a  rank correlation coefficient.  A distribution-free
test based  on  Kendall's  statistic is  commonly used  to compare  one set of
numerical data with another to see  if  they  tend to  "track" together.  In
water quality work, a  series of  readings of some  parameter can be tested
against  time  (e.g., the  series of months over which  the parameter was ob-
tained)  to  see  if  it  has any  generally increasing or decreasing tendency.
This tendency does  not need to be linear (i.e., straight line).  To  illus-
trate  the  method,  assume one  has the  following 12  monthly, average  water
quality  indices (WQIs) at a given station:
Month:    1    2    3    4    5    6    7    8    9   10   11   12
WQI:     21    3    5    8   21   48   37   39   26   16   35    7

Statistically speaking, one  may  wish to test the null hypothesis that there
is  no  trend  in  the data,  i.e., months and WQI values are unrelated, against
the alternative that  there  is a trend  in the data,  i.e., the two variables
are related.

     The first  step  is  to rank the months and the WQIs in order from lowest
to  highest.  Since the  months are already in order, it is only necessary to
rank the WQIs.  The  smallest is 3  (in  month  2),  so it is ranked number 1.
Similarly, the second WQI  in rank is 5, then 7,  etc.  Note that

                                  45

-------
the value 21 appears twice, so the WQIs  for months 1 and 5 are "tied."  This
is a  common  situation  with this type of data, and is resolved by averaging
the ranks.  In  this  case,  the two WQIs  in question share ranks 6 and 7,  so
each is given the average value, (6+7)/2 = 6.5.  The final  rankings are tabu-
lated below.
Month:    1    2    3    4    5    6    7    8    9   10   11   12
WQI:     21    3    5    8   21   48   37   39   26   16   35    7
Rank:     6.5  1    2    4    6.5 12   10   11    8    5    9    3
k+:       -    0    1    2    3    5    5    6    5    3    7    2
k-:       -    1    1    1    0    0    1    1    3    6    3    9
     The test  involves determining the extent  to which  the  set of WQI  values
is  ordered  in the  same  way  as  the  months.   The  following explains the
procedure.

     Take each WQI rank and count how many of the ranks to the left of it are
smaller; this  gives  the  k+ line.  Then sum up the 11 k+ values;  this yields
K+, the  number of concordant pairs (i.e*.,  pairs ordered  in the  same way  as
the months).   Then  repeat,  but count how many of  the values  to  the  left  of
each WQI rank are greater; this gives 11 k- values.  The sum of these, K-, is
the number  of  discordant  pairs.   The tie at  (21,21)  is disregarded in the
counts of K+ and K-.  Kendall's tau is then computed as:

                    tau = [(K+) - (K-)] / [n(n-1)/2]

If there were no ties and concordant pairs only, then K+ = n(n-1)/2, K- = 0,
and tau = 1; if there were discordant pairs only and no ties, then K+ = 0,
K- = n(n-1)/2, and tau = -1.  If K+ = K-, then tau = 0.
                                  46

-------
     In our example,

          K+ = (0 + 1 + 2 + ... + 3 + 7 + 2) = 39,
          K- = (1 + 1 + ... + 3 + 9) = 26,  and
          tau = (39 - 26)/(12 · 11/2) = 13/66 = 0.197.
     For small sample sizes, n, the significance of tau is tested by means of
tabulations of values of Kendall's K = (K+)-(K-), rather than of tau itself.
There  is  a series  of  tables  for  sample sizes ranging from n = 4  to  40
(Hollander and Wolfe, 1973).   Table  5  shows a sample of these tabulations.
For larger values  of n  a normal approximation, discussed  subsequently,  is
used.

     Our example yields  a K of (39-26)  = 13.  In Table 5,  under Column  n = 12
(number of observations), we read 0.230 at x = 12 and 0.190 at x = 14.   Thus,
the one-tailed probability  associated  with K = 13 is about 0.21 (midway be-
tween  0.230  and  0.190);  the two-tailed  significance  level  associated  with
K = 13  is  thus 2 •  0.21  = 0.42.  If we had initially chosen a desired  level
of significance  of  5% (0.05),  we would conclude that there  is no significant
trend in the data at this level since 0.42 is not small enough (it is greater
than 0.05).

     We could also use Table 5 to determine the 5% critical value for n = 12.
In Column n = 12 we read down to find 0.022 (closest probability to
0.05/2 = 0.025), then read across to x = 30.  The probability of 0.031 in the
same column yields x = 28.  The critical K associated with an approximate
two-sided 5% significance level is therefore between 28 and 30, so one could
use 29 (or -29 if K were negative).  Since K of 13 lies between -29 and +29,
we come to the same conclusion as above, i.e., since K is not big enough, nor
small enough, there is no significant trend.  Note that if there are no ties,
K can take on only even values.  This is why there is no entry for K = 29 in
the table.
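The counting procedure, and the large-sample approximation discussed below, can be sketched as follows (an illustration in Python using the 12 WQI values of the example):

```python
import math

# Kendall's K and tau for the 12 monthly WQI values; a pair is concordant
# if the later value is larger, discordant if smaller (ties count in neither).
wqi = [21, 3, 5, 8, 21, 48, 37, 39, 26, 16, 35, 7]
n = len(wqi)

K_plus = sum(wqi[j] > wqi[i] for i in range(n) for j in range(i + 1, n))
K_minus = sum(wqi[j] < wqi[i] for i in range(n) for j in range(i + 1, n))
K = K_plus - K_minus
tau = K / (n * (n - 1) / 2)

# Large-sample normal approximation for the significance of tau.
z = tau / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
print(K_plus, K_minus, K)          # 39 26 13
print(round(tau, 3), round(z, 2))  # 0.197 0.89
```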
                                  47

-------
                           TABLE 5
UPPER TAIL PROBABILITIES FOR THE NULL DISTRIBUTION OF KENDALL'S
                    K STATISTIC (Subtable)

[Table 5:  upper tail probabilities P(K ≥ x) for sample sizes n = 4, 5, 8,
9, 12, 13, 16, 17, and 20.  The column for n = 12, used in the example,
reads in part:

          x     P(K ≥ x)
          12      .230
          14      .190
          28      .031
          30      .022                                            ]

 Source:  Hollander and Wolfe (1973) p. 384-393

                               48

-------
     For large sample sizes, a normal distribution approximation to the
distributions of K or tau may be considered.  Under the null hypothesis, the
expected value of tau is 0 and the variance of tau is 2(2n+5)/[9n(n-1)].
Thus, the ratio:

               Z = tau / √[2(2n+5) / (9n(n-1))]

is approximately standard normally distributed and Table 1 (page 19) is used
for testing.

     Our example, based on a tau of 0.197, would yield a value

          Z = 0.197 / √[2(2 · 12 + 5) / (9 · 12 · 11)] = 0.89
     In Table 1 we  read  a one-tailed probability of 0.1867 at Z = 0.89 from
which a two-tailed probability of twice 0.1867, or approximately 0.37,  follows.
Again, the computed value of Z of 0.89 is not significant at the 5% level  and
the same conclusion is reached.  Note that from Table 5 we obtained the exact
probability of 0.42; the large sample size approximation yielded a probability
of 0.37.  The discrepancy between the two probabilities arises because the
large sample size approximation was used for a sample size of only 12.

     In case of ties,  the value of tau  as  defined above is not affected.
However, the variance  of tau when using the large sample size approximation
needs a  correction for  ties.   The  correction is lengthy in  form  and  has
little effect when  only  a few ties are  present  (see Hollander and Wolfe,
1973, p. 187).

     This test can  be  performed using the SAS  procedure, PROC CORR, with the
KENDALL option (SAS User's  Guide:   Basics,  p. 501).  An example output with
the corresponding SAS statements is presented in the appendix.

                                  49

-------
     Once a trend  is  found to be significant, the next step is to estimate
its magnitude.  A  common  measure of the magnitude of the  trend is the slope
of a  straight line fitted to the  data.   To find  the  distribution-free
estimate of the slope, calculate the slopes for all possible pairs of
observations.  That is, calculate

          b(i,j) = (Yj - Yi)/(Xj - Xi)

for i = 1, 2, ..., n-1 and j = i+1, i+2, ..., n, where Y is the measured value
and X the time.  There are N = n(n-1)/2 such dis-
tinct pairs.  For the example of WQI values used here, N = 12(11)/2 = 66.

     Then the N  slopes  are arranged in ascending order, and the middle value
(the median)  is  the  best estimate of the  slope.   In this case, the ordered
series of slopes becomes -28, -18, -13, ..., 19, 20, 27.  The median is the
average of  the  33rd  and 34th values in the series, or 1.49.   Thus, we would
conclude that the WQI is, on the average,  increasing by 1.49 units per month.
However, it must be  remembered that we found no significant trend, so that
the value  1.49  is not  significantly different  from  zero.   Indeed, more
advanced procedures  also allow one to calculate confidence bounds  from these
ordered values;  in this case the approximate 95% confidence interval for the
slope  is  (-4.33, +3.85), a wide interval  that  includes both negative and
positive possibilities.  The  interested  reader  may find further details in
Hollander and Wolfe (1973).  SAS does not include routines  for estimating the
magnitude and confidence bounds of a trend using Kendall's  tau.
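Although SAS offers no routine for this estimate, the pairwise-slope computation itself is short; a sketch in Python, using the same 12 WQI values:

```python
import statistics

# Distribution-free slope estimate: the median of all pairwise slopes
# (WQI_j - WQI_i)/(j - i) over the N = n(n-1)/2 = 66 pairs of months.
wqi = [21, 3, 5, 8, 21, 48, 37, 39, 26, 16, 35, 7]
n = len(wqi)
slopes = sorted((wqi[j] - wqi[i]) / (j - i)
                for i in range(n) for j in range(i + 1, n))
median_slope = statistics.median(slopes)
print(len(slopes), round(median_slope, 2))   # 66 slopes; median 1.49
```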

     C.  The Wilcoxon Rank Sum Test (Step Trend)

     This is  a distribution-free test most  appropriate for testing for a
so-called step  trend.   A step trend might be evident when some major event
such as placing  a new treatment facility into operation occurred during the
data collection period.
                                  50

-------
Procedure


     The general  procedure  involves comparing the rankings of  one set of
values (e.g., readings obtained  before  a major event) with the rankings of
another set (e.g., readings  obtained after a major event).   There  need not be
an equal number of values in each set.  Because of the way the tables for W,
the Wilcoxon test statistic, are arranged, the set with the fewer number of
values (sample size n) is always compared to the larger set (sample size
m), and not vice versa.  Thus n is always the smaller of the two sample
sizes.


     Consider again the example used to demonstrate Student's t test, namely
the series of 18 concentration measurements for total chromium, 8 before
(denoted by Xi) and 10 after (denoted by Yj) implementation of a pollution
control measure.
              "Before"
          Concentrations
              (uq/l)        Ranks

               99            15.5
               11            17
               74            11
               23            18
               71            10
               75            12
               59             8.5
               85            14
   "After"
Concentrations
    (uq/£)        Ranks

     59             8.5
     99            15.5
     82            13
     51             7
     48             5
     39             1
     42             2.5
     42             2.5
     47             4
     50             6
     Let us first  consider  the following test situation:   because we do not

expect (or are not interested in) a worsening of the water quality due to the
new, improved  facility, we  will  test the null hypothesis that the "before"
and "after" concentrations are equally high or low (i.e.,  there is no change)
against the one-sided alternative that the "before" concentrations are higher
than the "after"  concentrations  (i.e.,  there is an improvement in the water
quality).
                                  51

-------
     The first step, as with many other distribution-free procedures, is to
determine the ranks of the concentration values.  The observations are
ordered as a single set of data (i.e., disregard "before" and "after"); if
ties are present, use average ranks.  Wilcoxon's test statistic is simply the
sum of the ranks of the values in the group of smaller size (here the before
group with n = 8):

                      W = (15.5 + 17 + . . . + 14) = 106.

For a one-sided test at the α = 5% level of significance, we reject the null
hypothesis if W is greater than or equal to the critical value, Wc, associ-
ated with m, n, and α, and we accept the null hypothesis otherwise.  Table 6
gives the one-sided levels of significance for n = 8 and several values of m.
Note that we might not always find the exact α that we have chosen, because
the significance levels are discrete and so are only tabulated for integer
values of m and n.  In our example, for α = 0.051 (approximately 5%) we read
that x = 95 (= Wc) for n = 8 and m = 10 (see arrows).  Since W = 106 is
greater than 95, we reject the null hypothesis of no change and conclude that
the concentrations have significantly decreased after implementation of the
improvement measure.

     Also, Table 6  shows for  n =  8 and m = 10 the probability  of obtaining a
value of W  greater  than  or equal to 106  to  be 0.003.  This means that the
probability of obtaining a W of 106 or more  under the null hypothesis  is
0.003, which  is  small enough to reject  the hypothesis  in  favor  of the
alternative.
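The rank-sum computation can be sketched in a few lines (Python is used here for illustration only).  One assumption: the two "before" concentrations printed as 11 and 23 in the table are taken as 111 and 123, since their printed ranks (17 and 18) place them above the tied values of 99.

```python
# Sketch of the rank-sum computation for the chromium example.
# Assumption: the two "before" values printed as 11 and 23 are taken as
# 111 and 123, as implied by their printed ranks of 17 and 18.
from scipy.stats import rankdata, mannwhitneyu

before = [99, 111, 74, 123, 71, 75, 59, 85]        # n = 8
after = [59, 99, 82, 51, 48, 39, 42, 42, 47, 50]   # m = 10

ranks = rankdata(before + after)    # average ranks for ties
W = ranks[:len(before)].sum()       # sum of ranks of the smaller group
print(W)  # 106.0

# one-sided test: "before" concentrations higher than "after"
stat, p = mannwhitneyu(before, after, alternative="greater")
print(p < 0.05)  # True
```

The equivalent Mann-Whitney form gives a small one-sided p-value, in agreement with the tabled 0.003.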

     Next let us consider another testing situation.   Suppose a new indus-
trial discharge  began, and the same "before" and "after" data as above were
obtained.  Here, we will test the null hypothesis  of no  change against  the
alternative that the  water quality has degraded, i.e., the "before"  concen-
trations are  lower on the average  than  the  "after" concentrations.    (For
these data, we already know this is not true.   However, we will go through
the analysis process anyway to illustrate the procedure.)
                                  52

-------
                                TABLE 6

                     UPPER TAIL PROBABILITIES FOR
                    WILCOXON'S RANK SUM W STATISTIC
                        (Subtable for n = 8)

                  x      m = 8    m = 9    m = 10

                 68      .520
                 69      .480
                 70      .439
                 71      .399
                 72      .360     .519
                 73      .323     .481
                 74      .287     .444
                 75      .253     .407
                 76      .221     .371     .517
                 77      .191     .336     .483
                 78      .164     .303     .448
                 79      .139     .271     .414
                 80      .117     .240     .381
                 81      .097     .212     .348
                 82      .080     .185     .317
                 83      .065     .161     .286
                 84      .052     .138     .257
                 85      .041     .118     .230
                 86      .032     .100     .204
                 87      .025     .084     .180
                 88      .019     .069     .158
                 89      .014     .057     .137
                 90      .010     .046     .118
                 91      .007     .037     .102
                 92      .005     .030     .086
                 93      .003     .023     .073
                 94      .002     .018     .061
             --> 95      .001     .014    (.051)
                 96      .001     .010     .042
                 97      .001     .008     .034
                 98      .000     .006     .027
                 99      .000     .004     .022
                100      .000     .003     .017
                101               .002     .013
                102               .001     .010
                103               .001     .008
                104               .000     .006
                105               .000     .004
                106               .000     .003
                107               .000     .002
                108               .000     .002
                109                        .001
                110                        .001
                111                        .000
                112                        .000
                113                        .000
                114                        .000
                115                        .000
                116                        .000

Source:  Hollander and Wolfe (1973), pp. 272-282
                        53

-------
     The computation of W is unchanged (i.e., W = 106).  However, we will
reject the null hypothesis at the α level of significance whenever W is
less than or equal to the critical value n(m+n+1) - Wc, where Wc is de-
fined as above, and accept the null hypothesis otherwise.  Thus, with an α of
0.051, m = 10, and n = 8, n(m+n+1) - Wc = (8)(19) - 95 = 57, and we cannot
reject the hypothesis in favor of a degradation since 106 is not less than
the critical value of 57.

     For a two-sided test of the null hypothesis of "no change" against the
alternative of "a change" at the 5% level of significance, we compare the
above W of 106 with the following two critical values:  n(m+n+1) - W(α1, m, n)
and W(α2, m, n), where α1 + α2 = α.  Most often, we cannot perform the test
at the exact α level, so we have to choose α1 and α2 from the table as close
to α/2 as possible.  In our example, W(0.027, 10, 8) = 98 and W(0.022,
10, 8) = 99.  Thus for α = 0.022 + 0.027 = 0.049, the two critical values
would be either (8)(19) - 98 = 54 and 99, or (8)(19) - 99 = 53 and 98.
Since the computed W of 106 lies outside the interval 54-99 or 53-98, we
reject the null hypothesis in favor of the alternative of a significant
change.

     The upper tail probabilities associated with smaller x-values (see
Table 6) are not tabulated and are generally not of interest because the cor-
responding α-values are greater than 0.5.  However, they can be calculated.
If the probability that W is greater than or equal to x, P(W ≥ x), is not
tabulated because it is greater than 0.500, then calculate:

               P(W ≥ x) = 1 - P[W ≥ (n(m+n+1) - x + 1)]

Example:  n = 8, m = 10, x = 75:

          P(W ≥ 75) = 1 - P[W ≥ (8(19) - 75 + 1)]
                    = 1 - P(W ≥ 78)
                    = 1 - 0.448
                    = 0.552
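The identity can be checked by enumerating the exact null distribution of W (no ties), which is a simple counting exercise.  The sketch below (Python, for illustration only) reproduces the tabled value P(W ≥ 78) for n = 8, m = 10.

```python
# Sketch: exact null distribution of W (no ties) by counting subsets of
# ranks, used to check the lower-tail identity for n = 8, m = 10.
from math import comb

def upper_tail(n, m, x):
    """P(W >= x) when W is the sum of n ranks drawn from 1..n+m, no ties."""
    N = n + m
    # dp[k] maps a rank sum s to the number of size-k subsets with that sum
    dp = [{0: 1}] + [{} for _ in range(n)]
    for r in range(1, N + 1):                 # consider each rank once
        for k in range(min(r, n), 0, -1):     # descending k: 0/1 counting
            for s, c in dp[k - 1].items():
                dp[k][s + r] = dp[k].get(s + r, 0) + c
    hits = sum(c for s, c in dp[n].items() if s >= x)
    return hits / comb(N, n)

p78 = upper_tail(8, 10, 78)   # tabled as .448
p75 = upper_tail(8, 10, 75)   # not tabled; equals 1 - p78 = .552
print(round(p78, 3), round(p75, 3))
```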

-------
     Note:  Wilcoxon's W statistic is tabulated in Hollander and Wolfe
(1973), pp. 272-282 for values of m up to 20.  Table 6 is just a sample of
these tabulations.  For larger values of m, a large sample approximation is
generally used.  Under the null hypothesis of no difference between the two
sets of data, the expected value of W is

            E(W) = n(m+n+1)/2,  and the variance is

          Var(W) = mn(m+n+1)/12.

Then the distribution of Z = [W - E(W)]/√Var(W) tends toward a standard
normal distribution, and Table 1, page 19, can be used for testing for
significance as described in Section III.
     Let us examine the situation of ties more closely.  First, we use aver-
age ranks to compute W.  If a tie occurs among two or more observations in
the same group (e.g., within the "before" or the "after" concentrations), it
does not affect the value of W, whether average ranks are used for the ties
or ranks are assigned arbitrarily.  For example, if the two concentrations
of 42 in the "after" group had arbitrarily been ranked as 2 and 3, W would be
the same.  But if ties occur among observations in different groups, W would
change depending on how the ties were "broken."  Averaging, as we did in the
examples, is the usual tie-breaking procedure.

     The variance  of W  is  affected  by ties, regardless of whether  they are
within or between  groups.   The  variance, corrected for  ties,  is computed as
follows:
          Varc(W) = (mn/12)[(m+n+1) - Σj tj(tj² - 1)/((m+n)(m+n-1))]
                                  55

-------
where g is the number of sets of ties, tj is the size of tied set j, and the
sum runs from j = 1 to g.  Then compute Z as above with Varc(W) replacing
Var(W).

     Although the  large  sample approximation is not applicable in our exam-
ple, because m  is  only 10, we  use it to demonstrate the computation of the
correction for ties.   We observe:

     2 values of 42           (ties within "after" group)
     2 values of 59           (ties between groups)
     2 values of 99           (ties between groups)

Thus g = 3 and t1 = 2, t2 = 2, and t3 = 2.  Compute:

          Σ tj(tj² - 1) = 2(2² - 1) + 2(2² - 1) + 2(2² - 1) = 18

and

          Varc(W) = [(10)(8)/12][(8+10+1) - 18/((8+10)(8+10-1))] = 126.27
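The tie-corrected variance and the resulting large-sample Z can be reproduced with a short sketch (Python, illustrative only):

```python
# Sketch reproducing the tie-corrected variance for m = 10, n = 8 with
# three tied sets of size 2, and the resulting large-sample Z.
from math import sqrt

m, n = 10, 8
W = 106.0
ties = [2, 2, 2]                      # sizes of the g = 3 tied sets

EW = n * (m + n + 1) / 2              # expected value of W under H0
tie_term = sum(t * (t**2 - 1) for t in ties)
var_c = m * n / 12 * ((m + n + 1) - tie_term / ((m + n) * (m + n - 1)))
Z = (W - EW) / sqrt(var_c)

print(round(var_c, 2), EW)  # 126.27 76.0
```

A Z of about 2.67 would be judged against the standard normal deviates of Table 1, just as described in Section III.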
     The Wilcoxon rank sum test is sometimes referred to as the Wilcoxon
two-sample test.  An analogous test is the Mann-Whitney test, which is based
on an equivalent test statistic, U.  In the case of no ties between con-
centration values, U = W - n(n + 1)/2, and therefore tests based on W and U
are equivalent.  One-tailed probabilities for U can be found in Siegel
(1956), pp. 271-277.  If there are ties, a correction is necessary in the
computation of the Mann-Whitney U statistic.  For details see Hollander and
Wolfe (1973) and Siegel (1956).

     The Wilcoxon  rank sum test can be performed  using SAS.  The procedure
PROC NPAR1WAY with the Wilcoxon option may be used (see SAS User's Guide:
Statistics, p. 205); an example output with the appropriate SAS statements is
presented in the appendix.
                                  56

-------
     The above procedure demonstrated how to use the Wilcoxon rank sum test
to test for the presence of a step trend.  It is also of interest to estimate
the magnitude of such a change.  A method of estimation based on the
Wilcoxon test is presented next.

     The first step in the estimation procedure is to calculate the dif-
ferences formed by subtracting each observation in the after group from each
observation in the before group.  Denote these differences by Dij, where

          Dij = Xi - Yj, for i = 1, . . ., n and j = 1, . . ., m.

In the example, m = 10 and n = 8, so that a total of 80 differences must be
calculated.  Next, order the nm differences from least to greatest and take
the median (the middle value of the ordered differences) as the point esti-
mate of the difference in the two groups.  This is the estimate of the
change.

     Since the "after" observations were subtracted from the "before"
observations, if the point estimate is positive, the interpretation would be
that the introduction of the pollution control measure resulted in a decrease
in the concentration.  Applying the calculations to the example data results
in a point estimate of 29 µg/L as the decrease in concentration that occurred
when the new pollution control measure was introduced.  This is significantly
different from zero, as concluded by the test.
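This estimate can be sketched directly (Python, illustrative only).  As before, one assumption: the two "before" values printed as 11 and 23 are taken as 111 and 123, based on their printed ranks of 17 and 18.

```python
# Sketch of the step-change estimate: the median of all n*m before-after
# differences.  Assumption: the "before" values printed as 11 and 23 are
# taken as 111 and 123, as implied by their printed ranks.
from statistics import median

before = [99, 111, 74, 123, 71, 75, 59, 85]        # n = 8
after = [59, 99, 82, 51, 48, 39, 42, 42, 47, 50]   # m = 10

diffs = [x - y for x in before for y in after]     # all 80 differences
change = median(diffs)
print(len(diffs), change)  # 80 29.0
```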

     D.  Seasonal Kendall's Test for Trend

     This is a test procedure proposed by Hirsch et al. (1982) which uses a
modified form of Kendall's tau.  In brief, if there are several years of
monthly data, Kendall's K (number of concordant minus discordant pairs), pre-
sented earlier, is computed for each of the 12 months, and the 12 statistics
are then combined to provide a single overall test for trend.  This method is
discussed by van Belle and Hughes (1982), where it is also compared to another
distribution-free test for trend proposed by Farrell (1980) using an aligned
rank order test of Sen (1968).
                                  57

-------
          1.   Rationale

     Figure 7 below depicts a  series of  8 years  of  monthly  data recorded for
a given water quality parameter.
             Figure 7.  Monthly Concentrations of Total Phosphorus
                          (ref. Hirsch et al., 1982)
     The plot of the data clearly exhibits a seasonal movement—peaks and
troughs recur at almost yearly intervals.  This feature of having a period of
a year (other periodicities may exist) is a common pattern for water quality
parameters in general, as discussed earlier.  Comparing values between months
within a year will thus not help in detecting a possible long-term trend over
the time period considered.  It is more appropriate to compare data from the
same month across different years, thereby avoiding the problem of
seasonality, and then to combine the individual results into an overall test
statistic from which we can draw conclusions about a trend.
                                   58

-------
          2.  Procedure

     The following demonstration  is  based on monthly measurements;  the same
procedure can be  applied  to any sampling frequency  (e.g.,  spring,  summer,
fall, winter measurements  or  average measurements),  provided that the sam-
pling scheme is  identical  from year to year.  A general case with 12 months
and n years  of  data is presented first; a simplified numerical  example will
then follow.

     Arrange the monthly water quality measurements as follows:

                                  Month (j)

     Year (i)        1      2      3    . . .    12

        1           X11    X12    X13   . . .   X1,12
        2           X21    X22    X23   . . .   X2,12
        .            .      .      .             .
        n           Xn1    Xn2    Xn3   . . .   Xn,12

     Number of
     observations    n1     n2     n3   . . .    n12

where X11 is the observation for the first month of the first year, X21 is
the observation for the first month of the second year, X2,12 is the
observation for the 12th month of the 2nd year, and, generally, Xij is the
observation for the jth month of the ith year.

     Note that the number of observations need not be the same from month to
month, i.e., there may be 5 January measurements (n1 = 5), 6 February
measurements (n2 = 6), etc.
     Next, make  a  second table of numbers of concordant (K+) and discordant
(K-) pairs treating each month separately, using the procedure described with
Kendall's tau.  When finished, we have the following array:
                                  59

-------
                                  Month

                          1      2      3    . . .    12

     number of:
       K+  (concordant pairs)
       K-  (discordant pairs)

       K = (K+) - (K-)   K1     K2     K3   . . .    K12
     Then sum the 12 monthly statistics to obtain K = K1 + K2 + . . . + K12.
If the sample measurements are truly random (no trend), this statistic has a
mean of 0 and a variance Var(K) = Var(K1) + . . . + Var(K12).  The variance of
each monthly statistic Kj is computed as:

          Var(Kj) = [nj(nj-1)(2nj+5) - Σ ti(ti-1)(2ti+5)]/18

where ti is the size of the ith set of ties in the jth month.

     Then compute the standard normal deviate Z, with a continuity
correction of one unit, as:

               (K - 1)/√Var(K)   if K > 0
          Z =   0                if K = 0
               (K + 1)/√Var(K)   if K < 0

and use Table 1, page 19, of the standard normal deviates to determine the
significance of Z.

     Hirsch et al. (1982) have shown that the normal approximation works
quite well with as few as 3 years of complete data.  For fewer years of
record, the exact distribution of K1, . . ., K12, and therefore of K, has been
derived by Kendall (1975).
                                  60

-------
          3.  Numerical  Example

     The  following  is an oversimplified  example used for demonstration pur-
poses  only.   For simplicity,  we will  assume  that there were  only four
observations per  year, taken on a  quarterly basis.  Note that  no data  were
available in the  first quarter of  the  fourth year.   The example shows how
such missing data may be handled.   Consider the array of percent violations
of a water quality  standard  for a given parameter:

                                  Quarter (j)

     Year (i)         1       2       3       4

        1           12.3    11.5    11.6    15.3
        2           10.5    10.9     9.5    14.8
        3           12.6    10.9     9.5    13.9
        4             -      9.8     9.5    14.0

     Number of
     observations     3       4       4       4

These data are plotted in Figure 8 below.
              Figure 8.  Plot  of  Percent Violations Versus Time
                                   61

-------
     From the data we calculate the number of concordant and discordant
pairs within each quarter, again ignoring ties in the computation of each K+
and K-.  We obtain

                                  Quarter (j)

                           1       2       3       4

          K+               2       0       0       1
          K-               1       5       3       5
          K = (K+)-(K-)    1      -5      -3      -4

Thus, K1 =  1 and n1 = 3; no ties
      K2 = -5 and n2 = 4; 1 set of 2 ties (10.9, 10.9)
      K3 = -3 and n3 = 4; 1 set of 3 ties (9.5, 9.5, 9.5)
      K4 = -4 and n4 = 4; no ties,

and   K = K1 + K2 + K3 + K4 = -11.  The four variances, corrected for ties,
are:

     Var(K1) = [3(2)(11)-0]/18 = 66/18;
     Var(K2) = [4(3)(13)-2(1)(9)]/18 = 138/18;
     Var(K3) = [4(3)(13)-3(2)(11)]/18 = 90/18; and
     Var(K4) = [4(3)(13)-0]/18 = 156/18 .

From here, Var(K) = (66+138+90+156)/18 = 450/18 = 25.  Since K is negative,
we compute Z as

          Z = (K + 1)/√Var(K) = (-11 + 1)/√25 = -2.0

Since Z is less than -1.96, the lower critical Z corresponding to a two-sided
5% significance level (Table 1), we reject the null hypothesis of no trend in
favor of the alternative that a trend is present.
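The whole computation can be sketched compactly (Python, illustrative only; the missing first-quarter value of year 4 is simply omitted from its season):

```python
# Sketch of the seasonal Kendall computation for the quarterly example.
from collections import Counter
from math import sqrt

quarters = [
    [12.3, 10.5, 12.6],         # quarter 1 (year 4 missing)
    [11.5, 10.9, 10.9, 9.8],    # quarter 2
    [11.6, 9.5, 9.5, 9.5],      # quarter 3
    [15.3, 14.8, 13.9, 14.0],   # quarter 4
]

def kendall_k(x):
    """K = (concordant pairs) - (discordant pairs); ties contribute zero."""
    return sum((x[j] > x[i]) - (x[j] < x[i])
               for i in range(len(x)) for j in range(i + 1, len(x)))

def var_numerator(x):
    """18 * Var(Kj), corrected for ties within the season."""
    nj = len(x)
    ties = sum(t * (t - 1) * (2 * t + 5)
               for t in Counter(x).values() if t > 1)
    return nj * (nj - 1) * (2 * nj + 5) - ties

K = sum(kendall_k(q) for q in quarters)
V = sum(var_numerator(q) for q in quarters) / 18
Z = (K + 1) / sqrt(V) if K < 0 else ((K - 1) / sqrt(V) if K > 0 else 0.0)

print(K, V, Z)  # -11 25.0 -2.0
```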
                                  62

-------
     Once we have identified a significant trend in a series of water quality
measurements, we  might be interested in  determining  the  magnitude of the
trend.  For  a set of stations at which trends have been detected,  one could
then compare the  different  trend slopes for a given water quality indicator
and identify those  stations  where the trend slope  is  larger than average.

     One way of computing the magnitude of a trend would be to compute the
slope b of the regression line of the water quality measurement versus time,
as we did earlier in Section IV.  This technique, however, is recommended
only with caution, since the underlying assumptions for regression analysis
are often violated when dealing with water quality measurements.  A
distribution-free method for computing the magnitude of a trend has been sug-
gested by Hirsch et al. (1982).  This method estimates the magnitude of trend
by means of the seasonal Kendall slope estimator, B, computed as follows.

     Considering again the more general data arrangement above, compute dijk
quantities for each month (season) as follows:

          dijk = (Xjk - Xik)/(j - i),  for all i < j,

where k = 1, 2, . . ., 12 and 1 ≤ i < j ≤ n.  For monthly data, there will be a
total of 12(n)(n-1)/2 such differences.  In general, with n years and m
measurements per year, the number of differences will be mn(n-1)/2.  The
slope estimator, B, is the median of these dijk values (i.e., half the dijk's
exceed B and half fall short of it; if the number of differences is even,
then take the average of the two middle ones).  The estimator, B, is related
to the seasonal Kendall test statistic S, which is simply the number of
positive dijk values minus the number of negative dijk values:  if S is
positive, then B is positive or zero; if S is negative, then B is negative
or zero.

     As a computation example for the second quarter (k = 2) and 4 years of
data (n = 4), from the 4 measurements X12, X22, X32, X42 compute the dif-
ferences:

          d122 = (X22 - X12)/1 = (10.9 - 11.5)/1 = -0.60
          d132 = (X32 - X12)/2 = (10.9 - 11.5)/2 = -0.30
          d142 = (X42 - X12)/3 = ( 9.8 - 11.5)/3 = -0.57
          d232 = (X32 - X22)/1 = (10.9 - 10.9)/1 =  0.00
          d242 = (X42 - X22)/2 = ( 9.8 - 10.9)/2 = -0.55
          d342 = (X42 - X32)/1 = ( 9.8 - 10.9)/1 = -1.10

Note that for the first quarter, there will  be only 3 (=(3)(2)/2) differences
since one year  had missing data, while  for each of the remaining three quar-
ters, there will be (4)(3)/2 = 6 differences.

     Continuing the above calculations for all 21 pairs for the example data,
one finds that the median of the differences is -0.5.  Thus, we would
estimate the trend as a decrease of 0.5 percent violations per year (the
denominators j - i are differences in years).  Recall that the trend was
found to be significantly different from zero at the two-sided 5% level using
the seasonal Kendall's test.
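The slope estimator B for this example can be sketched as follows (Python, illustrative only):

```python
# Sketch of the seasonal Kendall slope estimator B:  the median of all
# within-season pairwise slopes d = (Xjk - Xik)/(j - i).
from statistics import median

# (year, value) pairs per quarter; quarter 1 of year 4 is missing
quarters = [
    [(1, 12.3), (2, 10.5), (3, 12.6)],
    [(1, 11.5), (2, 10.9), (3, 10.9), (4, 9.8)],
    [(1, 11.6), (2, 9.5), (3, 9.5), (4, 9.5)],
    [(1, 15.3), (2, 14.8), (3, 13.9), (4, 14.0)],
]

d = [(xj - xi) / (j - i)
     for q in quarters
     for a, (i, xi) in enumerate(q)
     for j, xj in q[a + 1:]]

B = median(d)
print(len(d), B)  # 21 -0.5
```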

     The value, B, as a measure of trend magnitude, is quite resistant to the
effect of extreme values in the data, unlike the slope of the regression line
as computed in Section IV.  It is also unaffected by seasonality, because the
slope is always computed between values from the same season, i.e., values
that are whole multiples of 12 months apart.

     A  discussion of  the  seasonal   Kendall's  test for  trend  and  other
statistical  procedures applied to total phosphorus  measurements  at NASQAN
stations has  been published  by Smith et al.  (1982).  This  document also
contains a  FORTRAN  subroutine to perform the  seasonal  Kendall procedures.

     E.   Aligned Rank Sum Test for Seasonal Data (Step Trend)

     This test is a method for testing for a step trend when the data exhibit
seasonality.  This seasonal effect is usually clearly visible after the data
have been plotted.  In some cases, the analyst knows or suspects that a
specific water quality parameter may be affected by seasonality.

                                  64

-------
     When data  are  seasonal,  the Wilcoxon rank sum  test  must be modified
before testing for a step trend.   Again, assume that a discrete event such as
the opening of  a  new factory or the installation  of a new pollution control
system has been identified.  The question is whether this event has produced
a significant change  in  some measurement of a water quality parameter.   The
following  is  an  outline  of  the  computation  of  the  appropriate
distribution-free test statistic.

     Assume measurements  of a water quality indicator have been taken monthly
at a  fixed station  over  several years.  More generally, the data can be col-
lected for m  seasons  per year.   The m  times n measurements, where m is the
number of months (seasons) and n is the number of  years of collection,  can be
arranged as follows:
                                  Month

     Year         1      2      3    . . .    m       mean

       1         X11    X12    X13   . . .   X1m      X̄1.
       2         X21    X22    X23*  . . .   X2m      X̄2.
       .          .      .      .             .        .
       n         Xn1    Xn2    Xn3   . . .   Xnm      X̄n.

     mean        X̄.1    X̄.2    X̄.3   . . .   X̄.m      X̄..

The * indicates the time at which a major event has happened.

     For example, X21 is the measurement in the first month (season) of the
second year.  The symbol X̄2. denotes the average monthly measurement in the
second year, while X̄.1 would be the average of the January measurements over
the n years.

Procedure

     1.  Within each month (column) subtract the monthly average from each
measurement in the n years.  This will result in an array of deseasonalized
data.  For example, in January, calculate the n differences (X11 - X̄.1),
(X21 - X̄.1), . . ., (Xn1 - X̄.1).  These monthly differences will then have an
average value of zero.
                                  65

-------
     2.  Rank all the nm differences from 1 to nm, regardless of month and
year; this will produce the matrix of aligned ranks:

                                  Month

     Year         1      2      3    . . .    m       mean

       1         R11    R12    R13   . . .   R1m      R̄1.
       2         R21    R22    R23   . . .   R2m      R̄2.
       .          .      .      .             .        .
       n         Rn1    Rn2    Rn3   . . .   Rnm      R̄n.

     mean        R̄.1    R̄.2    R̄.3   . . .   R̄.m      R̄..
     3.  Now sum the ranks of all the observations taken before the event in
question.  Let this sum of ranks be W.  To construct the test we will use the
fact that for large sample sizes the distribution of W will be approximately
normal.  Let bi be the number of observations from month i that occurred
before the change event, and let ai be the number after the change event.  In
the example, b1 = 2, b2 = 2, b3 = 1, . . ., bm = 1, considering that the event
occurred in the third month of the second year.
     4.  Next calculate

               m
          E =  Σ  bi R̄.i
              i=1

E is the sum of the products of the number of pre-event observations in each
month and the mean rank of that month.  E is the expected value of W.

     5.  Now calculate

               m    ai bi       ni
          V =  Σ  ---------  ·  Σ  (Rji - R̄.i)²
              i=1  ni(ni-1)    j=1

where ni = ai + bi is the number of observations for month i.  If all months
have the same number of observations this is n.  V is the variance of W.
                                  66

-------
     6.  Then the test is based on:

          Z = (W - E)/√V ,

which is tested using Table 1 (page 19).
     Consider again the example used in Kendall's seasonal  test.   Assume that
the measurements are  taken  quarterly rather than monthly (for simplicity of
illustration) and that  the  event in question occurred with the beginning of
the third  quarter of  the second year  (denoted by  *).   Note that no obser-
vation was available  for the first quarter of the fourth year.  This example
illustrates that the  procedure can  be used when there are missing data and
shows how to apply the procedure in  this case.
                                  Quarter

     Year         1       2       3       4

       1        12.3    11.5    11.6    15.3
       2        10.5    10.9     9.5*   14.8
       3        12.6    10.9     9.5    13.9
       4          -      9.8     9.5    14.0

     Mean       11.8    10.8    10.0    14.5
Next, "deseasonalize" the data by subtracting from each observation within a
quarter the mean value for this quarter.  We obtain:

                                  Quarter

     Year         1       2       3       4

       1         0.50    0.72    1.57    0.80
       2        -1.30    0.12   -0.53    0.30
       3         0.80    0.12   -0.53   -0.60
       4          -     -0.98   -0.53   -0.50

     Mean        0      -0.02   -0.02    0
     Note:  The means are not all exactly zero due to rounding errors.
     The 15 new observations are then ranked and quarterly mean ranks
are computed.  Again, average ranks are used to break ties.  The table of
aligned ranks is as follows:

                                  Quarter

     Year         1       2       3       4

       1        11      12      15      13.5
       2         1       8.5     5      10
       3        13.5     8.5     5       3
       4          -       2       5       7

     Mean        8.5     7.75    7.5     8.38
     W is the sum of ranks over all four quarters of year one and the first
two quarters of year two (W = 11 + 12 + 15 + 13.5 + 1 + 8.5 = 61).  The
expected value of W, assuming no change, is the corresponding sum of the
quarterly mean ranks over the same period (E = 8.5 + 7.75 + 7.5 + 8.38 +
8.5 + 7.75 = 48.38).  The variance consists of four terms, one for each
quarter.  Each term is the weighted variance of the ranks within that
quarter.  For example, for the first quarter b1 = 2, a1 = 1, n1 = 3, and the
sum of squared deviations of the ranks is

          (11-8.5)² + (1-8.5)² + (13.5-8.5)² = 87.5 .

So the first term in the variance, corresponding to i = 1, is:

          [(1)(2)/((3)(2))] 87.5 = 29.17 .

Proceeding in the same way for the next three quarters and summing gives
V = 80.26.  Then

          Z = (W - E)/√V = (61 - 48.38)/√80.26 = 1.41 .

Comparing this value of Z to the critical value of 1.96 at the two-sided 5%
level (Table 1) shows that the change is not significantly different from
zero because 1.41 falls between -1.96 and +1.96.
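The full aligned rank computation can be sketched as follows (Python, illustrative only; residuals are rounded to two decimals, as in the text, so the ties come out the same):

```python
# Sketch of the aligned rank sum computation for the quarterly example.
# None marks the missing observation; the event occurs at the start of
# quarter 3 of year 2, so the pre-event cells are listed in `before`.
from math import sqrt
from scipy.stats import rankdata

data = [                      # rows = years 1-4, columns = quarters 1-4
    [12.3, 11.5, 11.6, 15.3],
    [10.5, 10.9,  9.5, 14.8],
    [12.6, 10.9,  9.5, 13.9],
    [None,  9.8,  9.5, 14.0],
]
before = {(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1)}  # 0-based (year, quarter)

# 1. deseasonalize: subtract each quarter's mean (rounded to 2 decimals)
means = [sum(row[q] for row in data if row[q] is not None) /
         sum(1 for row in data if row[q] is not None) for q in range(4)]
cells = [(y, q, round(data[y][q] - means[q], 2))
         for y in range(4) for q in range(4) if data[y][q] is not None]

# 2. rank all 15 residuals together (average ranks for ties)
ranks = rankdata([v for _, _, v in cells])
rank_of = {(y, q): r for (y, q, _), r in zip(cells, ranks)}

# 3.-6. rank sum W of the pre-event cells, its expectation E, variance V, Z
W = sum(r for yq, r in rank_of.items() if yq in before)
E = V = 0.0
for q in range(4):
    rq = [r for (y, qq), r in rank_of.items() if qq == q]
    b = sum(1 for (y, qq) in before if qq == q)
    a, n_i = len(rq) - b, len(rq)
    rbar = sum(rq) / n_i
    E += b * rbar
    V += a * b / (n_i * (n_i - 1)) * sum((r - rbar) ** 2 for r in rq)

Z = (W - E) / sqrt(V)
print(W, round(Z, 2))  # 61.0 1.41
```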

     The positive sign  for Z indicates that the "before" period had higher
ranks (after deseasonalizing) than the "after"  period, thus, the direction of
change was  from  high  to low.  However, in this case the change could be due
to random fluctuations.
                                  68

-------
     More details on the aligned rank sum test can be found in Lehmann
(1975), pp. 132-141.  Unfortunately, this procedure is not available through
SAS.

     F.  Trend and Change

     It is very difficult in nonparametric statistics to  deal  with a data set
containing both a step trend and a long-term trend,  or to distinguish between
the two trends  in  a series  of data values.   In  general  one should only be
interested in a step trend when there  is a definite  external event that would
be  likely to  result  in a change.  That  is, a step trend  is  only present when
an  event  occurred  at a known point in  time  and  influenced the data.  The
testing procedures for change could be  misled by the presence of a  long-term
trend.  Likewise the tests  for trend could indicate an apparent trend in the
presence of  (only)  a step trend.  If there  is a change  event,  determining
whether it  is significant and whether there  is a trend as well, or  only one,
is  a  difficult  task.   The importance  of  plotting  the data must be
re-emphasized.

     In the parametric case, as discussed in Section IV,  one can use multiple
regression  analysis  to test for the presence  of both a step trend and a
long-term  trend in  the  same data  series.   Unfortunately, there  is no
distribution-free procedure that  is as  well developed and as  easy to apply.
One could try both types of tests (for change and for trend).  If one is
significant and the other not, then the answer is reasonably clear.  If
neither is significant, then the data appear to be random.  However, if both
are significant, then both types of nonrandomness may be present, or only
one.  Determining whether both a change and a trend are present, or only one
(and if so, which), requires a series of analyses, and the advice of a
statistician should be sought.

     To test for a long-term trend in the presence of a step trend,  one could
use the rank procedure as  in subsection  D.  The  "before"  and "after" data
would  be considered as two groups and separate means calculated.   Differences
                                  69

-------
between each  observation and  its  group  mean would be calculated and  the
Kendall's test applied.  In effect this would be the seasonal Kendall's test
with only two  "seasons"—before  and  after.   To test for  the presence  of a
step trend when a long-term trend is known to exist would require estimating
the slope and  calculating  the difference between each observation  and the
trend line, then applying the Wilcoxon rank-sum test to the  differences in a
manner similar to the procedure of Subsection E.  A detailed presentation of
these procedures is beyond the scope  of this document.
                                  70

-------
VI.  SPECIAL PROBLEMS
     As mentioned throughout  the  preceding sections, water quality data do
not, in general, exhibit all  the desired properties  necessary for the use of
parametric statistical procedures.  We  have already mentioned the fact that
most water quality measurements show seasonal (or cyclical) effects.   We have
then suggested methods to deseasonalize the data when using parametric proce-
dures  (i.e.,  multiple  regression) or  distribution-free  methods  (i.e.,
seasonal  Kendall's  test,  aligned  rank  sum test).   (A seasonal adjustment
method has also  been  proposed in Schlicht  (1981), although within the more
general setting of time series analysis.)

     Other problems inherent to "real life" data bases are those of miss-
ing data (incomplete records) and extreme or outlying observations.  Another
issue, mentioned throughout Section IV, is the assumption of normality of the
deseasonalized data or of the residuals obtained from regression analysis.
In addition, a problem specific to water quality measurements, especially
when concentrations of pesticides, trace metals, etc. are estimated, is that
of measurements below the detection limit.  One final and important point is
the problem of flow changes in rivers and streams, which will affect the
concentrations of most constituents considered as potential water quality
indicators.

     A.   Missing Data

     The  basic assumption we  make in treating missing data,  from a statis-
tical  point of view,  is that  data  are  missing because of  mistakes, such as
lost records, and not because the  analyst  simply wants to  ignore conflicting
or compromising data.   In other words, any missing datum is assumed to follow
the same pattern as the recorded observations.  A simple method to fill in a
few missing data values is to replace them by the sample mean.  More
involved methods use least squares estimates; these methods are available
through standard statistical program packages such as SAS, BMDP, and SPSS,
and are explained, for example, in Johnson and Leone (1977).
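As an illustration (the function and data below are hypothetical, not from the report), the sample-mean fill-in rule can be sketched in Python:

```python
# Hypothetical helper (not from the report): replace missing values,
# represented here by None, with the sample mean of the recorded data.
def fill_with_mean(series):
    observed = [x for x in series if x is not None]
    mean = sum(observed) / len(observed)
    return [mean if x is None else x for x in series]

filled = fill_with_mean([2.0, None, 4.0, 6.0])
```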

-------
     The  application  of  the  seasonal  Kendall's  test  for trend  is not
restricted to complete  data  sets,  nor is it necessary  to have full  years of
data.  As we mentioned  earlier, the seasonal Kendall's test statistic can in
fact be computed with  incomplete  data.   It is  also suggested (van Belle and
Hughes (1982)) that the season length be adjusted, if necessary, to obtain a
reasonable record within  each  season.   When using parametric procedures, a
few missing data will  reduce the  sample size and  slightly affect the means
and  standard  deviations.   A large number  of missing data,  however, might
affect these statistics considerably.

     B.   Outlying Observations

     Most often, outlying or extreme observations, also called outliers, can
be easily detected either when looking at a plot of the data or even earlier,
when closely examining  the  raw data sheets.  Several  logical  steps can be
taken when outliers are found.  The most basic first step is to double check
suspicious observations for transcription errors.   If the entry was an error,
correct it if possible.  If the proper entry cannot be  retrieved but an error
is certain, then delete the datum.  There are also statistical procedures to
test whether a suspected outlier is in fact an extreme observation.  Such
methods are presented  in  detail  in ASTM Standard E178-75 entitled "Standard
Recommended Practice for  Dealing  With Outlying Observations."  Also, tests
for  outliers based  on  ranks are presented  in  the "Handbook of Tables for
Probability and Statistics" (1966).
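The single-outlier statistic underlying the ASTM E178 procedure can be sketched as follows; this is only an illustration, and the critical value against which T is compared must still be taken from the ASTM tables:

```python
import math

# Extreme studentized deviate ("Grubbs") statistic underlying the ASTM
# E178 single-outlier test: T = max|x_i - xbar| / s.  The critical value
# for the sample size must be looked up in the ASTM tables, which are not
# reproduced here; the data below are made up.
def outlier_statistic(xs):
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    return max(abs(x - xbar) for x in xs) / s

T = outlier_statistic([1.0, 2.0, 3.0, 100.0])
```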

     Extreme observations can be actual and caused by flow changes, tempera-
ture changes, etc.   It is therefore recommended that ancillary data such as
time of day, water  temperature and rate of discharge at the time of sample
collection be collected at  the same time  as the  water quality data.   Many
outliers  can  then be  explained  and/or corrected.   Most rank  tests,  as
described earlier, are  little  affected by the magnitude of the observations
and  therefore by  outliers.   In parametric  tests,  however,  outliers affect
means  and variances,  and  may therefore invalidate the  resulting  tests and
conclusions.

-------
     C.   Test for Normality

     Given a  set of measurements of a variable X, one wishes to know whether
the variable comes from a normal distribution.   The following simple plotting
procedure, if the data set is not too extensive, can be used.  Consider the
following example of n = 12 data points, rearranged in ascending order:

           i        X      (i/(n+1))×100%

           1     -1.45           7.7
           2     -1.35          15.4
           3     -0.78          23.1
           4     -0.62          30.8
           5     -0.01          38.5
           6      0.04          46.2
           7      0.22          53.8
           8      0.49          61.5
           9      0.72          69.2
          10      1.45          76.9
          11      1.79          84.6
          12      2.50          92.3
Should a value of X occur more than once, then the corresponding value of i
(cumulative frequency) increases appropriately.  The maximum value of i is
always n, the total number of data points.  The pairs of (X, (i/(n+1))×100)
values are then plotted on probability paper using an appropriate scale for
X on the horizontal axis.  Figure 9 shows the results.  The vertical axis
for the values of (i/(n+1))×100% is already scaled from 0.01 to 99.99.  If
the data came from a normal distribution, then the plotted points would fall
on a straight line.  In practice, a straight line can be drawn by hand
through the points and a judgment made as to the normality of the data.
Also, rough estimates of the mean, X̄, and standard deviation, S, can be
made from this plot.  The horizontal line drawn through 50 cuts the plotted
line at the value of X̄, and the horizontal line through 84 cuts it at the
value of X̄ + S; these two numbers then yield the value of S by subtraction.
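The plotting positions used above are easily computed; a minimal Python sketch (an illustration, not part of the report) using the 12 ordered values of the example:

```python
# Plotting positions for the probability plot: the i-th smallest of n
# values is plotted at (i/(n+1))*100 percent.  The 12 ordered values are
# those of the example above.
xs = [-1.45, -1.35, -0.78, -0.62, -0.01, 0.04,
      0.22, 0.49, 0.72, 1.45, 1.79, 2.50]
n = len(xs)
positions = [round(100.0 * (i + 1) / (n + 1), 1) for i in range(n)]
```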

     More rigorous statistical tests are available for testing for
normality, such as the Kolmogorov-Smirnov test (see Hollander and Wolfe,
1973) or the χ² goodness of fit test (see Snedecor and Cochran, 1980).  The
drawback of using these tests is that they tend to easily reject the hypothesis

-------


[Figure 9.  Example of plot on probability paper:  the 12 ordered X values
plotted against their (i/(n+1))×100% positions.  A straight line fitted by
eye crosses the 50% level at X̄ = 0.25 and the 84% level at X̄ + S = 1.48,
giving S = 1.23.]

-------
of normality when the sample sizes are fairly large, although a visual exam-
ination of a probability plot tends to support the hypothesis.  On the other
hand, when sample sizes are relatively small, as might be the case when only
two years' worth of data are available, then the more rigorous tests tend not
to detect deviations from normality.  Thus, a probability plot might be more
helpful.  It should be  noted  here that SAS provides a probability plot, as
well as a test  for  normality, through the UNIVARIATE procedure (see example
output in the appendix).

     D.   Detection Limits

     Sometimes in testing for metals, organic compounds,  etc., the analytical
test will not be  sensitive  enough to quantify the amount of  the particular
substance.  The  amount is  below the detection limit of  the  test, so is
reported simply as  "less  than"  that limit.  These data,  also  referred to as
"censored" data, are not  to be confused with missing data,  because they are
not  "missing"—they  are known to  be in a certain range, but the precise
values are not known.

     There are, however,  some ways to compute values that can be substituted
for  "less  than"  values.   Such options  would  include deleting the sample,
filling in with zeros,  substituting the actual detection limit for the datum,
or filling in with a random number between zero and the detection limit based
on the  underlying distribution of  the data.  The  first three  methods are all
biased  in some way.   The last one,  on  the other  hand, requires  information
about  the distribution.  A simple  random number between  zero  and the
detection limit  is  sometimes used for substitution,  implying a  uniform
distribution.
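A sketch (hypothetical code, not from the report) of these substitution options:

```python
import random

# Illustrative substitution options for a value censored at the detection
# limit dl: zero, the detection limit itself, or a uniform random draw
# between zero and dl (which implies a uniform distribution).
def substitute_censored(value, dl, method="uniform", rng=random):
    if value is not None:            # an actual measured value
        return value
    if method == "zero":
        return 0.0
    if method == "dl":
        return dl
    return rng.uniform(0.0, dl)      # assumes a uniform distribution

x = substitute_censored(None, 0.5, method="uniform")
```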

     Rank tests  can handle "less  than"  values  in some  cases  with less
difficulty than parametric tests.  If the  limit of detection is constant, all
"less than" values  for a particular constituent  are  considered  as "ties."
Alternatively, rank tests that  treat censored data explicitly have been de-
veloped (e.g., Gehan 1965).   Note that detection limit values for different
constituents are handled individually.

-------
     E.   Flow Adjustments

     Caution must  be exercised when interpreting trends  found to be sig-
nificant by  any  of the previously described statistical  procedures,  espe-
cially when  the  measurements  used are specific constituent concentrations.
It is  common knowledge that for most constituents,  concentrations change as
the flow changes, which in  turn  introduces considerable variability into the
measurements.  Flow  conditions  can  vary naturally due to climatic factors,
and artificially due to stream regulation and manipulation by man.

     One way to  correct for changing flow is to determine the relationship
between flow and concentration  of the considered constituent.  However, no
uniform equation exists  since this  relationship may vary from site to site
and from constituent to constituent.  Hirsch et al. (1982) suggested some
nonlinear equations  characterizing  relationships  between  concentrations and
flow in cases  where the increased discharge of a constituent is due to pre-
cipitation,  snowmelt, or reservoir release.   In another case, quadratic equa-
tions  are proposed  to relate concentrations and flow  when  the constituent
load may  increase dramatically  with  an increase in discharge because of
runoff during a storm event.

     When the  effect of increased discharge  is a simple dilution  effect, the
relationship between concentration  and  discharge  can  be characterized by

     X = λ₁ + λ₂/Q , or

     X = λ₁ + [λ₂/(1 + λ₃Q)] , for example,

where X is the concentration, Q is the discharge flow, the coefficients
λ₁ and λ₂ are equal to or greater than zero, and λ₃ is greater than zero.
Generally  the coefficients in these equations  can  be estimated via  least
squares methods (e.g., regression analysis).
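Because the first equation is linear in 1/Q, ordinary least squares applies directly after transforming the flow variable.  A minimal Python sketch (an illustration, not from the report) with made-up, noise-free data generated from λ₁ = 0.2 and λ₂ = 3.0:

```python
# Least-squares fit of the dilution relationship X = lam1 + lam2/Q by
# regressing X on Z = 1/Q.  Data below are synthetic and exact.
def fit_dilution(qs, xs):
    """Return (lam1, lam2) from a least-squares fit of X = lam1 + lam2/Q."""
    zs = [1.0 / q for q in qs]
    n = len(zs)
    zbar = sum(zs) / n
    xbar = sum(xs) / n
    sxz = sum((z - zbar) * (x - xbar) for z, x in zip(zs, xs))
    szz = sum((z - zbar) ** 2 for z in zs)
    lam2 = sxz / szz
    lam1 = xbar - lam2 * zbar
    return lam1, lam2

qs = [1.0, 2.0, 5.0, 10.0]
xs = [0.2 + 3.0 / q for q in qs]     # noise-free data: fit is recovered exactly
lam1, lam2 = fit_dilution(qs, xs)
```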

-------
     The sequence of procedures suggested by Hirsch et al. (1982) can be
summarized as follows:

     1.  First find the  best fitting relationship between flow and concen-
tration using regression methods.

     2.  Compute  the  series  of flow-adjusted concentrations whenever the
relationship determined in Step 1 is significant.

     3.  Apply the seasonal Kendall test for trend.

     4.  Compute the  magnitude  of the trend, if significant, using the sea-
sonal Kendall slope estimator.

     For an example of these procedures, the interested reader is referred to
Smith et al. (1982), who applied all three methods—seasonal  Kendall test for
trend, flow adjustment, and seasonal Kendall slope estimator—to measurements
of total phosphorus concentrations.
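Step 2 of the sequence, computing flow-adjusted concentrations, amounts to taking residuals from the fitted flow relationship.  A sketch, assuming the dilution form X = λ₁ + λ₂/Q has already been fitted (the coefficients and data below are illustrative only):

```python
# Flow adjustment as residuals from a previously fitted dilution curve
# X = lam1 + lam2/Q.  The coefficients and data here are made up.
def flow_adjusted(qs, xs, lam1, lam2):
    """Observed minus fitted concentration at each observed flow."""
    return [x - (lam1 + lam2 / q) for x, q in zip(xs, qs)]

resid = flow_adjusted([1.0, 2.0], [3.2, 1.8], lam1=0.2, lam2=3.0)
```

The seasonal Kendall test (Step 3) is then applied to these residuals rather than to the raw concentrations.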

-------
                               BIBLIOGRAPHY
ASTM Designation:  E178-75.  1975.  "Standard Recommended Practice for Dealing
With Outlying Observations."

Bell, Charles B., and  E.  P.  Smith.  1981.  Water Quality Trends:  Inference
for First-Order Autoregressive Schemes. Tech.  Rep.  6,  SIAM  Instit.  for Math.
in Soc., Biomath. Group, Univ. of Wash., Seattle.

Box, George E. P., and G. M. Jenkins.  1970.  Time Series Analysis.  Holden-
Day, San Francisco, Ca.

Box, George E.  P., and G. C. Tiao.  1975.  "Intervention Analysis with Appli-
cation to Economic and Environmental Problems."  J. American Statistical Assoc.,
Vol. 70, pp. 70-79.

Chatterjee, Samprit,  and B. Price.  1977.   Regression Analysis by Example.
John Wiley and  Sons, New York.

Draper, Norman  R., and H. Smith.   1981.  Applied Regression Analysis.  Second
Edition, John Wiley and Sons, Inc., New York.

Dykstra, Richard  L. and T. Robertson.  1983.   "On Testing Monotone Tendencies."
J. American Statistical Assoc., Vol. 78, pp. 342-350.

Farrell,  Robert L.   1980.  Methods for Classifying Changes in  Environmental
Conditions.  Tech.  Rep.  VRI-EPA7.4-FR80-1,  Vector Research,  Inc., Ann Arbor,
Mich.

Gehan, Edmund A.  1965.  "A Generalized Wilcoxon Test  for Comparing  Arbitrarily
Singly Censored Samples."  Biometrika, Vol. 52, pp. 203-223.

General  Accounting  Office.   1981.   Better  Monitoring  Techniques  are Needed
to Assess the Quality of Rivers and Streams.   Report CED-81-30, U.S. General
Accounting Office, Washington, D.C.


-------

Handbook of  Tables  for Probability and Statistics.  1966.  Edited by Beyer,
William H.   The Chemical Rubber Co.

Hirsch, Robert M., J. R. Slack, and R. A. Smith.  1982.  "Techniques of Trend
Analysis for Monthly Water Quality Data."  Water Resources Research, Vol. 18(1),
pp. 107-121.

Hollander,  Myles, and D. A. Wolfe.  1973.  Nonparametric Statistical Methods.
John Wiley and Sons, New York.

Jernigan, Robert W.,  and J. C. Turner.  Seasonal Trends in Unequally Spaced
Data:  Confidence Intervals for Spectral Estimates.  Submitted for publication.

Johnson, Norman L., and F. C. Leone.  1977.  2 Vol.  Statistics and Experi-
mental Design in Engineering and the Physical Sciences.  Second Edition, John
Wiley and Sons, Inc., New York.

Kendall, Maurice G.,  and W.  R.  Buckland.   1971.   A Dictionary of Statistical
Terms.  Third Edition.  Hafner Publishing Company,  Inc., New York.

Kendall, Maurice G., and A. Stuart.  1966.  The Advanced Theory of Statistics.
Volume 3.  Hafner Publ. Co., New York, pp. 342.

Kendall, Maurice G.   1975.  Rank Correlation Methods.  Charles Griffin, London.

Langley, Russell A.   1971.  Practical Statistics  Simply  Explained.   Second
Edition, Dover Publications, Inc., New York.

Lehmann, Erich L.  1975.  Nonparametrics:  Statistical Methods Based on Ranks.
Holden-Day, San Francisco.

-------
Lettenmaier, Dennis P.  1976.  "Detection of Trends in Stream Quality:  Moni-
toring Network Design and Data Analysis." Tech. Rep. 51, Harris Hydraul. Lab.,
Dept. of Civil. Eng., Univ. of Wash., Seattle.

Lettenmaier, Dennis P.  1976.  "Detection of Trends in Water Quality Data from
Records With Dependent Observations."  Water Resources Research, Vol. 12(5),
pp. 1037-1046.

Mann, Henry B.   1945.   "Nonparametric Tests Against Trend."   Econometrica,
Vol. 13, pp. 245-259.

Sen, Pranab K.   1968.   "On A Class of Aligned Rank Order Tests  in Two-Way
Layouts."  Annals of Mathematical Statistics, Vol. 39, pp. 1115-1124.

Schlicht, Ekkehart.   1981.   "A Seasonal  Adjustment Principle and a Seasonal
Adjustment  Method  Derived from  this  Principle."  Journal of the American
Statistical Association, Vol. 76, pp. 374-378.

Siegel, Sidney.  1956.  Nonparametric Statistics for the Behavioral Sciences.
McGraw-Hill, New York.

Smith,  Richard A.,  R.  M.  Hirsch, and J.  R. Slack.  1982.  A Study of Trends
in Total Phosphorus Measurements at NASQAN Stations.  U.S. Geological  Survey
Water-Supply Paper 2190.

Snedecor, George W.,  and W.  G. Cochran.  1980.   Statistical  Methods.  Seventh
Edition, the Iowa State University Press, Ames,  Iowa.

STORET User Handbook.  1980.   U.S.  Environmental Protection Agency, Office of
Water and Hazardous Materials.

van  Belle,  Gerald,  and J.  P. Hughes.  1982.   "Nonparametric Tests for Trend
in Water Quality."   SIMS  Technical Report No. 11, University of  Washington,
Seattle, Washington (to appear in Water Resources Research).

-------
van Belle,  Gerald,  and J. P. Hughes.  1983.  "Monitoring for Water Quality:
Fixed  Station  versus  Intensive  Surveys."  Journal Water Pollution Control
Federation.  Vol. 55, pp. 400-404.

Wallis, W. Allen, and H. V. Roberts.  1956.  Statistics:  A New Approach.
The Free Press, New York.

Statistical Program Packages:

BMDP Statistical Software, 1980, University of California Press
SAS:  Statistical Analysis System, SAS Institute, Inc.,
  SAS User's Guide:   Basics.  1982 Edition
  SAS User's Guide:   Statistics.  1982 Edition
  Box 8000, Cary, North Carolina
SPSS:   Statistical Package for the Social Sciences, 1982, McGraw-Hill.

-------
APPENDIX
     This appendix  is a collection of output examples using SAS and the data
used to  demonstrate various procedures in the preceding sections.  It is or-
ganized  in the  same fashion as the text, and the title in each output cor-
responds to the appropriate section and subsection.   The following procedures
are shown:

          1.   Regression analysis (IV-B)
          2.   Student's t-test (IV-C)
          3.   Multiple regression analysis (IV-D)
          4.   Kendall's tau test (V-B)
          5.   Wilcoxon rank sum test (V-C)

     A FORTRAN subroutine for the seasonal Kendall's test and trend magnitude
estimation is included in Smith et al.  (1982).

-------
                1.  REGRESSION ANALYSIS (IV-B)

   OPTIONS LINESIZE=100 NODATE ;
   DATA REGRESS;
   INPUT YEAR WQI @@ ;
   CARDS;
   1977 46 1978 52 1979 42 1980 44 1981 39 1982 45 1983 40
   TITLE REGRESSION ANALYSIS : WQI VERSUS TIME(YR);
   TITLE3 OUTPUT EXAMPLE FOR SECTION IV B;
   PROC REG ;
   MODEL WQI=YEAR ;
   OUTPUT OUT=RES
   PREDICTED=PRED
   RESIDUAL=RESID ;
   PROC PLOT DATA=RES;
   PLOT PRED*YEAR='P'
        WQI*YEAR='O'/OVERLAY ;
   PROC PRINT DATA=RES ;
   PROC UNIVARIATE DATA=RES PLOT NORMAL;
   VAR RESID;
   TITLE5 TEST OF NORMALITY FOR THE RESIDUALS;
   /*
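The least-squares estimates in the output that follows can be verified by hand; a short Python sketch (an illustration, not part of the original appendix) of the same fit to the seven (YEAR, WQI) pairs read in above:

```python
# Least-squares fit of WQI on YEAR for the seven data points read in by
# the DATA step above; the estimates match the printed SAS output
# (intercept 2519, slope -1.25).
years = [1977, 1978, 1979, 1980, 1981, 1982, 1983]
wqi = [46, 52, 42, 44, 39, 45, 40]

n = len(years)
ybar = sum(years) / n
wbar = sum(wqi) / n
slope = (sum((y - ybar) * (w - wbar) for y, w in zip(years, wqi))
         / sum((y - ybar) ** 2 for y in years))
intercept = wbar - slope * ybar
```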

-------
                 REGRESSION ANALYSIS : WQI VERSUS TIME(YR)
                     OUTPUT EXAMPLE FOR SECTION IV B

DEP VARIABLE: WQI

                          SUM OF        MEAN
SOURCE         DF        SQUARES      SQUARE     F VALUE    PROB>F
MODEL           1      43.750000   43.750000       3.114    0.1379
ERROR           5      70.250000   14.050000
C TOTAL         6     114.000

ROOT MSE    3.748333     R-SQUARE    0.3838
DEP MEAN   44.000000     ADJ R-SQ    0.2605
C.V.        8.518939

                PARAMETER     STANDARD    T FOR H0:
VARIABLE   DF    ESTIMATE        ERROR   PARAMETER=0   PROB > |T|
INTERCEP    1    2519.000     1402.570        1.796        0.1324
YEAR        1   -1.250000     0.708368       -1.765        0.1379

-------
[Page 2 of output: PROC PLOT overlay of PRED*YEAR (symbol 'P') and
WQI*YEAR (symbol 'O'), followed by the PROC PRINT listing:]

     YEAR    WQI     PRED    RESID
     1977     46    47.75    -1.75
     1978     52    46.50     5.50
     1979     42    45.25    -3.25
     1980     44    44.00     0.00
     1981     39    42.75    -3.75
     1982     45    41.50     3.50
     1983     40    40.25    -0.25

-------
[Page 3 of output: PROC UNIVARIATE on the residuals (TEST OF NORMALITY
FOR THE RESIDUALS).  Key statistics:]

VARIABLE=RESID  (RESIDUALS)

N                   7          SUM WGTS            7
MEAN        1.787E-13          SUM         1.251E-12
STD DEV       3.42174          VARIANCE      11.7083
SKEWNESS     0.680336          KURTOSIS    -0.689156
W:NORMAL     0.923974          PROB<W          0.482

QUANTILES:  MIN -3.75   Q1 -3.25   MED -0.25   Q3 3.5   MAX 5.5

[A stem-and-leaf display, box plot, and normal probability plot follow;
the test does not reject normality of the residuals (PROB<W = 0.482).]

-------
                2.  STUDENT'S T-TEST (IV-C)

 OPTIONS LINESIZE=100 NODATE ;
 DATA WATERQ;
 INPUT TIME $ CONC @@;
 LABEL CONC=CONCENTRATION;
 CARDS;
 B 99 B 111 B 74 B 123 B 71 B 75 B 59 B 85
 A 59 A 99 A 82 A 51 A 48 A 39 A 42 A 42 A 47 A 50
 TITLE T-TEST PROCEDURE : TOTAL CHROMIUM CONCENTRATIONS;
 TITLE3 OUTPUT EXAMPLE TO SECTION IV C ;
 PROC TTEST ;
 CLASS TIME;
 VAR CONC ;
 PROC UNIVARIATE PLOT NORMAL ;
 VAR CONC ;
 TITLE5 TEST OF NORMALITY AND NORMAL PROBABILITY PLOT ;
 /*
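The pooled ("equal variances") t statistic printed in the TTEST output can be reproduced directly.  In the Python sketch below (an illustration, not part of the original appendix), the fourth value of group A is taken as 51, since the scanned listing is damaged at that point and 51 is the value consistent with the printed group mean of 55.9:

```python
import math

# Two-sample pooled-variance t statistic for the chromium data; group A's
# fourth value is taken as 51 (OCR-damaged in the listing) so that the
# group mean equals the printed 55.9.
a = [59, 99, 82, 51, 48, 39, 42, 42, 47, 50]
b = [99, 111, 74, 123, 71, 75, 59, 85]

def mean(xs):
    return sum(xs) / len(xs)

def css(xs):
    """Corrected sum of squares about the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

na, nb = len(a), len(b)
sp2 = (css(a) + css(b)) / (na + nb - 2)        # pooled variance
t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
```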

-------
    T-TEST PROCEDURE : TOTAL CHROMIUM CONCENTRATIONS
            OUTPUT EXAMPLE TO SECTION IV C

                   TTEST PROCEDURE

VARIABLE: CONC   (CONCENTRATION)

TIME    N          MEAN       STD DEV    STD ERROR
A      10   55.90000000   19.49615347   6.16522506
B       8   87.12500000   21.95083793   7.76079318

VARIANCES        T      DF    PROB > |T|
UNEQUAL    -3.1503    14.2        0.0070
EQUAL      -3.1946    16.0        0.0056

FOR H0: VARIANCES ARE EQUAL, F' = 1.27 WITH 7 AND 9 DF   PROB > F' = 0.7234

-------
[Page 2 of output: PROC UNIVARIATE on CONC (TEST OF NORMALITY AND NORMAL
PROBABILITY PLOT).  Key statistics:]

VARIABLE=CONC  (CONCENTRATION)

N                  18          SUM              1256
MEAN          69.7778          VARIANCE      654.536
STD DEV       25.5839          KURTOSIS    -0.612432
SKEWNESS     0.653072
W:NORMAL     0.920973          PROB<W         0.1661

QUANTILES:  MIN 39   Q1 47.75   MED 65   Q3 88.5   MAX 123

[A stem-and-leaf display, box plot, and normal probability plot follow;
normality of the combined concentrations is not rejected (PROB<W = 0.1661).]
-------
                3. MULTIPLE REGRESSION ANALYSIS (IV-D)

 OPTIONS LINESIZE=100 NODATE ;
 DATA MULTIPLE;
 INPUT TIME CONC ;
 IF MOD(TIME,12)=1 THEN X1=1 ;ELSE X1=0;
 IF MOD(TIME,12)=2 THEN X2=1 ;ELSE X2=0;
 IF MOD(TIME,12)=3 THEN X3=1 ;ELSE X3=0;
 IF MOD(TIME,12)=4 THEN X4=1 ;ELSE X4=0;
 IF MOD(TIME,12)=5 THEN X5=1 ;ELSE X5=0;
 IF MOD(TIME,12)=6 THEN X6=1 ;ELSE X6=0;
 IF MOD(TIME,12)=7 THEN X7=1 ;ELSE X7=0;
 IF MOD(TIME,12)=8 THEN X8=1 ;ELSE X8=0;
 IF MOD(TIME,12)=9 THEN X9=1 ;ELSE X9=0;
 IF MOD(TIME,12)=10 THEN X10=1 ;ELSE X10=0;
 IF MOD(TIME,12)=11 THEN X11=1 ;ELSE X11=0;
 IF MOD(TIME,12)=0 THEN X12=1 ;ELSE X12=0;
 IF TIME LE 18 THEN C=0 ; ELSE C=1 ;
 CARDS;
 (data lines: the 72 observations listed by PROC PRINT on the following pages)
 TITLE MULTIPLE REGRESSION ANALYSIS;
 TITLE3 OUTPUT EXAMPLE TO SECTION IV D ;
 PROC PRINT ;
 FORMAT CONC 5.3 ;
 PROC REG ;
 MODEL CONC=X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 C TIME/NOINT ;
 PROC PLOT ; PLOT CONC*TIME=C/HAXIS=0 TO 72 BY 2 HREF=18;
 /*
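The DATA step logic above (monthly indicator variables from MOD(TIME,12) plus a step indicator C) can be sketched in Python; the function below is illustrative only, not part of the original appendix:

```python
# Illustrative re-creation (not SAS) of the design variables built above:
# twelve monthly indicators X1..X12 from MOD(TIME, 12), where a remainder
# of 0 corresponds to X12, plus a step indicator C for TIME > 18.
def design_row(time):
    month = time % 12                    # SAS: MOD(TIME, 12)
    xs = [1 if month == (m % 12) else 0 for m in range(1, 13)]
    c = 0 if time <= 18 else 1           # SAS: IF TIME LE 18 THEN C=0
    return xs, c

xs1, c1 = design_row(1)      # TIME=1: X1=1, before the step (C=0)
xs24, c24 = design_row(24)   # TIME=24: remainder 0, so X12=1; C=1
```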

-------
                    MULTIPLE REGRESSION ANALYSIS
                   OUTPUT EXAMPLE TO SECTION IV D

[PROC PRINT listing of the 72 observations: TIME, CONC, the monthly
indicators X1-X12, and the step indicator C (0 for TIME <= 18, 1 after).
The CONC values as printed, in order of TIME = 1, ..., 72:]

 0.704  0.636  0.288  0.576  0.422  0.198  0.414  0.057  0.028  0.505
 0.113  0.406  0.414  0.624  0.540  0.343  0.618  0.372  0.198  0.123
 0.139  0.010  0.117  0.265  0.256  0.366  0.342  0.233  0.487  0.141
 0.223  0.010  0.124  0.019  0.133  0.080  0.296  0.370  0.328  0.292
 0.284  0.333  0.230  0.031  0.089  0.054  0.085  0.086  0.216  0.182
 0.333  0.290  0.272  0.193  0.174  0.158  0.096  0.056  0.117  0.201
 0.283  0.320  0.234  0.272  0.145  0.189  0.095  0.075  0.032  0.046
 0.071  0.010
-------
[Regression output:]

DEP VARIABLE: CONC

                          SUM OF        MEAN
SOURCE         DF        SQUARES      SQUARE     F VALUE    PROB>F
MODEL          14       5.742797    0.410200      42.558    0.0001
ERROR          58       0.559039  0.009638597
U TOTAL        72       6.301836

ROOT MSE    0.098176     R-SQUARE    0.9113
DEP MEAN    0.240778     ADJ R-SQ    0.8914
C.V.       40.77468

NOTE: NO INTERCEPT TERM IS USED.  R-SQUARE IS REDEFINED.

                PARAMETER     STANDARD    T FOR H0:
VARIABLE   DF    ESTIMATE        ERROR   PARAMETER=0   PROB > |T|
X1          1    0.496496     0.044398       11.183       0.0001
X2          1    0.552580     0.044520       12.412       0.0001
X3          1    0.488331     0.044657       10.935       0.0001
X4          1    0.508082     0.044810       11.339       0.0001
X5          1    0.511333     0.044978       11.368       0.0001
X6          1    0.378917     0.045162        8.390       0.0001
X7          1    0.389222     0.046439        8.381       0.0001
X8          1    0.243806     0.046555        5.237       0.0001
X9          1    0.257057     0.046686        5.506       0.0001
X10         1    0.285308     0.046833        6.092       0.0001
X11         1    0.277559     0.046994        5.906       0.0001
X12         1    0.347477     0.047169        7.367       0.0001
C           1   -0.144327     0.040917       -3.527       0.0008
TIME        1  -0.0012509  0.0008483678      -1.474       0.1458
-------
[Final page of output: PLOT OF CONC*TIME, SYMBOL IS VALUE OF C, with a
vertical reference line at TIME = 18 marking the step change.]
-------
                   4.  KENDALL'S TAU TEST (V-B)

 OPTIONS LINESIZE=100 NODATE ;
 DATA WATERQ;
 INPUT MONTH WQI @@;
 ORD=_N_;
 CARDS;
 1 21 2 3 3 5 4 8 5 21 6 48 7 37 8 39 9 26 10 16 11 35 12 7
 TITLE KENDALL'S TAU TEST;
 TITLE3 OUTPUT EXAMPLE TO SECTION V B;
 PROC CORR DATA=WATERQ KENDALL;
 VAR MONTH WQI ;
 *                                                              ;
 *THE FOLLOWING STEPS ARE TO COMPUTE THE N*(N-1)/2 SLOPES       ;
 *THIS WILL ALLOW TO OBTAIN THE ESTIMATE OF THE TREND MAGNITUDE ;
 *IF THE TREND IS STATISTICALLY SIGNIFICANT                     ;
 *                                                              ;
 DATA DIFFS ; SET WATERQ;
 RETAIN X1-X50 -1
        Y1-Y50 -1;
 ARRAY XX (I) X1-X50;
 ARRAY YY
-------

-------
                              KENDALL'S TAU TEST

                        OUTPUT EXAMPLE TO SECTION V B

VARIABLE          MEAN        STD DEV         MEDIAN        MINIMUM        MAXIMUM
MONTH       6.50000000     3.60555128     6.50000000     1.00000000    12.00000000
WQI        22.16666667    15.02623968    21.00000000     3.00000000    48.00000000

          OBS    MEDIAN

           1     1.48571

   KENDALL TAU B CORRELATION COEFFICIENTS / PROB > |R| UNDER H0:RHO=0 / N = 12

                                    MONTH      WQI
                        MONTH     1.00000  0.19848
                                   0.0000   0.3716

                        WQI       0.19848  1.00000
                                   0.3716   0.0000
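The SAS comments above note that the trend magnitude is estimated from the N*(N-1)/2 pairwise slopes once the trend is significant. As a cross-check (not part of the original report), a short Python sketch using scipy reproduces both the Kendall tau-b value and the median pairwise slope printed above from the twelve listed WQI values:

```python
from itertools import combinations
from statistics import median

from scipy.stats import kendalltau

# Monthly water quality index (WQI) values from the example data listing.
month = list(range(1, 13))
wqi = [21, 3, 5, 8, 21, 48, 37, 39, 26, 16, 35, 7]

# Kendall's tau-b, as computed by PROC CORR ... KENDALL.
tau, p_value = kendalltau(month, wqi)

# Sen's estimator of trend magnitude: the median of the N*(N-1)/2
# pairwise slopes, mirroring the DATA DIFFS step in the SAS program.
slopes = [(wqi[j] - wqi[i]) / (month[j] - month[i])
          for i, j in combinations(range(len(month)), 2)]
sen_slope = median(slopes)

print(round(tau, 5), round(sen_slope, 5))   # prints: 0.19848 1.48571
```

These match the printed output (tau = 0.19848, median slope = 1.48571); with a two-sided p-value near 0.37, the trend is not significant at the 0.05 level, so the slope estimate would not ordinarily be reported.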

-------
                         5.  WILCOXON RANK SUM TEST (V-C)

 OPTIONS LINESIZE=100 NODATE;
 DATA WATERQ;
 INPUT TIME PERIOD $ CONC @@;
 LABEL CONC=CONCENTRATION;
 CARDS;
 1 B 99 2 B 111 3 B 74 4 B 123 5 B 71 6 B 75 7 B 59 8 B 85
 9 A 59 10 A 99 11 A 82 12 A 51 13 A 48 14 A 39 15 A 42 16 A 42 17 A 47 18 A 50
 PROC PLOT DATA=WATERQ;
 PLOT CONC*TIME=PERIOD/HREF=9;
 TITLE WILCOXON RANK SUM TEST (STEP TREND);
 TITLE3 EXAMPLE OUTPUT TO SECTION V C;
 PROC NPAR1WAY DATA=WATERQ WILCOXON;
 CLASS PERIOD;
 VAR CONC;

-------
                     WILCOXON RANK SUM TEST (STEP TREND)

                        EXAMPLE OUTPUT TO SECTION V C

     [Printer plot of CONC (CONCENTRATION) versus TIME; plotting symbol is
      the value of PERIOD, with a vertical reference line at TIME=9.]
-------
                     WILCOXON RANK SUM TEST (STEP TREND)

                        EXAMPLE OUTPUT TO SECTION V C

          ANALYSIS FOR VARIABLE CONC CLASSIFIED BY VARIABLE PERIOD

                     AVERAGE SCORES WERE USED FOR TIES

                        WILCOXON SCORES (RANK SUMS)

                         SUM OF    EXPECTED     STD DEV      MEAN
           LEVEL    N    SCORES    UNDER H0    UNDER H0     SCORE
           B        8    106.00       76.00       11.24     13.25
           A       10     65.00       95.00       11.24      6.50

                WILCOXON 2-SAMPLE TEST (NORMAL APPROXIMATION)
                (WITH CONTINUITY CORRECTION OF .5)
                S= 106.00     Z= 2.6252     PROB >|Z|=0.0087
                T-TEST APPROX. SIGNIFICANCE= .0177

                KRUSKAL-WALLIS TEST (CHI-SQUARE APPROXIMATION)
                CHISQ= 7.13    DF= 1    PROB > CHISQ=0.0076
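The PROC NPAR1WAY figures above can be reproduced step by step. The following Python sketch (not part of the original report; it assumes scipy is available) computes the rank sum S for period B, the tie-corrected standard deviation, the continuity-corrected Z, and the Kruskal-Wallis chi-square from the eighteen listed concentrations:

```python
import numpy as np
from scipy.stats import rankdata, norm, kruskal

# Concentrations for the "before" (B, months 1-8) and "after" (A, months
# 9-18) periods from the example data listing.
b = [99, 111, 74, 123, 71, 75, 59, 85]
a = [59, 99, 82, 51, 48, 39, 42, 42, 47, 50]

combined = np.array(b + a, dtype=float)
ranks = rankdata(combined)                 # average ranks for ties
n1, n2 = len(b), len(a)
n = n1 + n2

s = ranks[:n1].sum()                       # rank sum S for period B
expected = n1 * (n + 1) / 2                # E[S] under H0

# Variance of S under H0 with the usual correction for tied values.
_, counts = np.unique(combined, return_counts=True)
tie_term = (counts**3 - counts).sum() / (n * (n - 1))
var_s = n1 * n2 / 12 * (n + 1 - tie_term)
z = (s - expected - 0.5) / np.sqrt(var_s)  # continuity correction of 0.5
p = 2 * norm.sf(abs(z))                    # two-sided normal approximation

h, p_kw = kruskal(b, a)                    # chi-square approximation, 1 df
print(round(s, 2), round(z, 4), round(p, 4), round(h, 2))
```

This yields S = 106.00, Z = 2.6252, p = 0.0087, and a Kruskal-Wallis chi-square of 7.13 on 1 df, matching the SAS output: the before/after difference in concentration is significant well beyond the 0.01 level.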

-------