United States Environmental Protection Agency
Office of Water
Washington, D.C. 20460
841-F-93-009
September 1993

Paired Watershed Study Design
    INTRODUCTION
    The purpose of this fact sheet is to
    describe the paired watershed approach for
    conducting nonpoint source (NPS) water
    quality studies. The basic approach
    requires a minimum of two watersheds -
    control  and treatment - and two periods of
    study -  calibration and treatment.  The
    control  watershed accounts for year-to-year
    or seasonal climate variations, and the
    management practices remain the same
    during the  study.  The treatment watershed
    has a change in management at some point
    during the  study.  During the calibration
    period,  the two watersheds are treated
    identically  and paired water quality data
    are collected (Table 1).  Such paired data
    could be annual means  or totals, or for
    shorter  studies (<5 yr), the observations
    could be seasonal, monthly, weekly, or
    event-based.  During the treatment period,
    one watershed is treated with a best
    management practice (BMP) while the
    control  watershed remains in the original
    management (Table 1).  The treated
    watershed should be selected randomly by
    such means as a coin toss.  The reverse of
    this schedule is possible for certain BMPs;
    the treatment period could precede the
    calibration  period.  For example, the study
    could begin with two watersheds in two
    different treatments, such as "BMP" and
    "no BMP". Later both watersheds could
    be managed identically to calibrate them.
    Since no calibration exists before the
    treatment occurs, this reversed design is
    considered  risky.
Table 1. Schedule of BMP implementation.

                      Watershed
Period          Control      Treated
Calibration     no BMP       no BMP
Treatment       no BMP       BMP

                          The basis of the paired watershed approach
                          is that there is a quantifiable relationship
                          between paired water quality data for the
                          two watersheds, and that this relationship
                          is valid until a major change is made in
                          one of the watersheds.  At that time, a
                          new relationship will exist.  This basis
                          does not require that the quality of runoff
                          be statistically the same for the two
                          watersheds; but rather that the relationship
                          between paired observations of water
                          quality remains the same over time except
                          for the influence of the BMP.  Often, in
                          fact, the analysis of paired observations
                          indicates that the water quality is different
                          between the paired watersheds.  This
                          difference further substantiates the need to
                          use a paired watershed approach because
                          the technique does not assume that the two
                          watersheds are the same; it does assume
                          that the two watersheds respond in a
                          predictable manner together.

                          EXAMPLE
                          To illustrate the paired watershed
                          approach, data taken from a study in
                          Vermont will be used.  The purpose of the
                          study was to compare changes in field
                          runoff (cm) due to conversion of
                          conventional tillage  to conservation tillage.

-------
  Selection of Watersheds

  1. Watersheds should be similar in size, slope, location, soils, and land cover.
  2. Watersheds should be small enough to obtain uniform treatment over the entire watershed.
  3. Watershed outlets should have a stable channel and cross section for discharge monitoring, and should not
     leak at the outlet.
  4. Each watershed should be in the same land cover for a number of years prior to the study so that they are
     at a steady state.

  Advantages

  1. Climate and hydrologic differences over years are statistically controlled.
  2. Can attribute water quality changes to a treatment.
  3. Control watershed eliminates need to measure all components causing change.
  4. Watersheds need not be identical.
  5. Study can be completed in shorter time frame than trend studies.
  6. Cause-effect relationships can be indicated.

  Disadvantages

  1. Response to treatment is likely to be gradual over time, which influences the variance.
  2. Study vulnerable to catastrophes such as hurricanes.
  3. Shortened calibration may result in serially correlated data.
  4. Variances between time periods may not be equal due to drastic treatment.
  5. Minimal change in the control watershed is permitted.
  6. Requires similar watersheds in close proximity.

The west watershed was the control and
was 1.46 hectares (ha) in area.  The east
watershed was the treatment field and was
1.10 ha.  Conventional tillage was
moldboard plow whereas conservation
tillage was a single disk harrow.  The
calibration period was one year, during
which 49 paired observations of storm
runoff were made.  The treatment period
was three years, during which 114 paired
observations of runoff were made.  Data
were log-transformed to approach
normality based upon the Shapiro-Wilk
(W) statistic.  The equality of variances
between periods was tested using the
F-test.  Residual plots were examined to
check for independence of errors.  The
statistical package SAS was used for all
analyses.
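
The data preparation described above can be sketched in Python, which is an assumption of this example; the study itself used SAS. The snippet log-transforms runoff values and computes the variance ratio that an F-test compares against a table value. The function names and runoff values are hypothetical and are not taken from the Vermont study, and the normality test is not reproduced here.

```python
import math
import statistics


def log10_transform(values):
    """Return log10 of each value; runoff must be positive to be transformed."""
    if any(v <= 0 for v in values):
        raise ValueError("log10 is undefined for zero or negative values")
    return [math.log10(v) for v in values]


def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical storm runoff (cm) for one watershed in each period.
calibration_runoff = [0.004, 0.012, 0.0009, 0.03, 0.002]
treatment_runoff = [0.001, 0.006, 0.0004, 0.02, 0.0015, 0.003]

f_stat, df_num, df_den = variance_ratio(
    log10_transform(calibration_runoff), log10_transform(treatment_runoff)
)
print(f"F = {f_stat:.2f} with {df_num} and {df_den} df")
```

The computed ratio would then be compared with a tabulated F value at the chosen significance level for the reported degrees of freedom.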

CALIBRATION
The relationship between watersheds
during the calibration period is described
by a simple linear regression (Figure 1)
between the paired observations, taking the
form:

$$ \text{treated}_i = b_0 + b_1 \, \text{control}_i + e \tag{1} $$

where treated and control represent flow,
water quality concentration, or mass values
for the appropriate watershed, $b_0$ and $b_1$
are regression coefficients representing the
regression intercept and slope,
respectively, and e is the residual error.
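
As an illustration of Equation (1), the following Python sketch fits the calibration regression to log-transformed paired runoff by ordinary least squares. Python is an assumption of this example (the study used SAS), and the function name and the paired values are hypothetical rather than data from the study.

```python
import math


def fit_calibration(control, treated):
    """Least-squares fit of log10(treated) = b0 + b1 * log10(control)."""
    x = [math.log10(v) for v in control]
    y = [math.log10(v) for v in treated]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # regression slope
    b0 = mean_y - b1 * mean_x      # regression intercept
    return b0, b1


# Hypothetical paired storm runoff (cm): control first, treated second.
control_runoff = [0.002, 0.010, 0.0008, 0.05, 0.004, 0.020]
treated_runoff = [0.0001, 0.0009, 0.00005, 0.004, 0.0003, 0.0015]

b0, b1 = fit_calibration(control_runoff, treated_runoff)
print(f"log10(treated) = {b0:.3f} + {b1:.3f} * log10(control)")
```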

Three important questions must be
answered prior to shifting from the
calibration period to the treatment period:
a) is there a significant relationship
between the paired watersheds for all
parameters of interest, b) has the
calibration period continued for a sufficient
length of time, and c) are the residual
errors about the regression smaller than
the expected BMP effect?

Regression significance. The significance
of the relationship between paired
observations is tested using analysis of
variance (ANOVA).  The test assumes that
the regression residuals are normally
distributed, have equal variances between
treatments, and are independent.

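Hand calculations to test for the
significance of the relationship are shown
in Snedecor and Cochran (1980, p. 157)
(Table 2).  The values for Table 2 are
calculated from:

$$ S_y^2 = \sum Y_i^2 - \frac{\left(\sum Y_i\right)^2}{n} \tag{2} $$

$$ S_x^2 = \sum X_i^2 - \frac{\left(\sum X_i\right)^2}{n} \tag{3} $$

$$ S_{xy} = \sum X_i Y_i - \frac{\left(\sum X_i\right)\left(\sum Y_i\right)}{n} \tag{4} $$

$$ S_{yx}^2 = \frac{S_y^2 - (S_{xy})^2 / S_x^2}{n - 2} \tag{5} $$

where $X_i$ and $Y_i$ are the paired control and
treated observations and $n$ is the number of pairs.
Also, the regression coefficients and
coefficient of determination are determined
from:

$$ b_1 = \frac{S_{xy}}{S_x^2} \tag{6} $$

$$ b_0 = \bar{Y} - b_1 \bar{X} \tag{7} $$

$$ r^2 = \frac{(S_{xy})^2}{S_x^2 \, S_y^2} \tag{8} $$
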
Table 2. Analysis of variance for linear regression.

Source        Degrees of freedom    Sum of squares            Mean squares
regression    1                     (Sxy)^2/Sx^2              SS/df
residual      n-2                   Sy^2 - (Sxy)^2/Sx^2       SS/df (Eq. 5)
total         n-1                   Sy^2

In order to perform the calculations by
hand, initially calculate $\sum X_i$, $\sum Y_i$, $\sum X_i Y_i$,
$\sum X_i^2$, $\sum Y_i^2$, $\bar{X}$, and $\bar{Y}$.  The mean squares
(MS) are determined by dividing the sum
of squares by the degrees of freedom (df).

For the example above, the following was
calculated by hand: $\sum X_i$ = -123.403, $\sum Y_i$
= -180.704, $\sum X_i Y_i$ = 533.553, $\sum X_i^2$ =
381.713, $\sum Y_i^2$ = 814.847, $\bar{X}$ = -2.518
($10^{\bar{X}}$ = 0.003041 cm), and $\bar{Y}$ = -3.688
($10^{\bar{Y}}$ = 0.000205 cm). Therefore, $S_y^2$ =
148.441, $S_{xy}$ = 78.463, $S_x^2$ = 70.933,
and $S_{yx}^2$ = 1.312.  Using SAS, the
appropriate program is listed below.  This
program was used to generate Table 3.
Table 3. Analysis of variance for
regression of treatment watershed runoff
on control watershed runoff.

Source    df    MS       F        p
model     1     86.79    66.17    0.0001
error     47    1.31
total     48
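
As a cross-check on the hand calculation, the sums reported for the calibration period can be run through Equations (2) to (8) in a few lines of Python. This is a sketch under stated assumptions: Python is not the tool used in the study, the function name is hypothetical, and the equation numbers in the comments follow the hand-calculation formulas as they are used in Tables 2 and 4.

```python
def regression_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Hand-calculation quantities for a simple linear regression of Y on X."""
    sy2 = sum_y2 - sum_y ** 2 / n          # Eq. (2)
    sx2 = sum_x2 - sum_x ** 2 / n          # Eq. (3)
    sxy = sum_xy - sum_x * sum_y / n       # Eq. (4)
    ss_regression = sxy ** 2 / sx2         # regression sum of squares, 1 df
    ss_residual = sy2 - ss_regression      # residual sum of squares, n - 2 df
    s2yx = ss_residual / (n - 2)           # Eq. (5), residual mean square
    b1 = sxy / sx2                         # Eq. (6)
    b0 = sum_y / n - b1 * sum_x / n        # Eq. (7)
    r2 = sxy ** 2 / (sx2 * sy2)            # Eq. (8)
    return {
        "Sy2": sy2, "Sx2": sx2, "Sxy": sxy, "S2yx": s2yx,
        "b0": b0, "b1": b1, "r2": r2,
        "MS_regression": ss_regression,    # SS divided by 1 df
        "F": ss_regression / s2yx,
    }


# Calibration-period sums reported in the text (log10 of runoff in cm).
calibration = regression_from_sums(
    n=49, sum_x=-123.403, sum_y=-180.704,
    sum_xy=533.553, sum_x2=381.713, sum_y2=814.847,
)
for name, value in calibration.items():
    print(f"{name:>13} = {value:.3f}")
```

The printed regression mean square, residual mean square, and F should agree with Table 3 to within rounding of the reported sums.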


  SAS PC Program

  data flow;
    title 'Total Flow (cm)';
    infile 'fname.dat';
    input flow1 flow2;
    /* log-transform both runoff series */
    logflow1=log10(flow1);
    logflow2=log10(flow2);
  /* regress treated (flow2) on control (flow1) */
  proc reg;
    model logflow2=logflow1 / p clm;
  run;

The resulting F statistic for this example
indicates that the regression
relationship explains a
significant amount (p < 0.001) of the
variation in paired flow data.

Calibration duration.  The ratio between
the residual variance (mean squares)
for the regression and the smallest
worthwhile difference (d) is used to
determine if a sufficient sample
has been taken to detect that difference,
from:

$$ \frac{S_{yx}^2}{d^2} = \frac{n_1 n_2}{n_1 + n_2} \cdot \frac{1}{F \left( 1 + \frac{F}{n_1 + n_2 - 2} \right)} \tag{9} $$

where $S_{yx}^2$ is the estimated residual
variance about the regression, $d^2$ is the
square of the smallest worthwhile
difference, $n_1$ and $n_2$ are the numbers of
observations in the calibration and
treatment periods ($n_1 = n_2$ for this
calculation because $n_2$ is not known yet),
and F is the table value (p = 0.05) for the
variance ratio at 1 and $n_1 + n_2 - 3$ df.
The difference (d) is selected based on
experience and would vary with project
expectations. If the left side of the
equation is greater than the right side,
then too few samples have been taken
to detect the
difference.  For the example, $S_{yx}^2$ was
1.312 (from Table 3), $n_1 = n_2$ was 49, and
F was 3.94. A ten percent change from the
mean was considered a worthwhile
difference; therefore, $d = 0.10 \times \bar{X}$ =
0.10 * log(0.003041 cm) and $S_{yx}^2 / d^2$ =
20.7.  The right side of Equation (9) =
6.0; since 20.7 is greater than 6.0, there
were too few samples to detect a ten percent change.
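
The calibration-duration check can be reproduced with a short Python sketch. Python and the function name are assumptions of this example; the right-hand side is written as n1*n2/(n1+n2) divided by F*(1 + F/(n1+n2-2)), a form that returns the 6.0 reported in the text for the example inputs.

```python
import math


def calibration_check(s2yx, d, n1, n2, f_table):
    """Return (left, right, sufficient) for the calibration-duration test."""
    left = s2yx / d ** 2
    right = (n1 * n2 / (n1 + n2)) / (f_table * (1 + f_table / (n1 + n2 - 2)))
    # Sampling is sufficient only when the left side does not exceed the right.
    return left, right, left <= right


# Example values from the text: ten percent of the mean log10 control runoff.
d = 0.10 * math.log10(0.003041)
left, right, sufficient = calibration_check(
    s2yx=1.312, d=d, n1=49, n2=49, f_table=3.94
)
print(f"left = {left:.1f}, right = {right:.1f}, sufficient = {sufficient}")
```

Because d enters as a square, a larger worthwhile difference shrinks the left side quickly, so the same data may suffice for detecting a larger change.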

-------
d
H
ID
fa g15^
to
    10'-
id 
-------
Table 4. Analysis of covariance for comparing regression lines.

Source                   df        Sx^2     Sxy      Sy^2     b1       df        SS                       MS       F
Within
  Calibration            n1-1      Eq.(3)   Eq.(4)   Eq.(2)   Eq.(6)   n1-2      Sy^2 - (Sxy)^2/Sx^2      Eq.(5)
  Treatment              n2-1      Eq.(3)   Eq.(4)   Eq.(2)   Eq.(6)   n2-2      Sy^2 - (Sxy)^2/Sx^2      Eq.(5)
  Pooled error                                                         Σ         Σ                        SS/df
Slopes                   n1+n2-2   Σ        Σ        Σ        Eq.(6)   n1+n2-3   Sy^2 - (Sxy)^2/Sx^2      Eq.(5)
  Slope difference                                                     1         Slope SS - Error SS      SS/df    MS/Error MS
Intercepts
  (combined data)        n1+n2-1                                       n1+n2-2   Sy^2 - (Sxy)^2/Sx^2
  Intercept difference                                                 1         Combined SS - Slope SS   SS/df    MS/Slope MS

The analysis of covariance (ANCOVA) can be computed by hand as
shown in Table 4 (Snedecor and Cochran,
1980, p. 386).  In order to perform the
calculations by hand, the following are
determined for the example treatment data:
$\sum X_i$ = -358.14, $\sum Y_i$ = -416.05, $\sum X_i Y_i$ =
1408.37, $\sum X_i^2$ = 1352.54, $\sum Y_i^2$ =
1653.43, $\bar{X}$ = -3.1416, $\bar{Y}$ = -3.650, and n
= 114. Therefore, $S_y^2$ = 135.00, $S_{xy}$ =
101.32, and $S_x^2$ = 227.43.  The ANCOVA
is completed for the example in Table 5.
The summation symbol (Σ) in Table 4 is
used to signify the addition of the column
entries above it.

Since the slopes were found to be
different, the differences in intercepts do
not have any real meaning and do not need
to be calculated.  That is, if slopes are
different, intercepts will usually be
different.  However,  the calculation for the
test of intercepts is presented to show the
method. The combined data are
determined by summing the $\sum X_i$, $\sum Y_i$,
$\sum X_i Y_i$, $\sum X_i^2$, and $\sum Y_i^2$ values for both the
calibration and treatment periods and
calculating new values for $S_y^2$, $S_{xy}$, and $S_x^2$.
The calculation of F for the intercept uses
the slope MS in the denominator.  The F
for the slope test uses the error MS in the
denominator. A significant difference in
intercepts but not slopes indicates an
overall parallel shift in the regression
equation.
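
The hand comparison of regression lines can be sketched in Python as a check on the arithmetic. Python and the function names are assumptions of this example; the inputs are the Sx^2, Sxy, and Sy^2 values reported for the calibration period, the treatment period, and the combined data.

```python
def residual_ss(sx2, sxy, sy2):
    """Sum of squares of deviations from regression: Sy^2 - (Sxy)^2 / Sx^2."""
    return sy2 - sxy ** 2 / sx2


def compare_regressions(n1, cal, n2, trt, combined):
    """cal, trt, combined are (Sx2, Sxy, Sy2) tuples; returns (F_slope, F_intercept)."""
    error_ss = residual_ss(*cal) + residual_ss(*trt)      # pooled error
    error_ms = error_ss / (n1 + n2 - 4)
    pooled = tuple(c + t for c, t in zip(cal, trt))       # column sums
    slope_ss = residual_ss(*pooled)                       # common-slope fit
    slope_ms = slope_ss / (n1 + n2 - 3)
    f_slope = (slope_ss - error_ss) / error_ms            # error MS in denominator
    f_intercept = (residual_ss(*combined) - slope_ss) / slope_ms  # slope MS in denominator
    return f_slope, f_intercept


f_slope, f_intercept = compare_regressions(
    n1=49, cal=(70.933, 78.463, 148.441),
    n2=114, trt=(227.430, 101.315, 135.000),
    combined=(311.671, 178.762, 283.492),
)
print(f"F (slopes) = {f_slope:.2f}, F (intercepts) = {f_intercept:.2f}")
```

Each F statistic has 1 numerator degree of freedom; the denominators have n1+n2-4 and n1+n2-3 degrees of freedom, respectively.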

Using SAS, an example program is listed
below.  This program contains both a test
of the treatment regression in the PROC
REG statement and a test comparing the
regression lines in the PROC GLM
statement.
  SAS PC Program

  proc reg;
    model logflow2=logflow1;
  run;
  proc glm;
    class period;
    model logflow2=logflow1 period
          logflow1*period;
  run;

The treatment period regression was found
to be significant based on the analysis of
variance for regression (Table 7).

Table 5. Example analysis of covariance for comparing regression lines.

Source                   df     Sx^2      Sxy       Sy^2      b1      df     SS        MS       F
Within
  Calibration            48     70.933    78.463    148.441   1.106   47     61.650    1.3117
  Treatment              113    227.430   101.315   135.000   0.445   112    89.866    0.8024
  Error                                                               159    151.516   0.9529
Slopes                   161    298.363   179.778   283.441   0.603   160    175.116   1.0945
  Slope difference                                                    1      23.600    23.600   24.77***
Intercepts               162    311.671   178.762   283.492           161    180.961
  Intercept difference                                                1      5.8453    5.8453   5.34*

  *** indicates significance at p=0.001
  *   indicates significance at p=0.05

Table 7. ANOVA for regression of treatment
watershed runoff on control watershed runoff for
the treatment period.

Source    df     MS       F        p
model     1      45.13    56.25    0.0001
error     112    0.80
total     113


Table 8. ANCOVA for comparing calibration and
treatment regressions.

Source       df     MS        F         p
model        3      43.99     46.17     0.001
error        159    0.95
overall      1      103.09    108.18    0.0001
intercept    1      5.47      5.74      0.0178
slope        1      23.42     24.58     0.0001

The analysis of covariance in the
SAS output summarizes the significance of
the overall model and compares the two
regression equations, their
intercepts, and their slopes (Table 8). The
ANCOVA indicates that the overall
treatment and calibration regressions were
significantly different, and that the slopes
and intercepts of the equations also were
different. The difference in slopes is
evident in Figure 2. The slight differences
in F values between the hand calculation
method and the SAS output are due to
rounding errors.

DISPLAYING AND INTERPRETING
RESULTS
The most common methods for displaying
the results include a bivariate plot of
paired observations together with the
calibration and treatment regression
equations (Figure 2).  Another useful
graph is a plot of deviations (Y observed -
Y predicted) as a function of time during the
treatment period. The predicted values are
obtained from the calibration regression
equation. For the example, the plot of
deviations indicates that for most paired
observations, the observed value was less
than that predicted by the calibration
regression equation.  Mean values should be
reported for each period
and each watershed.  The overall results
due to the treatment can be expressed as
the percent change based on the mean predicted
and observed values.  For the example,
there was a 64% reduction in mean runoff
due to the treatment (Table 9).
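
The two summaries described above, deviations from the calibration prediction and percent change in the mean, can be sketched in Python. Python and the function names are assumptions of this example; the percent-change call uses the treatment-period observed and predicted means from Table 9.

```python
def deviations(control_log, treated_log, b0, b1):
    """Observed minus predicted values, using the calibration regression."""
    return [y - (b0 + b1 * x) for x, y in zip(control_log, treated_log)]


def percent_change(observed_mean, predicted_mean):
    """Change of the observed mean relative to the predicted mean, in percent."""
    return 100.0 * (observed_mean - predicted_mean) / predicted_mean


# Treatment-period means from Table 9 (runoff in cm x 10^-2).
change = percent_change(observed_mean=0.04, predicted_mean=0.11)
print(f"{change:.0f}%")  # -64%
```

Negative deviations mean the treated watershed yielded less than the calibration relationship predicts for the same control-watershed observation.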


Figure 3. Observed deviations from predicted discharge.

Table 9.  Mean values by period and watershed.

Period and watershed        Runoff (cm) x 10^-2
Calibration
  Control                   0.30
  Treatment                 1.63
Treatment
  Control                   0.08
  Treatment                 0.04
  Predicted                 0.11
  Change                    -64%

FURTHER READING

Bernstein, B.B. 1983. An optimum
sampling design and power tests for
environmental biologists. J. Environ.
Mgmt. 16:35-43.

Hewlett, J.D.  and L. Pienaar. 1973.
Design and analysis of the catchment
experiment. In Proc. Symp. Use of Small
Watersheds. E.H. White (Ed.). Univ.
Kentucky.

Green, R. H.  1979.  Sampling design and
statistical methods for environmental
biologists.  New York: John Wiley and
Sons.

Kovner, J.L. and T.C. Evans.  1954. A
method for determining the minimum
duration of watershed experiments. Trans.
AGU. 35(4):608-612.

Ponce, S.L. 1980. Statistical methods
commonly used in water quality data
analysis. WSDG Technical Paper WSDG-
TP-00001. USDA Forest Service.  Fort
Collins, CO 80524.

Reinhart, K.G.  1967. Watershed
calibration methods. In Proc. Intern.
Symp. on Forest Hydrology. W.E. Sopper
and H.W. Lull. (Eds.) Pergamon Press.
Oxford. p. 715-723.

SAS Institute, Inc. 1986. SAS  system for
linear models. Cary, NC 27511.

Snedecor, G.W. and W.G.  Cochran. 1980.
Statistical methods. 7th Ed. The Iowa State
University Press. Ames, Iowa.

Wilm, H.G. 1949. How long should
experimental watersheds be calibrated?
Amer. Geophys. Union Trans. Part II.
618-622.

ACKNOWLEDGEMENT

This project was supported by U.S. EPA Office of
Wetlands, Oceans, and Watersheds under EPA
Contract No. 68-C9-0013. This fact sheet was
prepared by Dr. John C. Clausen, University of
Connecticut and Dr. Jean Spooner, North Carolina
State University, and reviewed by Mr. Steve
Dressing, U.S. EPA.