United States
              Environmental Protection
              Agency
              Environmental Research
              Laboratory
              Duluth MN 55804
EPA 600 3-80 055
June 1980
              Research and Development
vvEPA
Sampling Strategies for
Water Quality in the
Great Lakes
                  LIBRARY

-------
                RESEARCH REPORTING SERIES

Research reports of the Office of Research and Development, U S Environmental
Protection Agency, have been grouped into nine series These nine broad cate-
gories were established to facilitate further development and application of en-
vironmental technology  Elimination of traditional  grouping  was consciously
planned to foster technology transfer and a maximum interface in related fields
The nine series are

      1   Environmental  Health  Effects Research
      2   Environmental  Protection Technology
      3   Ecological Research
      4   Environmental  Monitoring
      5   Socioeconomic Environmental Studies
      6   Scientific and Technical Assessment Reports (STAR)
      7   Interagency Energy-Environment Research and Development
      8.  "Special" Reports
      9   Miscellaneous Reports

This  report has been assigned to the ENVIRONMENTAL MONITORING series
This  series describes research conducted to develop new or improved methods
and  instrumentation for the identification and quantification of environmental
pollutants at the lowest conceivably significant concentrations It also includes
studies to determine the ambient concentrations of pollutants in the environment
and/or the variance of pollutants as a function of time or meteorological factors.
This document is available to the public through the National Technical Informa-
tion Service, Springfield, Virginia  22161

-------
                                           EPA-600/3-80-055
                                           June 1980
     SAMPLING STRATEGIES FOR WATER
       QUALITY IN THE GREAT LAKES
                   by
  Raymond P. Canale, Leon M. DePalma,
         and William F. Powers
    Civil and Aerospace Engineering
         University of Michigan
       Ann Arbor, Michigan 48109
         Grant No. R803754-02-1
            Project Officer

             David M.  Do Ian
      Large Lakes Research Station
Environmental  Research Laboratory-Duluth
       Grosse  lie, Michigan 48138
   ENVIRONMENTAL RESEARCH LABORATORY
   OFFICE OF RESEARCH AND DEVELOPMENT
  U.S.  ENVIRONMENTAL PROTECTION AGENCY
        DULUTH,  MINNESOTA 55804
          1
          U.S. mL<0«t»TAL
          EDTSOI, 8, Jo  OB817

-------
                                DISCLAIMER
     This report has been reviewed by the Environmental  Research Laboratory
Duluth, U.S. Environmental Protection Agency, and approved for publication.
Approval does not signify that the contents necessarily reflect the views
and policies of the U.S. Environmental  Protection Agency, nor does mention
of trade names or commercial products constitute endorsement or recommenda-
tion for use.

-------
                                  FOREWORD
    One of the goals of the International Surveillance Program for the Great
Lakes is to detect long-term changes in the concentration of various para-
meters in each of the lakes.  In order for this program to be successful,
data of sufficient quality and quantity must be provided so that trends can
be accurately assessed.  However, each additional sample collected increases
the cost of the program.  There is, therefore, a trade-off between accuracy
and cost and a sampling strategy must be identified that will obtain the
necessary accuracy at the minimum possible cost.

    This report is the result of an effort to use basic ideas from control
theory and statistics to develop a methodology for identifying optimum
sampling strategies.  Besides the applications included in this report, this
research effort has stimulated further work on different parameters and
space and time scales.
                                     David M. Dolan
                                     Environmental Scientist
                                     Environmental Research Lab-Duluth
                                     Large Lakes Research Station

-------
                                  ABSTRACT
    The major goal of this project was to investigate the potential applica-
tions of Kalman filtering and modern optimization techniques to the design
of sampling strategies for water quality in the Great Lakes.  Two represent-
ative problems of general limnological interest with considerably different
characteristics were studied.  The first problem concerns sampling require-
ments for long-term trend detection of chloride and total phosphorus concen-
trations in Lake Michigan.  A minimum-cost sampling strategy has been deter-
mined in association with the Kalman filter and mathematical models that can
detect changes of 1.0 mg Cl/£ and 1.0 yg P/£ in Lake Michigan with 90 per-
cent probability.  The cost of this sampling program is considerably less
than monitoring programs designed without benefit of mass balance models and
Kalman filter data processing.  An investigation was conducted to determine
the sensitivity of the optimal cost to the accuracy of the mass balance
model parameters.  It is concluded that the calculated sampling cost is most
sensitive to the variance in the apparent phosphorus settling velocity, not
as sensitive to the accuracy of laboratory analyses, and rather insensi-
tive to the accuracy of the effective flow at the Straits of Mackinac and
atmospheric loads.

    The second problem concerns the potential applicability of Kalman fil-
tering techniques for concentration estimation and parameter identification
for a phytoplankton and nutrient cycling model for Saginaw Bay.  The ap-
proach has been used to obtain estimates of the accuracy of model-predicted
phytoplankton concentrations in Saginaw Bay as a function of the uncertainty
associated with model parameters and initial conditions.  It has been deter-
mined that the accuracy of uncalibrated model predictions is limited and de-
pends on the location within the bay.  However, the accuracy of the predic-
tions can be improved if the uncertainties of the model parameters and ini-
tial conditions can be reduced by laboratory experiments or the availability
of field data.  Examples are presented that show how suboptimal sampling
strategies can be developed and used to increase the predictive capabilities
of uncalibrated phytoplankton models in Saginaw Bay.

    This report is submitted in fulfillment of Grant No. R803754-02-1 by the
University of Michigan under sponsorship of the U.S. Environmental Protec-
tion Agency.

-------
                                  CONTENTS
Foreword	i i i
Abstract	    iv
Figures	vii
Tables	viii
Acknowledgment   	     ix

    1.  Introduction  	     1
    2.  Conclusions   	     3
    3.  Recommendations   	     4
    4.  Introduction  to Kalman Filtering  	     5
             Introduction   	     5
             Kalman filter equations  	     5
             A simple Kalman filter example   	     8
             Summary	    10
    5.  An Application to Lake Michigan Chloride and Total
        Phosphorus Trend  Detection  	    13
             Introduction   	    13
             Derivation of the stochastic model  	    14
             Design of measurement  strategies   	    26
             Summary	    33
    6.  An Application to Annual Plankton and Nutrient Cycles  in
        Saginaw Bay	    34
             Introduction   	    34
             Saginaw  Bay  nutrient-plankton model 	    35
             Error propagation without measurement  	    38
             Suboptimal measurement strategy generation   	    43
             Summary	    57

References	    59
Appendices

    A.  Discrete Kalman Filter and Optimization of Measurement
        Strategies	    62
             Introduction   	    62
             Discrete Kalman filter   	    62
             Measurement  strategy optimization  	    65
             Application  to nonlinear models 	    67
             Summary of the Kalman filter and measurement
             Strategy optimization  	    69

-------
B.   Optimal Measurement Strategy:  Overview of Computer
    Program	    71
         Description	    71
         Problem formulation 	    72
         Dimensions	    73
         Program operation 	    75

-------
                                  FIGURES
Number                                                                  Page

  1       Changes in estimates of the state variable and variance
           according to Kalman filtering technique 	    11

  2       Overall summary of Kalman filter technique   	    12

  3       Saginaw Bay model segments, sampling stations, and flow
           pattern	    36

  4       Saginaw Bay model biochemical reactions and forcing
           functions	    37

  5       Nominal states for model segment 5  	    40

  6       Nominal state and standard deviation for chlorophyll a
           for full and zero measurement cases	    45

  7       Nominal state and standard deviation for herbivorous
           zooplankton for full  and zero measurement cases	    46

  8       Nominal state and standard deviation for phosphate
           for full and zero measurement cases	    47

  9       Nominal state and standard deviation for nitrate for
           full  and zero measurement cases	    48

 10       Parameter estimates and standard deviation for the phyto-
           plankton growth rate  for full and zero measurement
           cases	    49

 11       Parameter estimates and standard deviations for the zoo-
           plankton grazing rates for full and zero measurement-
           cases 	    50

 12       Parameter estimates and standard deviations for the zoo-
           plankton respiration  rate for full and zero measurement
           cases	    51

 13       Parameter estimates and standard deviations for the phos-
           phorus-chlorophyll ration for full and zero measurement
           cases	    52


                                     vii

-------
                                   TABLES


Number                                                                   Page
 • ...-.-,.                                                                    	?

  1       Summary of GLRD Data	    16

  2       Apparent Settling Velocity for Small Lakes   	    18

  3       A Summary of Parameters, Constants, and Initial States   ...    24

  4       Required Number of Sampling Stations and Costs for Con-
           servative Formulation of Tributary Load Error 	    28

  5       Required Number of Sampling Stations and Cost for More
           Realistic Formulation of Tributary Load Error 	    29

  6       Optimal Measurement Strategy and Cost for Detection
           Every Year	    32

  7       Estimated Means and Standard Deviations for Model
           Parameters	    39

  8       Calculated Mean and Standard Deviations at the Peak
           Populations for Zero Measurements 	    42

  9       Calculated Standard Deviations of Parameter Values for
           Various Measurement Strategies  	    54

 10       Calculated Standard Deviations of Peak Populations for
           Various Measurement Strategies  	    56
                                     vm

-------
                              ACKNOWLEDGMENTS
    The authors would like to acknowledge the assistance of several  indi-
viduals who were involved in this study.  Mr. Daniel Scharf from the Univer-
sity of Michigan expanded the computer program used in Section 5 and helped
perform the error propagation calculations in Section 6.  Mr. Christopher
Hoogendyk also from the University of Michigan modified the USEPA phyto-
plankton model for Saginaw Bay so that it could be run on the Michigan
Terminal System.  Mr. Steven Chapra from the Great Lakes Environmental Re-
search Laboratory provided some valuable suggestions with regard to the
formulation of the model for Lake Michigan in Section 5.  Dr. Curtiss Davis
from the Great Lakes Research Division (University of Michigan) contributed
information concerning the accuracy of laboratory techniques for total phos-
phorus and chloride concentrations.  Mr. William Richardson from the USEPA
provided cost estimates of the sampling strategies in Section 5.  Finally,
special acknowledgment is directed to David Dolan and Nelson Thomas from the
USEPA who coordinated the project and provided many useful ideas concerning
problem formulation and preparation of the final report.

-------
                                  SECTION  1

                                 INTRODUCTION
    The determination of effective  sampling  strategies  for  estimating  water
quality in lakes depends upon many  factors such  as  seasonal  and  annual  vari-
ations, spatial variations, quality of  the measurement  techniques,  accuracy
of the estimates desired,  and economic  feasibility.   Because of  these  many
factors, a computer based  approach  could  prove to be  a  very useful  tool  in
the design of sampling strategies for water  quality in  the  Great Lakes.
This research is concerned with developing such  a computer  based approach
and demonstrating its feasibility on two  different  types of problems.

    The basic mathematical approach employed  in  the solution of  these
problems is an information processing technique  known as Kalman  filtering.
This technique couples knowledge of the quality  of  the  measurement  proce-
dures, mathematical models of the biochemical processes, and the adequacy
of these measurements and models (for example, variances associated with
measurement and model noise) to determine best estimates of the  concentra-
tions of the water quality variables and  the  variance of the estimates.   In
addition, if the number of variables involved is not  too large,  automatic
computer-based optimization techniques  can be employed  to define optimal
sampling strategies.

    Mathematical models of long-term (30  years)  phosphorus  and chloride  con-
centration .changes in Lake Michigan are developed along with variances  asso-
ciated with these models in Section 5.  This model  and  proposed  monitoring
programs for the lake can be used to develop  an  optimal sampling strategy
that can be completely determined economically on a large computer.  This
allows an extensive sensitivity study of  the  various factors that affect'the
measurement program.  A modified LAKE 1 (Thomann e_t a1_., 1975) model is  em-
ployed in Section 6 for the determination of  a short term (1  year) measure-
ment strategy for phytoplankton, zooplankton, and nutrients  in Saginaw  Bay.
The resultant model contains eight  concentration variables  in each of  five
segments.  Furthermore, thirty-five additional differential  equations  are
used to improve some of the model parameter estimates.  As  a result, a
system has evolved that is too large for  economical automatic computer  opti-
mization.  However, analysis of the variances developed by  the Kalman filter
makes it possible to assess the adequacy  of the  model predictions and to
develop suboptimal  measurement strategies.

    A brief heuristic introduction to the Kalman filter is  presented in  Sec-
tion 4.  More details concerning the mathematical procedures  employed to

-------
develop the Kalman filter and to derive the minimum-cost sampling strategy
subject to the variance constraints are presented in Appendix A and in
DePalma (1977).  The reader interested only in the water quality applica-
tions may proceed directly to Sections 5 and 6.

-------
                                  SECTION  2

                                CONCLUSIONS
    The results of this  study  indicate  that  Kalman  filtering  techniques  can
be applied effectively to assist with the  design  of limnological  and  water
quality sampling programs in the Great  Lakes.   This conclusion  is  based  on
encouraging results obtained for two demonstration  problems.  First,  the
Kalman filter and optimization procedures  have  been used  to design an opti-
mal long-term monitoring program for chloride and total phosphorus in Lake
Michigan.  This sampling program detects concentration  changes  of  1 mgCIA
and 1 ygPA with 90 percent confidence  and is considerably less expensive
than sampling programs suggested without benefits of mass balance  models and
Kalman filtering.  The procedures applied  to this problem can also be used
to study the value of mass balance models  for trend detection.  As model
parameter uncertainties  decrease, the model  becomes more  accurate  and use-
ful, and actual field monitoring requirements and costs decrease.   Prelimi-
nary calculations suggest that it may be appropriate to invest  research
funds to improve analytical techniques  for total  phosphorus and to obtain
better estimates for the rate of phosphorus  loss  to the sediments.

    Kalman filtering and state estimation  techniques have been  used to cal-
culate the uncertainty associated with  phytoplankton model predictions as a
function of the accuracy of model parameters and  initial conditions.   Pre-
liminary calculations indicate that errors in uncalibrated model predictions
of phytoplankton in Saginaw Bay can be  significant  even if it is assumed
that the structure and nature of the system equations are perfect.  Thus,
before accurate predictions can be made, it  is  necessary to improve esti-
mates of the model parameters with field monitoring or  laboratory  research
programs or to supplement model predictions with measured data  to  update
state estimates using the Kalman filter.  These results have direct signifi-
cance when models are used for planning purposes  such as predicting the  im-
pact of nutrient control programs in the Great  Lakes.

-------
                                 SECTION 3

                              RECOMMENDATIONS
    This study has determined an optimal, minimum-cost sampling strategy  for
long-term trend detection of chloride and total phosphorus in Lake Michigan.
This strategy involves limited tributary samples and lake monitoring only
for those years when concentrations must be determined.  It  is recommended
that this sampling schedule be initiated if the only goal of the measurement
efforts is that of trend detection.  Sampling requirements for other goals
(such as to define annual nutrient and plankton cycles) would obviously have
different characteristics.  In such cases, it may be possible to again de-
fine the most efficient sampling program if the system dynamics can be for-
mulated to accommodate Kalman filtering methodology.  Finally, it is recom-
mended that the optimal sampling be recomputed if research results or field
data become available in the future that would allow more accurate statisti-
cal assessment of the model parameters and processes.

    The Kalman filtering and state estimation techniques applied in this
study have been used to estimate the uncertainty of phytoplankton model pre-
dictions in Saginaw Bay as a function of the accuracy of model parameters
and initial conditions.  However, error propagation for only seven para-
meters (from a total of twenty) has been studied because of funding and time
limitations of this project.  Furthermore, the error covariance calculations
depend on the adequacy of a linear approximation of the system state equa-
tions.  It is recommended that the adequacy of this approximation and the
numerical methods employed be evaluated further before the results of the
calculations presented in this report are incorporated in water quality
planning projects.

-------
                                 SECTION 4

                       INTRODUCTION TO KALMAN  FILTERING
 INTRODUCTION

    The Kalman filter was derived by Kalman  (1960)  and Kalman  and  Bucy
 (1961) and has had considerable application  in  aerospace  engineering  and,
more recently, to some problems concerned with  water  quality  (Moore,  1971;
Beck and Young, 1976).  In order to simplify the presentation,  this section
describes the Kalman filter equations for a  scalar  system in which the  water
quality variable is measured at distinct measurement  times  and  the water
quality model is a differential equation.  Complete derivation  of  the equa-
tions for other cases may be found in Gelb (1974),  Jazwinski  (1970),  and
Sage and Melsa (1971), among others.  Appendix  A gives the  Kalman  filter
equations for a general discrete dynamic model  for  cases  where  only some of
the water quality variables are measured along  with the method  of  employing
the filter in the determination of minimum cost measurement strategies  for a
given level of required accuracy.


KALMAN FILTER EQUATIONS

    The Kalman filter is an information processing  technique that  combines a
dynamic model and measurements to produce an improved estimate  of  the state
of the system (for example, the concentrations  of the water quality vari-
ables).  If an adequate dynamic model for the system  does  not exist,  then
standard statistical techniques can be employed, using only the concentra-
tion measurements, to estimate the state of the system.   Thus,  Kalman fil-
tering is mainly applicable when some type of dynamic model of  the system  is
available.  However, even dynamic models that give  only seasonal or annual
trends usually justify the use of the Kalman filter.

    Suppose that a mathematical model of the biological and chemical  inter-
actions is a system of differential equations

         x = f(t,x) + w(t)                                            (4.1)

where x(t) is a n-vector that represents the concentration of the water
quality variables at time t (for example x-|(t)  = phytoplankton  concentration
at time t), f(t,x) is a function that describes the known rate  of change of
x (for example, f contains the advective and dispersive transport, loadings,
and kinetic coefficients), and w(t) is the error in the model at time t.

-------
One of the most difficult problems in mathematical modeling is the quantifi-
cation of the error term w(t) because this term is an indicator of the  value
of the model.  If w(t) could be characterized completely, then, it could  be-
come part of the f(t,x) term.  For most applications, it is only possible  to
statistically characterize the term w(t), which is usually assumed to be
zero-mean, white noise with variance Q(t).  The white noise assumption  is
conservative because it implies no time correlation in the model error  w(t).
The accuracy of the dynamic equations is implied by the magnitude of Q.
Relatively large values of Q imply that f(t,x) is relatively inaccurate.
Quantitative determination of the true value of Q is very difficult, if not
impossible.  Thus, in most applications conservative upper bounds are em-
ployed as estimates for Q.  As an alternative, liberal estimates for Q  (that
is, when Q = 0) give the estimation capabilities of a perfect model.  Note
that w(t) includes any inaccuracies associated with the forcing functions  of
the system.

    In addition to statistically characterizing the error term in Equation
(4.1), it is also necessary to statistically characterize the initial condi-
tions of the system variables at time equal to zero, t0.  Before the Kalman
filter can be employed, values for the mean (x0) and variance (P0) of the
initial state (for example, the concentrations of the water quality vari-
ables) must be available at a time prior to the first measurement.!

    The problems solved in this report involve discrete sampling (as opposed
to continuous monitoring), therefore the measurement process can be des-
cribed as

         y(ti) = x(ti) + v(ti)   (i = 1,2,. ...N)                      (4.2)

where y(t-j) is the measured value of x at ti, v(ti) is the error in the
measurement at t-j, and N is the number of measurements.  Appendix A dis-
cusses cases where only some of the concentrations are measured.  It is as-
sumed that v has a mean value of zero and a variance R.  Typically, the
measurement process is relatively well known and can be characterized ac-
curately.

    Given the information in Equations (4.1) and (4.2) suppose it is desired
to estimate the concentration of a variable in a scalar system at t] >  t0
(that is, at the first measurement time).  This estimation will utilize both
the model prediction and measurement at t].  The Kalman filter is used  to
calculate this estimate in two stages.  The first stage involves a predic-
tion of x^at t] before the actual measurement is taken, which will be de-
noted by x(tT), and the variance associated with this estimate is denoted  by
    The predicted value of the concentration is simply the  integration  of
Equation (4.1) with w(t) = 0 from t0 to f| with x(t0) = x0.  That  is, inte-
grate
 IIn this report, x refers to true values of the state variables, whereas  x
 refers to estimated values.

-------
         x = f[t,x(t)], x(t0) = X0                                     (4.3)

to t] .  This is equivalent to simply using the model  to  calculate  concentra-
tions at t] under the assumption that the model  is  perfect.   The  variance  of
the resultant concentration at t] for the scalar  case  is  denoted  as  P(t])
and is calculated approximately by integrating

         P = 2AP + Q                                                   (4.4)

with  P(t0) = P0 from t0 to ti where

         A - ^ tt,X] I                                                  (fl  r\
         A " ~    I (t,x(t)).                                        (4'bj


This  is called the sensitivity of the dynamic equation.   This  equation  is
based on the theory of stochastic estimation and  has  been  derived  by
Jazwinski (1970) and others.  If f(t,x)  is linear  in  x,  then  Equation  (4.4)
gives the true variance of x.  Other approximations for  P  are  discussed  in
Appendix A.  The approximation of Equations  (4.3-4.5)  is  known  as  the  ex-
tended Kalman filter.

    Equation (4.4) is used to calculate  the  accuracy  of  the prediction  based
on the characteristics of the system (for example,  transport  and  kinetics)
and the model error.  If the first term  in Equation (4.4)  is  positive  (for
example, during rapid concentration increases), then  the  associated  vari-
ance  increases between measurements, and  initial  condition errors  are  magni-
fied  as time increases.  Conversely, if  concentrations are decreasing
rapidly, the first term in Equation (4.4) will be  negative, and the  accuracy
of the estimate could actually increase.  A  simple  example will be presented
later in this section to illustrate this  effect.   The  second  term  in Equa-
tion  (4.4) (Q) is always positive.  Thus, it can  only  cause the accuracy of
the estimate to deteriorate as time increases.

    Integration of Equations (4.3) and (4.4) from  t0  to  t] is  the  first
stage of the Kalman filter, and x(ti) and P(tT) are the  results.   The  second
stage involves processing the measurement y(t-|),  as described  by  Equation
(4.2), to develop an improved estimate of the concentration at  ti, which  is
denoted by x(tj) and the associated concentration  variance P(t|).  The
Kalman filter equations for the scalar case  for this  stage are
         x(t|) = x(tp + K(t1)[y(t1) - x(ti)]                          (4.6)

         P(t{) = [1 - K(t1)]P(tp                                      (4.7)

                           ) + F^r1                                   (4.8)
where K(t]) is called the Kalman Gain.  K is called a gain because,  by
analogy with feedback gain in electronics, it determines how much weighting
is given to the measurement in developing the estimate for the  concentration
at time tt.  These equations have been developed by Jazwinski  (1970) and
others ana are based on determining a minimum variance estimate for the con-

-------
centrations using information from both the mathematical model and  imperfect
measurements.

    It is instructive to consider two special extreme cases of Equations
(4.6-4.8).  First, suppose that the measurement at t] is much more  accurate
than the estimate generated by the model from to to t]  (that is, P(t])  is
much greater than RI).  Then,^from Equation (4.8) the Kalman Gain approaches
one, and from Equation (4.6) x(t]) ~ y(t]).  Thus, the  new information  pro-
vided by the model has little significance.  Also, note that from Equation
(4.7) the associated variance for this case is approximately zero.  Of
course, P(t|) equals zero only for a perfect measurement defined by R-|  = 0.

    Next, consider the case where the measurement is poor and not as accu-
rate as the value predicted by the model (that is, P(tf) is much less^than
R]).  Then, from Equation (4.8) the Kalman Gain approaches zero, and x(tj)~
x(t-T).  In this case, the calculated concentration t| is approximately  the
value determined by the model, or in the limit as R] ->°° it is as if no  mea-
surements were conducted. ^Note that Equation (4.7) implies P(tt) ~ P(tp,
which must be the case if x(t|) ~ x(tp with R-| relatively large.

    The remaining measurements y(t2),...,y(tn) are processed by simply  pro-
ceeding in the same manner as the two stages noted above.  That is, x(t|)
and P(tp are employed as initial conditions for the next time step, and
Equations (4.3) and (4.4.) are integrated from t] to t2 to produce  x(t2") and
P(tp).  Equations (4.6-4.8), with t2 and R2 (instead of t] and R-]), are then
employed to define x(t^) and P(t|).  The Kalman filter  is recursive (as
opposed to batch) because new concentration estimates are obtained  each time
a measurement is taken.  Another property of the Kalman filter, which is
very important in the determination of sampling strategies, is that if  Equa-
tion (4.1) is linear in x or can be approximated by a linear function,  then
the variance equations (Equations (4.4), (4.7), and (4.8)) are independent
of the actual measured concentrations y(t-)),.. ,y(tn) and depend only on the
quality of the measurements, defined by R],	,Rn.  This allows the com-
putation of the variance and an evaluation of the effectiveness of  a
sampling strategy before the sampling begins.


A SIMPLE KALMAN FILTER EXAMPLE

    Probably the simplest mathematical model of the behavior of a pollutant
in a lake is the scalar differential equation

         x = -ax + u(t) + w(t)                                        (4.9)

where x(t) = concentration of the pollutant at time t, u(t) is the  input
rate to the lake (for example, an industrial discharge), a is the positive
constant sinking or degradation rate, and w(t) is the model error resulting
from either imperfect understanding of the physical, chemical, or biological
mechanisms that affect x(t), uncertain values for a, or inadequate  charac-
terization of the loading or forcing functions, or any combination  of these.
Suppose that two measurements of x are to be made at t] and t2 with errors
v(t-|) and v(t2) that have mean values of zero and variances R] and  R2-

                                     8

-------
    To apply the Kalman filter, statistics associated with the mathematical
model must also be supplied.  Suppose that the period of  interest begins  at
t0 < t] with initial estimates of XQ and PQ that are obtained either by
measurement or by model propagation from a time prior to  to-  To  illustrate
the behavior of the Kalman filter, suppose that the model errors  are con-
stants with zero mean values and variances Qi and Q2.  Also, suppose that
the forcing function, u(t), has a constant value of U] between t0 and t]  and
a value of U2 between t0 and t].

    With these assumptions, the prediction equations between measurements
corresponding to Equations (4.3) and (4.4) are
         x = -ax + u
         P = -2aP +
                    .j
which are easily integrated to form the solutions


         x(t) = [x(tt) - u
                                                                       (4.10)

                                                                       (4.11)




                                                                       (4.12)
P(t) =
                                              Qi+1/2a
where x(t0) = x0 and P(t0) = PO-  The values of x(t-j) and
surement are determined by Equations (4.6-4.8),
         x(t{) = x(t') +[K] y(t1) -


         P(t|) = (1-Kj) P(t')
                P(t7)
                                                                       (4.13)


                                                                 after  mea-


                                                                       (4.14)


                                                                       (4.15)
where, from Equations (4.12) and (4.13),
x(t-)
      = (X
                                         U]/a
         P(t') = (P  - Q/2a)e"2a(tl"to) + Q/2a
                                                                       (4.17)



                                                                       (4.18)
    For the sake of illustration, suppose that y(t-|) is a very good measure-
ment compared to the prediction  (R-j is much less than P(tJ)) and vice versa
for the y(t£) measurement (R2 is much greater then P(t2~)).  Note that
because the model differential equation is stable (that is, -a < 0), use of

-------
the dynamic model improves the accuracy of the estimate  if P  > Q-i/2a  (see
Figure 1).

    Similar estimates can be developed at t2 with the results shown  in  Fi-
gure 1.  In this case, Equation (4.16) implies K-| * 1 (because RI  is much
less than P(tp) and l<2 ~ 0 (because R£ is much greater  than PU^)).


SUMMARY

    The Kalman filter equations for a scalar differential equation process
model with discrete measurements have been presented, and a simple water
quality example was employed to illustrate the behavior  of the equations.
The Kalman filter equations provide appropriate weighting to both  imperfect
model predictions and measurements to produce an overall best estimate  of
concentrations based on minimizing the variance at each  step.  If the model
equations are linear or can be approximated by a linear  system, then it is
possible to propagate the variance of the concentration  values as a  function
of the uncertainty of the measurements, model parameter  values, model
forcing functions, and model structure.  However, the estimates of the  error
propagation are independent of the actual measurements themselves.   The
overall techniques of Kalman filtering are summarized in Figure 2.   In  the
following sections, the Kalman filter equations will be  used in the  deter-
mination of optimal sampling strategies for total phosphorus and chlorides
in Lake Michigan and phytoplankton and nutrient dynamics in Saginaw  Bay.
                                     10

-------
                        Kalman State Estimate
                        Ka I man Variance Estimate
Figure 1.  Changes in estimates  of the state variable and variance
         according to Kalman filtering technique.
                           11

-------
tr- t.
^v


^

•or Statistics
ft_
UJ
                                                                            CT
                                                                           •i—

                                                                            C
                                                                            o
                                                                            O)
                                                                            s.
                                                                            a;
                                                                            -p

"£
E
o
V.
O
CL
o


c
g
i4_
O
c

*^-
c
o
o
M—
p



(/>
v_
o


E
J»

                                                                            (O
                                                                           V-
                                                                            o
                                                                            (O
                                                                            3
                                                                            I/)

                                                                            cr>

-------
                                 SECTION 5

       AN APPLICATION TO LAKE MICHIGAN CHLORIDE AND TOTAL PHOSPHORUS
                              TREND DETECTION
INTRODUCTION

    One goal of the International Surveillance Program for the Great  Lakes
is to detect long-term changes in the concentrations of chloride  ion  and
total phosphorus in each of the lakes.  This section examines the cost-
effectiveness of such a surveillance program for Lake Michigan and demon-
strates how measurements of lake concentrations and mass balance models may
be effectively employed together for trend detection.  When appropriate, the
extension of the techniques to other large lakes and water quality variables
will be discussed.

    The present surveillance plan for Lake Michigan involves lakewide
sampling for two successive years every nine years and tributary sampling 26
times per year.  Material balance models are not normally considered  as
sources of information that can be utilized, along with data, to estimate
concentration changes and thereby detect trends.  This research demonstrates
how uncertain parameters in mass balance models of lakewide average concen-
trations can be interpreted as random variables, so that both the model and
data can be employed in a Kalman filter.  If the filter is used to estimate
the accuracy of lakewide average concentrations over the next thirty years
and a sampling program is prespecified, then the standard deviations  asso-
ciated with the accuracy of the estimates can be precomputed.  The estimated
errors must be small enough for trend detection, thus the standard devia-
tions must satisfy upper bound constraints.

    It is also shown that the problem of choosing the appropriate number of
days for which tributaries must be sampled is equivalent to that of choosing
a controllable model error covariance matrix.  The minimum-cost lake sur-
veillance and tributary sampling programs that meet accuracy criteria on the
estimates of lakewide average concentrations are determined.  The criteria
are chosen to guarantee that trends will be detected with some specified
statistical confidence.  A series of measurement strategy optimization
problems are solved to illustrate the role that Kalman filter techniques can
play in the design of monitoring programs.
                                    13

-------
DERIVATION OF THE STOCHASTIC MODEL

The Indicator of Trend

    Chloride is a conservative pollutant that occurs primarily  in  dissolved
form and is affected little by the annual biological cycle.  Total  phos-
phorus, however, includes phosphorus that is utilized in the biochemical
cycle of phytoplankton and other organisms.  As phytoplankton cells mature
and die, they accumulate at the sediment-water interface, resulting in  re-
moval of phosphorus from the water column.  A measure of the phosphorus
available for biological production during a particular year is  provided  by
the lakewide average concentration of total phosphorus following spring
overturn.  Because ice at this time may prevent a sampling program from
being implemented, it is convenient to assign May as the month when chloride
and total phosphorus concentrations are relevant to trend detection.

    The lakewide average concentrations of chloride and total phosphorus  for
May of year n are defined, respectively, by
         Xi(n) = m^nJ/VCn) and X2(n) = m2(n)/V(n)                     (5.1)

where m-](n) and m2(n) are the pollutant masses in the lake  (including  the
nearshore zone but excluding the sediments) and where V(n)  is the  lake
volume.  Lakewide average concentrations may be crude indicators of trend  if
severe concentration gradients exist.  However, if the lake  is segmented
spatially, then mass balance models must be derived for each segment aver-
age, and the resultant model must include uncertain parameters for advective
and dispersive mass transfer.

    In order to derive difference equations for the lakewide average concen-
trations, equations are first determined for the masses m-j(n) and m2(n).
During year n (May of year n to April of year n+1), mass exits the lake
through the Straits of Mackinac (SM) and the Chicago Sanitary and Ship Canal
(CSSC) and enters from tributary, atmospheric, and direct loading.  Further-
more, a fraction of the total phosphorus that reaches the sediments during  a
year is incorporated into refractory deposits, resulting in  a net sedimenta-
tion loss (Hutchinson, 1957).  These sources and sinks are  approximated on  a
net annual basis rather than on a seasonal basis, and their  statistical
characteristics are evaluated with information presently available.

Straits of Mackinac

    The net annual flow through the SM is toward Lake Huron; however,  at
certain times of the year the flow is toward Lake Michigan.  The net ex-
change of pollutant mass between the two lakes during one year is the  dif-
ference between the flux towards Lake Michigan and the flux  towards Lake
Huron.  Each flux is the product of an annual flow and an effective concen-
tration.  The effective concentration lies in the range of  instantaneous
concentrations during the year.  If qjn and qout are the components of an-
nual flow toward and away from Lake Michigan, and if X-jn and Xout are  the
effective concentrations on the Lake Huron and Lake Michigan sides of  the
SM, then the net loss from Lake Michigan is,

                                     14

-------
         Xout "out - Xin  then the net loss is proportional to Qout - qin> which  is
the net annual flow.  However, chloride and total phosphorus concentrations
in Lake Huron are less than those in Lake Michigan because of dilution by
Lake Superior water.  A conservative upper bound on the net loss  is obtained
by setting Xin = 0» in which case the loss is proportional to q0ut-  There-
fore, the actual loss must lie between Xout (qout - qin) and Xout q0ut and
is denoted by Xout qs, where qs is an uncertain effective outflow.  Quinn
(1976) has computed net annual flows, qout - q-jn, of 224 x 10^£  to 522 x
10^£.  The validity of the technique has been upheld by current  meter data
from Saylor and Sloss (1976).  Furthermore, Quinn suggests that these  cur-
rent meter data can be employed to obtain the following relationship between
net flow and gross flow,


                           -«in>'                                     (5'3)

If Equation (5.3) is used to estimate the gross outflow for 1950-1966  from
Quinn's net flow, then a range of 374 x 1CHU to 871 x 10"1U results.

    The effective flow for each year, n, qs(n), is interpreted as a random
variable, uniformly distributed between the minimum net flow and  maximum
gross flow for 1950-1966.  The mean and standard deviation of this random
variable are, respectively, 547 x 1QTU and 186 x 10^£.  The effective con-
centration on the Lake Michigan side of the SM for year n, Xou^-(n), must lie
in the range of concentrations in the outflowing water during the year.  It
is assumed that this range is narrower than the range of concentrations over
the entire lake for May of year n.  The effective concentration is repre-
sented as the sum of the lakewide average concentration in May, X(n),  and  a
deviation, es(n).  That is,

         XQut(n) = X(n) + es(n).                                       (5.4)


The deviation is a zero mean random variable that is representative of the
spatial heterogeneity of the lake.  Its standard deviation can be estimated
from unpublished data collected in the northern half of the lake  by the
Great Lakes Research Division (GLRD) of the University of Michigan for April
and June 1976.  A statistical summary of the chloride data from thirty-seven
stations sampled during the June cruise and similar results for total  phos-
phorus for thirty-three stations sampled during the April cruise  are pro-
vided in Table 1.  Only stations of depth greater than twenty meters have
been included to avoid bias resulting from nearshore pollution.   For each
station, the mean (unweighted with depth) and standard deviation  of the data
have been computed.  The standard deviation of the station means  is assumed
to be representative of the variation in the data because the data suggest
that the horizontal variation exceeds the vertical variation.  These stand-
ard deviations are 0.32 mg Cl/£ and 1.6 yg P/£.  However, the total observed
variation is a result of not only spatial heterogeneity, but also laboratory
analysis error.  Great Lakes Research Division has suggested that standard

                                     15

-------
                         TABLE 1.  SUMMARY OF GLRD DATA
Chloride
Station
no.
2
4
6
8
10
13
15
17
19
20
23
25
27
29
30
32
34
36
39
42
43
44
46
47
48
49
53
54
55
58
59
60
63
64
67
•68
69
Number
of
samples
4
6
13
6
4
4
6
17
5
7
5
14
6
4
4
4
5
9
4
3
3
3
3
4
3
3
2
3
4
3
5
3
5
3
6
5
3
Mean
mg/£
7.73
7.68
7.69
7.70
7.80
7.70
7.72
7.73
7.86
7.89
7.73
7.73
7.68
7.63
7.79
7.72
7.74
7.73
7.58
7.70
7.68
7.68
7.34
7.58
7.55
7.45
7.01
7.10
7.32
6.58
7.07
7.06
7.00
6.89
7.84
7.83
7.90

Standard
deviation
mg/£
0.02
0.03
0.03
0.05
0.19
0.01
0.03
0.06
0.02
0.02
0.02
0.02
0.07
0.03
0.10
0.05
0.04
0.05
0.11
0.04
0.03
0.08
0.22
0.14
0.01
0.14
0.01
0.03
0.19
1.13
0.06
0.12
0.26
0.10
0.06
0.05
0.06
Total phosphorus
Station
no.
2
4
6
8
10
13
15
17
19
20
23
25
27
29
30
32
34
36
39
42
43
44
46
47
48
49
54
55
58
59
60
63
64




Number
of
samples
4
6
12
6
4
4
6
17
6
6
5
14
6
4
4
4
5
9
4
3
2
4
3
4
3
3
3
4
3
3
2
3
3




Mean
yg/£
7.3
7.9
8.1
9.1
14.8
8.6
7.4
6.7
7.0
7.1
6.9
7.4
6.9
6.8
8.2
5.9
6.5
7.9
6.3
6.5
6.5
7.8
6.6
6.1
6.4
6.2
8.4
5.9
5.8
6.1
6.0
6.4
5.2




Standard
deviation
yg/£
0.6
0.6
1.2
1.0
2.4
1.3
0.4
1.3
0.6
0.5
0.4
0.7
1.1
0.8
1.1
0.6
0.6
1.2
0.6
0.7
0.1
2.3
0.4
0.2
0.4
0.4
2.0
0.6
0.7
0.6
0.1
0.6
0.4




Mean of station means

Standard deviation of
station means
7.55 mg/£     Mean of station means      7.2
              Standard deviation of
0.32 mg/£     station means              1.6 yg/£
                                    16

-------
deviations for laboratory errors are 0.10 mg C1/&  and  1.0  yg  P/&J   There-
fore, the standard deviations for station concentrations about the  lakewide
average concentrations are 0.30 mg Cl/£ and 1.3  yg  P/£.  These are  employed
for the standard deviations of £sj(n) and es2(n),  respectively, where  sub-
scripts have been introduced to distinguish chloride from  total phosphorus.

    The net losses of pollutant masses through the  SM  are, respectively,
   (n)  + es  (nfl  q$(n)  and  |~X2(n)
                                                     qs(n)
Chicago Sanitary and Ship Canal

    The annual diversion of water from Lake Michigan to the  Illinois River
through the CSSC has been limited to 28^7 x lO^A since 1970 by U.S. Supreme
Court decree.  The outflow, denoted by qfc, is  interpreted as a known con-
stant at this maximum value.  The effective concentration in the outflow is
interpreted in the same manner as for the SM with resulting  losses of
                             and
                                                                       (5.6)
The zero mean random variables,  £q(n) and  ^(n), are different random
variables than esi(n) and es2(n) but have the same standard deviations, that
is, 0.30 mg Cl/£ and 1.3 yg P/JL

Refractory Sediment Deposits

    For a lake in which the lakewide average phosphorus concentration  in the
spring is at steady state, the phosphorus retention coefficient is


         R  = (min -
where m-jn and m0ut are the mass loading and mass outfall during a year.  The
apparent phosphorus settling velocity S[m/yr] is that velocity such that the
net sediment loss for a year is Sm/h, where m is the lakewide mass in the
spring and R is the mean depth.  According to Chapra (1975), the apparent
settling velocity can be related to the retention coefficient by
S = Rp(q/A)/(l-Rp),
                                                                       (5.8)
where q is the total outflow and A is the lake surface area.  Although Rn is
difficult to measure for Lake Michigan, data have been compiled for smaller
lakes.  Resultant apparent settling velocities, computed with Equation
(5.8), are summarized in Table 2.  These data suggest a range of 0.7 to 37.9
m/yr.
^Personal communication, Curtiss Davis, GLRD.

                                      17

-------
              TABLE 2.  APPARENT SETTLING VELOCITY  FOR  SMALL LAKES
Lake
^Four Mile
Raven
Talbot
Bob
Twelve Mile + Boshkung
Halls
Beech
Pine
Eagle + Haliburton
Oblong + Haliburton
Rawaon
Clear

^Brewer
Clarke
Costello
Kearney
Aegerisse
Bodensee
Turlersee
Zurichsee
Hallwilersee
Griefensee
Pfaffikersee
Baldeggersee
Tahoe
Found
Little McCauley
S[m/yr]
11.0,11.0
11.3,13.3
9.3
16.2,16.3
22.9,21.4
25.5,29.4
8.6,16.8
5.0,1.4
8.6,14.6
14.5
5.6,4.9,6.0
4.3

3.7
17.0
8.7
6.5
12.1
37.9
26.0
11.3
4.1
15.3
23.1
11.8
5.7
7.2
8.8
Lake
3Mendota

^Joseph
Rousseau

5Conesus
Can ad ice
Honeoye
Washington
—t
'Leman
Bay of Naples
Canadaligua
Carlos
Carry Falls
Cass
Charlevoix
Higgins
Houghton
Long (Aroostook Co.)
Long (Cumberland Co.)
Mattawamkegg
Moosehead
Pelican
Rangeley
Sebago
Winnipesaukee
S[m/yr]
11.1

8.8
13.6

35.8
30.9
23.3
27.0

3.2
14.1
5.1
4.5
20.2
15.1
8.9
2.1
1.6
5.3
8.6
14.4
4.6
0.7
4.9
7.8
11.0
iKirchner and Dillon (1975).



2Dillon and Kirchner (1975).



3sonzogni (174).



^Michalski et al_. (1973).



Sstewart and Markella  (1974).



6Lorenzen e_t aj_.  (1976).



7Larsen and Mercier  (1976).





Multiple values represent different  data  sets.
                                      18

-------
    The mechanisms that govern the formation  of  refractory deposits  are com-
plex.  The exchange between the sediments  and  the  overlying waters of  a lake
depends on currents and degree of stratification,  on  phytop'iankton species
composition, and on biochemical cycles  at  the  interface.   Harrison et  al.
(1972), for example, discuss the effect  of oxygen  and  bacteria  on the  ex-
change.  It would be extremely difficult to relate exchange mechanisms  in
Lake Michigan with those of the lakes  in Table 2.  Therefore, S will be
interpreted as a random variable, uniformly distributed between 0.7  and 37.9
m/yr.  The_mean and standard deviation  are, respectively,  19.3  and 10.7
m/yr.  If h(n) is the mean depth of Lake Michigan, then the net loss of
phosphorus to the sediments during the year n  is

         S(n) m2(n)/h(n).                                              (5.9)


Tributary Loads

    The GLWQB has identified twenty-seven  major  tributaries in  the Lake
Michigan drainage basin.  The mass of  pollutant  discharged into the  lake
during a year by a tributary is the integral  of  the  instantaneous flux  over
the year.  The Mean Value Theorem permits  the  integral to  be replaced  by a
sum

                365

         Load =  Z) X ,  q,                                              (5.10)
                d=l  d  d
where q
-------
The error associated with this procedure is interpreted as a zero mean  ran-
dom variable whose standard deviation is equal to that of data  available  for
the Maumee River (U.S. Army Corps of Engineers, 1975).  The standard  devia-
tions of these data are 15.6 mg Cl/£ and 271 yg P/£.  These deviations  are
probably conservative upper bounds on the standard deviations of error  asso-
ciated with estimates of Lake Michigan tributary concentrations.  This  re-
port does not consider tributary-specific standard deviations,  but  rather
employs o^ = 15.6 mg Cl/£ and aj? = 2.71 yg P/£ for each tributary.

    The error variances for the annual tributary loads are given by

        2^2    22
       0-i  / j q , + crC  x j  q ,  and

                                                                      (5.11)
        22    2       2
       °2  ^ qd + a2  ^  qd  '
for chloride and total phosphorus.  The calculations in Equation  (5.11) re-
quire the tributary discharges qd-  In order to compute optimal lake mea-
surement strategies for the next thirty years, however, standard  deviations
must be available now.  Therefore, the qj are replaced by the maximum daily
discharge on record for the tributary.  This discharge for the jth tributary
is denoted by qmaxJ, and is computed from the maximum discharge at the
closest station to the mouth of the tributary by correcting for ungauged
drainage.

    Conservative error variances for the total tributary loads for year n
are then computed by

                                  27
       r 2         2           i
        a, D(n) + a'  (365-D(n))   T,  (qmax3)
       L          '           J j=l
       E
2         2           I  2?
y D(n) + a; (365-D(n))   ^  (qmaxj)'
where it is assumed that both chloride and total phosphorus are analyzed  in
the water samples collected on days D.  The most appropriate number of  days
of sampling for year n, D(n), is interpreted as a variable to be determined
in association with measurement strategy optimization.

Atmospheric Loads

    Precipitation samples collected by Junge and Werby  (1958) suggests  that
the annual chloride atmospheric load for Lake Michigan  is between  6.4 x 1012
and 8.7 x 1012 mg.  In the mass balance model, the chloride load is denoted

                                    20

-------
by the random variable a-|(n), which  is uniformly distributed  between  these
extremes and, therefore has a mean of 7.6 x  10^ mg  and  a  standard  deviation
0.7 x 10^2 tng.  Murphy and Doskey  (1975) collected precipitation  samples
from the shores of the lake.  Their  data are  employed  to compute  a  range  of
12 x 10^4 to 18.5 x  10^4 yg for the  annual total phosphorus atmospheric
load.  This  load is  interpreted as a random  variable,  a2(n),  uniformly dis-
tributed between the extremes, and thus has  a mean of  15.3 x  10^ yg  and  a
standard deviation of 1.9 x 1014 yg.  If data become available  that define
more significant phosphorus loading, then the optimal  strategies  reported in
this report  should be recomputed.

Direct Loads

    Pollutants from  some industries  and municipalities discharge  directly to
Lake Michigan or discharge to a tributary below the  sampling  point  for that
tributary.   The U.S. Environmental Protection Agency (1972) has computed  a
direct annual phosphorus load for  1971 of 17  x 10^4  yg, which is  significant
compared with the sum of 1970 tributary and  direct loads of 97  x  10'4 yg
(Chapra, 1977).  Corresponding data  for the  direct chloride loads are not
available, although  the sum of 1960  tributary and direct loads  was  computed
by O'Connor  and Mueller (1970).  The direct  loads are  denoted by  d](n) and
d2(n) and are interpreted as known constants  because their standard devia-
tions are probably small compared to those of the tributary and atmospheric
loads.

Summary of the Model

    The approximations for the sources and sinks of  chloride  and  total phos-
phorus are employed to derive the stochastic  difference equations for the
lakewide masses:
             = m.,(n)-  [x-,(n)+es  (n)]  qs(n)-  [x](n)+£c (n)]
                     r             -I         r             n            (5J3)
    m2(n+l) = m?(n)-  X?(n)+e   (n)  q  (n)-  X9(n)+e   (n)   q
               ^     L t.      :>2   J  S      L  c.      Cp   J  C


            - S(n) m2(n)/R(n)+  [t2(n)+a2(n)+32(n)]


    Quinn (1975)  has computed Lake Michigan water levels for the first of
May for 1900-1972.  The levels have a standard deviation of  1.1 feet and can
be converted to volumes with an  accuracy of + 0.4 percent.'   Volumes range
^Personal communication, Leonard Schutze, U.S. Army Corps of Engineers.

                                    21

-------
from 4.89 x 1015£ to 4^97 x 1015£.  Therefore, the volume has been inter^
preted as a constant, V = 4.93 x 1015.  Similarly, the average depth is h
85m.  Therefore, the equations for lakewide average concentrations are


     X^n+1) = [l  -  qs/V  -  qc/V"]Xl(n)


            - = (X  e.  (n)  + q"  e.  (n)  + X,(n)  q (n)l
              y L  S   S,        C   C-|        I      b    .1
                    ,(n)  + a,(n)  + d,(n
               v  L1        '        '                                   (5.14)


     X2(n+l)  = [l  - q~s/V  - q"c/V - S(n)/F]x2(n)


                         (n) + q   e  (n) + X2(n) q (n)l
                                C.  02       C-     :>   J



             + t  [t2(n)  + a2(n)  + d"2(n)]



where qs(n) = qs(n) - Q^s and "qs is the mean of qs(n).  The terms £s-j(n)qs(n)
and eso^^sC11) are small and have been neglected.
                      e
               rr    S
    The apparent annual phosphorus loss rate may be interpreted as a model
state (X3 = S/h) with trivial dynamics

         X3(n+l) = X3(n)                                              (5.15)


and a random initial condition.  Of course, by writing Equation (5.15), it
is hypothesized that the apparent loss rate is constant in time with some
uncertain value.  Although this may be strictly incorrect, it is desirable
to demonstrate that the estimate of an uncertain model parameter can be im-
proved, that is, its error variance reduced by interpreting the parameter as
a model state.  With this interpretation, Equations (5.14) and (5.15) repre-
sent a three-dimensional nonlinear stochastic model with states X], X2» and
*3-

    At the initial year, n = 1976, the three state variables are uncertain.
From GLRD data in Table 1, mean values of 7.55 mg Cl/£ and 7.2 yg P/£ are
computed for X] (1976) and X2 (1976), respectively.  The accuracy of these
estimates depends on the variance of point concentrations in the lake and on
the number of samples employed in the computation of the means.  If the
average of L(n) station concentrations serves as the measurement, then it is
possible to determine a formula that can be used to relate the measurement
error and L(n) (DePalma, 1977).  If 62 is the variance resulting from the

                                     22

-------
heterogeneous nature of the  lake  and Y2  is  the  variance  resulting from
laboratory analysis error, then the overall measurement  error  variance is


          (62 + Y2)/L(n).                                               (5.16)


The overall error variance of a measurement and Equation (5.16)  are  used to
compute the standard error of the  initial concentration  estimates.   The
standard  errors are 0.05 mg  Cl/£  and 0.29 yg  P/£ using data  in Table 1.

    An estimate for S  in 1976 is  19.3 m/yr  from Table 2.   The  standard de-
viation of the error in this initial estimate  is the  standard  deviation
associated with the uniform  distribution, that  is,  10.7  m/yr.  Note  that
this does not equal the standard  deviation  of  the data in  Table  2 because
these data were only employed to  establish  a  range  for S.  A summary of  the
model parameters, constants, and  initial states in  given in  Table 3.

Kalman filter State Estimation

    Equations (5.13),  (5.14), and  (5.15) constitute a nonlinear  stochastic
model with uncertain states  (X],  X2, and X3),  uncertain  parameter^ (es-|,_
£S2'-£C1' £C2' ^s» al' a2' t]» and t2)'  and known constants  ("qs, q"c>  v»  dl •>
ana d2).  The Kalman filter  (Sage  and Melsa,  1971)  is a  method that  can  be
used to calculate an unbiased minimum variance  estimate  of the state vari-
ables (chloride and total phosphorus).   A detailed  description of the basis
and approach of the Kalman filter  is described  in Appendix A.  Briefly,  the
method requires statistical  characterization  of both the model equations and
the measurements but does not depend on  the actual  measurements  themselves.
It is this property of the filter  that allows  the development  of an  a priori
optimal measurement strategy.  If  estimates of  the  state variables  are also
desired, then measurements are required.  In  order  to implement  the  linear-
ized Kalman filter, the nonlinear  state  equations are linearized about nomi-
nal states that are computed by setting  each  uncertain parameter and  initial
state to  its mean value and  then  solving the  state  equations (Equations
(5.13), (5.14), and (5.15)).  The  nominal states have been computed  assuming
that future loads are the same as  present loads;  however,  the  Kalman  filter
approach  is also applicable  if future loads can be  represented as known
functions of time.

    The Kalman filter combines the model and measurements  to compute  an
estimate of the lakewide average  concentration,  which is theoretically more
accurate than estimates based on  the model  or measurements alone.  The esti-
mation error variance, which is a measure of the accuracy  of the Kalman
estimate, can be computed, and it  depends on  the model parameter variances
and measurement error variance but not on the  actual measurements.   Once the
number of days, D(n), on which tributary samples are collected and  the num-
ber of stations, L(n), at which lake samples  are collected have  been  speci-
fied (for a number of years  beyond 1976), the  estimation error variances can
be computed.  Furthermore, D(n) and L(n) may be  chosen to  meet constraints
imposed on the estimation error variances.
                                     23

-------

















oo
LU
I—

h-
oo

_1
l~«<
1—
^ — i
5^
i — i
Q
^

**
oo
r-

e-f
i 	
OO

O
o
„
oo
ce:
LU
h-
LU

^f
f^S
t^
PI

u_
o

^_
rv

«g-
Kr1
— )
OO

•=c

%
oo

LU
	 1
CD
1 —

















+J
C










c:
•o o

3 4_i
-O ro
— .p-
rO >
4-> CU
00 TO











f^
ro
CU
^























P-
£>
>r-
+->
Q.

(_

a
Q















C S«
•p- a
rO +->
4J CU
S_ E
O) ro
0 S-
C ro

^ ^ ^ ^
CD CO CT> CD CO O} CO CT>
E^-S3. E ^ E 2-








i — CM *d- CM ^t
r— p-~ I — p— i —
O O O O O
O O r— r— P- p- r-
OO OO OO OO
x xx xx
O i— O r-
to i — to i^. o~>
CO to • • •
P- IP- O r-
CM i —
CM 1
CM
CM




CM «d- CM ^d-
0 O O O
I — 1 — p— 1 —

OOO OO XX XX

*^I" r^ to oo
oo to .
en r^ LD
ai .—

to
\&~
c
ro
CU •—
E "a
E ro
0 E 0
E s- o •—
O *4-. S_
S- tt- -t->
t(_ c: o
0 20)
C T- 0 S_
O -M ' — >r-
• P- rO M- T3
-!-> S-
rO +-> CU TD
S_ C > E
4_> 
d) C O "O
O O CU ro
C O M- O
0 M- i— CM
o o cu to
cu oo cu >> -a
s: 01 oo a> 2: s- ro
OO ro O "3 OO fO O
S_ S_ 4-> P-
4_ CU M_ CU 4- 3
o > o > o ja u
n3 (O >r— •!—
C C C S- S-
O CU O Q) O -I-J OJ
+J -p- +-> -P- -(-> "4— D-
rO 2 ro S ro O to
• P- CU •(— d) -P- O
>±> > -^ > E E
QJ ro CU ro CU 3 -(->
OP- O i— O OO «=C





- «k

p— CM
x 1 "O I'D
/ 	 * 	 ^ ' v + + .-^— ^^— v
r— CM p- CM ^,
0,00 oo 01 /r,?0 J7~^
W OJ OJ W !CT +-> •>-> ro ro
T3
CU
•r- O

C
O tO
o -o
0
a.
cu
S-
o
o

to
c
0

i '
ro
•P-
>
CD
~CT

S-
(O
T3
C
ro
to
t,_
O

OJ
CD
C
ro
S_

cu
_c
1—

•
< — N
to
~^
s-
o
.c
Q.
to
o

CL
^ — ••

O
r^
Ol
t —

-a
p~
rd
a>
•o
•p-
i-
o
p^
^3
o
o
to
CTl
p—

S-
0
a> •
i- CM
rd i—
00 A|
c~
rO O
11
E A|
cu ur>
JC 10
h- 00











































.
. — -»
00
3
S-
O
-C
a.
00
0

Q.
* — *

•=3"
r*^
CTi
P—

T3
c~
ra

O)

•1 —
S-
0
p—
c~
o
un
LO
Ol
p—

^_
0
ti-
0)
S-
ro
oo
C
ro
CL)
E
CU
.c
CM
24

-------
                 i—     LO

                 ""o   '~
       
oo

UJ
	I
CO
       c:
       rd
       l/l

       C

       O
             00


             JC
             en
             OJ
             (J
             O)
             OJ
             rd
             CU
                   CO
                   oo
                         ro
                         CTl
                                in
                                co
                   O
                   ^0
                   t/O
cn
rs
o
                          a>
            Q.
            a>
            a
              l/>    O
             !cr   la-




t3
S-
rd
-o
C
rd
-4-J
oo








































rd
•r—
4-5
C
I — I
-P
•r—
C
ZD
C
O
• ^»
+->
rd
•r—
>
O)


rd
CD


















c~
0
•r—
_[_>
Q.
•r—
^_
O
I/)
O)

+->
rd
to

^
CD
E


un
o
•
o



LT>
un
•


c:
0
•i —
i *
rd
S-
-4-J
C
0)
<_>
cr
O
o

O)
CD
rd
S-
0)
>
rd

O)
~O
• r-
^
O)
rd
1 —
O)
~C3
s_
o
1 —
o
r--.
CT>
'""
~T^-
X
^
cn



o^
00
•
o




oo
•

o
•f—
_(_3
rd
S-
1 *
rr
O)
o
c:
o
o

O)
CD
rd
^_
O)
>
ro

CD
-a
•i —
s
O)
V
rd

00
S-
0
Q_
00
o
a.
to"
CT>
1 —
^C\l
X
1
S-


^o
oo
^-«
•
o



00
00
•
o
















l-c

oo

cu
4J
rd
S-

co
(/I
o
r~
00
S-
0
Q.
00
O
Q-
VD~
C7i
^_
^"0"
X
                                               25

-------
DESIGN OF MEASUREMENT STRATEGIES

    The Great Lakes Water Quality Board  (GLWQB)  (1976) has recommended  a
long-term schedule for intensive sampling of the off-shore waters of  Lake
Michigan.  The lake will be sampled two  successive years every seven  years
(1976, 1977, 1985, 1986  ... ).  Tributaries will be sampled 26 times  per
year.  One goal of the program is to detect trends in the concentrations of
chloride and total phosphorus.  This goal would be accomplished by comparing
estimates of lakewide average concentrations for the years when samples are
collected.  The estimates of concentration for any one year would be  based
only on data from that year.  No procedure is presently available for intro-
ducing mass balance information into the estimation process.

    It is appropriate to first examine the effectiveness of the proposed
GLWQB sampling program for the case where no mass balance information is
utilized and the data are not processed  by a Kalman filter.  Consider a
sampling program in which sixty deep-water stations are sampled two consecu-
tive years every seven years.  If it is  assumed that four depth samples are
collected at each station and are averaged to compute the station concentra-
tion and the mean of the station concentrations is interpreted as a measure-
ment of the lakewide average concentration, then from Equation (5.16) the
standard deviations of the chloride and  phosphorus measurement errors are
0.04 mg/£ and 0.2 yg/£, respectively.

    In order to evaluate the effectiveness of this surveillance program,
criteria for trend detection must be established.  For demonstration  pur-
poses, it will be assumed that changes of 1.0 mg Cl/£ and 1.0 yg P/£ must be
detected with 0.90 probability.  Measurement errors must, therefore,  fall
below 0.5 mg Cl/£ and 0.5 yg P/H with 0.90 probability.  In order to  convert
these criteria into constraints on the measurement error variances, it could
be assumed that measurement errors are normal random variables (which is not
true in general) or the Chebyschev Inequality (Davenport, 1970) could be
used that applies to any random variable regardless of its density function.
To avoid any unnecessary assumptions, the latter approach is chosen.  This
inequality guarantees, with a probability of p, that the estimation error
will not exceed A, if the standard deviation does not exceed A /1-p.  There-
fore, (for p = 0.90 and A - 0.5) the constraints on the standard deviations
are 0.158 mg Cl/£ and 0.158 yg P/£.

    It is concluded that the GLWQB planned surveillance program does  not in-
clude enough stations to meet the detection criterion for phosphorus  (0.2
> 0.158 yg/£), but includes more than enough stations to meet the criterion
for chloride (0.04 < 0.158 mg/£).  Note  that the measurement error variance
will be the same for 1977, 1985, 1986, 1994, 1995, 2003, and 2004 if  sixty
stations are sampled each year and the Kalman filter is not employed  to pro-
cess the data.  In order to meet the detection criteria (without the  Kalman
filter) a minimum of about 108 stations must be sampled according to  Equa-
tion (5.16) (Strategy I).

    Although the criterion for total phosphorus trend detection is not met
by the monitoring program proposed by the GLWQB, a strategy may exist that
satisfies the criterion but costs less than Strategy I.  An interactive

                                     26

-------
optimization algorithm  involving  the  solution  of  a sequence of linear pro-
gramming problems  (DePalma,  1977)  has  been  applied to  compute the minimum-
cost strategy that satisfies the  chloride  and  total  phosphorus detection
criterion every nine years  (Strategy  II).   The number  of  lake stations L(n)
is chosen between  zero  and  168, the  latter  corresponding  to two ships,
operating two weeks, each sampling six stations per  day.   The number of
days of tributary  sampling  D(n) is chosen  between  twelve  and 365.   The opti-
mal strategy is summarized  in Table 4.  Tributary  samples,  beyond the basic
twelve per year, are not requested by  the  strategy,  and  lakewide sampling is
only requested for those years when constraints have been imposed on the
standard deviations.  The standard deviations  corresponding to Strategy II
and all other strategies to be discussed in  this  chapter  have the following
basic characteristics.  The chloride  standard  deviations  are well  below im-
posed constraints  of 0.158  mg/ji for all thirty years,  and the phosphorus
standard deviations equal the constraint of  0.158  yg/£ for  the years when a
constraint is imposed and exceed  0.158 yg/£ for all  other years.  Clearly,
requirements for phosphorus detection  control  the  measurement strategy and
the optimal strategy would  not change  if chloride  were ignored in the opti-
mization .

    Employment of  the Kalman filter has reduced the  cost  required for trend
detection.  If four water samples  are  collected at a station and analyzed
for all basic chemicals, including chloride  and total  phosphorus,  then the
total cost incurred per transect  is $3,320.   If one  water sample is  col-
lected from each tributary  and analyzed for  all basic  chemistry, then the
total cost incurred in  sampling the system  is  $3,300 per  day. "I  Only twelve
days of tributary  sampling  are requested by  the optimal  strategy,  which re-
presents a saving  of $46,200 each year, and  lake  samples  are requested only
for those years when constraints  are  imposed.   The number of lake stations
required for each of the four years when a constraint  is  imposed is  shown in
Table 4 as Strategy II.  If no model were  employed (Strategy I), 108 sta-
tions are required each year resulting  in  a  four-year  cost  of $239,000.   The
model permits a reduction in cost of  lake  sampling to  $227,000.   The overall
savings for lake and tributary sampling is $1,398,000  over  a thirty-year
period.

    Only $12,000 in lake sampling costs are  saved  by utilizing the mass
balance model because the formulation  of the standard  deviation  of the tri-
butary load as a function of D(n) is very conservative (being based  on maxi-
mum observed discharges and Maumee River concentration data).   The above
minimum-cost strategy has been recomputed under somewhat  more  realistic  as-
sumptions:   that daily discharges equal the  average  daily discharge  on re-
cord for a tributarty; that a-| and 02  are  1/50 of  the  values previously as-
sumed;  and the a-j" and ap assume the values originally  assigned to  a-]  and a?.
The number of lake stations requested  by the optimal strategy (Strategy II)
is presented in Table 5 and is substantially less  than for  the conservative
assumptions.   The costs for the more liberal ($153,000) and  conservative
($227,000)  cases bracket the cost that would be computed  if  actual values
for the discharges and standard deviations were available today.
'Sampling costs provided by William Richardson, EPA, Grosse He.

                                    27

-------
TABLE 4.  REQUIRED NUMBER OF SAMPLING STATIONS AND COSTS FOR CONSERVATIVE
                   FORMULATION OF TRIBUTARY LOAD ERROR

Year
Strategy
I
II
III
IV
V
VI
VII
VIII
IX
X
XI
XII
XIII
XIV
1977
108
106
106
106
92
84
83
76
67
88
70
52
34
25
1986
108
108
108
108
107
95
91
78
68
90
72
54
36
27
1995
108
100
100
100
100
94
91
72
63
82
65
46
29
21
2004
108
97
97
96
96
93
91
69
61
79
61
43
26
18
Cost
[$]
1,625,000
227,000
227,000
227,000
218,000
201,000
196,000
162,000
142,000
187,000
148,000
107,000
68,000
49,000
                                     28

-------
TABLE 5.  REQUIRED NUMBER OF SAMPLING STATIONS AND COST FOR MORE
         REALISTIC FORMULATION OF TRIBUTARY LOAD ERROR
Year
Strategy
I
II
III
IV
V
VI
VII
VIII
IX
X
XI
XII
XIII
XIV
1977
108
106
106
106
84
56
55
76
67
88
70
51
33
25
1986
108
107
107
107
106
64
0
77
68
89
72
53
36
27
1995
108
45
41
33
45
36
0
33
29
36
27
18
11
8
2004
108
19
14
10
19
16
0
14
13
14
10
6
3
2
Cost
[$]
1,625,000
153,000
147,000
141,000
139,000
94,000
30,000
110,000
96,000
125,000
98,000
70,000
45,000
33,000
                              29

-------
    It is of considerable interest to determine the reduction  in measurement
cost that can be realized by applying research funds to the study of the
various mass balance model parameters.  Inaccuracies in these  parameters  de-
pend on uncertainty in the pollutant concentrations in outflowing water,  un-
certainty in the effective SM flow, uncertainty in atmospheric loads, and
uncertainty in tributary loads.  Reduction of uncertainty associated with
the outflowing water would require the implementation of a routine sampling
program and will not be considered here.  The second component may be re-
duced by intensive current metering and sampling in the SM to  identify the
nature and magnitude of the return flow.  Consider a hypothetical situation
in which the effective flow at SM is known precisely.  The minimum-cost mea-
surement programs in Tables 4 and 5 (Strategies III) have been computed
based on this hypothesis, with conservative and liberal assumptions for the
formulation of tributary load error variance.  The costs entered in these
tables represent the minimum costs achievable under such a research program
because the effective SM flow cannot be measured precisely.  The difference
between Strategies II and III must be weighed against the cost of the re-
search.

    A third component of the covariance matrix may be reduced  by an inten-
sive atmospheric-load sampling program.  The program would not have to be
routine because atmospheric loads would not be expected to change signifi-
cantly over the next thirty years.  Consider a hypothetical situation where
the atmospheric loads are known precisely.  The minimum-cost measurements  in
Table 4 and 5 (Strategies IV) have been computed and represent an additional
reduction in cost compared to Strategies II.  The savings must be weighed
against the cost of collecting and analyzing precipitation and dustfall
samples.

    The fourth and final component of the covariance matrix, uncertainty  in
tributary loads, has been assumed to be controllable in this report.  How-
ever, because of the high cost of sampling the tributary system, none of  the
strategies computed so far or to be computed, contain any samples beyond  the
routine twelve per year.  This is true for both the conservative and liberal
assumptions concerning the tributary load error variance as a  function of
the number of samples.

    The apparent annual phosphorus loss rate, X3, is also a model parameter
but has been interpreted as a state variable in the stochastic model.  This
permits the linearized Kalman filter to estimate this rate in  the same man-
ner as it estimates chloride and phosphorus lakewide average concentrations.
Of course, the estimate is not based on measurements of the rate itself but
rather on measurements of the total phosphorus lakewide average concentra-
tion.  The standard deviation of X$(n) is constant between measurements and
decreases when a measurement is made.  In general, the accuracy of the esti-
mate can be significantly improved by the end of the thirty year sampling
program.  However, it would be desirable to dedicate enough research to the
mechanisms of total phosphorus removal in Lake Michigan to reduce the pre-
sent uncertainty in X3.  If mechanisms of sinking and removal  could be cor-
related with apparent settling velocity, then the range of uncertainty in  X3
could be reduced from the present range of +_ 19.6 m/yr.  Minimum-cost mea-
surement strategies in Tables 4 and 5 have been computed under hypotheses

                                     30

-------
that the range of S is reduced  immediately  to ± 5.0 m/yr  (Strategy  V),  ±1.0
m/yr (Strategy VI), or ± .5 m/yr  (Strategy  VII).   It  is evident  that  this
improvement in model accuracy can provide a significant reduction  in  mea-
surement cost.  A decrease in the range of  S beyond +. 5.0 m/yr,  however may
be difficult given the present  understanding of  phosphorus  exchange me-
chanisms.

    Next, consider the effect of  total phosphorus  laboratory  analysis error
on the optimal cost.  The variance of  laboratory analysis errors affects the
error of a lakewide measurement in Equation (5.16) where  6? is the  variance
resulting from spatial heterogeneity and y^ "is the variance resulting from
laboratory error.  Minimum-cost measurement (Strategy VIII)  and  (Strategy
IX) in Tables 4 and 5 correspond  to cases where  the laboratory standard de-
viation is reduced from  its present value of 1.0 yg P/& to  0.5 yg P/£ and
0.1 yg P/£.  The reduction in measurement costs  is significant but  must be
weighed against the research cost necessary to  improve the  present  analysis
technique.  Note that the reduction in error is  limited by  the variation of
point concentrations about the  lakewide average  concentration that  cannot be
altered.

    Finally, reductions  in the  optimal cost can  be achieved by relaxing the
criterion of trend detection.   Instead of requiring that  changes of 1.0 mg
Cl/£ and 1.0 yg P/£ be detected with 0.90 probability, detection with 0.88,
0.85, 0.80, 0.70, and 0.60 probability may  be acceptable.   The Chebyschev
Inequality can be used to compute the constraints  that must be imposed  on
the standard deviations  of estimation errors in  order to  meet the detection
criterion.  If it is assumed that the estimation errors are normal, then, in
order to meet the detection criterion with  0.90  probability,  standard devia-
tions would only have to  be maintained below about 0.3 mg Cl/£ and  0.3  yg
P/&.  This approximately  corresponds to the constraint imposed by the
Chebyschev Inequality for 0.60  probability.  Minimum-cost Strategies  X  to
XIV have been computed corresponding to probabilities of  0.88 to 0.60.  The
cost of detection at 0.60 probability  is about one-fifth  the  cost of  detec-
tion at 0.90 probability, and a sharp  increase in  cost is incurred  as the
probability increases from 0.80 to 0.90.  Specification of  trend-detection
criterion will likely remain an ad hoc scientific  decision.   The purpose of
the methodology applied  in this section is  to analyze the effect of parti-
cular uncertainties on the estimation process once detection  criteria have
been established.

    The optimal measurement strategies discussed in this  section and  sum-
marized in Tables 4 and  5 have  been computed based on trend detection every
nine years.  In general,  the Kalman filter  approach for trend detection de-
monstrates its superiority over conventional procedures (no model,  only
data) more lucidly if detection is required every  year.   Consider the case
in which liberal  statistics are employed for tributary loads  and all  other
statistics are as presented in  Table 3.  This case will,  therefore, be
analogous to Strategy II except error variance constraints  will  be  imposed
every year instead of every nine years.  The optimal  strategy and cost  are
summarized in Table 6.   It is observed that the  required  number  of  stations
is greatly reduced after  the first few years of  the program.  If the  Kalman
filter were not employed, then  108 stations would  have to be  visited  each

                                     31

-------
TABLE 6.  OPTIMAL MEASUREMENT STRATEGY AND COST FOR DETECTION
                          EVERY YEAR
n
1
2
3
4
5
6
7
8
9
10

Number of
stations
106
89
70
54
42
33
27
22
18
15

n
11
12
13
14
15
16
17
18
19
20
Total
Number of
stations
12
10
9
8
7
6
5
4
4
3
Cost = $301,600
n
21
22
23
24
25
26
27
28
29
30

Number of
stations
3
3
3
2
2
2
2
1
1
1


                               32

-------
year costing $1,793,000 over thirty years.  The optimal  cost  is  only
$301,600 when the filter is employed.


SUMMARY

    A nonlinar model has been derived that describes the  long-term  dynamics
of chloride and phosphorus lakewide average concentrations  in  Lake  Michigan.
Statistics for each uncertain parameter  in the model have been approximated
from available data.  The nonlinear model has been  linearized  about a  nomi-
mal, so that the linearized Kalman filter can be employed.  Finally, the
role of this model was explored for measurement strategy  design.

    The monitoring strategy proposed by  GLWQB cannot detect changes of 1.0
yg P/£ at 0.90 probability with or without the Kalman filter.  However, a
minimum-cost strategy that meets the detection criterion  every nine years
has been computed.  The optimal strategy contains limited tributary samples
for both conservative and liberal formulations of the tributary  load error
variance.  It is concluded that such samples are not a cost-effective  source
of information for detecting trends in lakewide average  concentrations.  It
is also concluded that the phosphorus error variance constraint  controls the
number of lake stations required for trend detection because the  chloride
error variance satisfies the constraints and thereby has  no effect  on  the
optimal strategy.  Furthermore, the optimal lake sampling strategy  is  very
simple.  Lake samples must be collected  only for those years when the  detec-
tion criterion must be satisfied.

    An investigation was conducted to determine the sensitivity  of  the opti-
mal cost to the statistics of the mass balance model parameters.  It is con-
cluded that the calculated cost is most  sensitive to the  variance in the ap-
parent phosphorus settling velocity, not as sensitive to the accuracy  of
laboratory analyses, and rather insensitive to the  accuracy of the  effective
SM flow and the atmospheric loads.

    The actual design of a Lake Michigan surveillance program  entails  econo-
mic and practical considerations and would take into account other  goals be-
sides chloride and total phosphorus trend detection.  The analyses  presented
here simply illustrate the manner in which Kalman filtering techniques  can
aid in defining and subsequently reducing the requirements of  trend surveil-
lance programs.
                                    33

-------
                                 SECTION 6

               AN APPLICATION TO ANNUAL PLANKTON AND NUTRIENT
                           CYCLES IN SAGINAW BAY
INTRODUCTION
    Section 5 described the use of Kalman filtering techniques to derive
optimal measurement strategies for long-term trend detection of chloride  and
total phosphorus concentrations in Lake Michigan.  The major goal of Section
6 is to conduct a brief preliminary investigation of the applicability of
similar techniques to define efficient sampling strategies for phytoplank-
ton, zooplankton, and nutrients such as nitrogen and phosphorus in Saginaw
Bay.  Annual rather than long-term dynamic cycles and interactions will be
considered in an aquatic system where significant spatial gradients also
exist.

    The EPA Large Lakes Research Station at Grosse He has supported an in-
tensive limnological monitoring program in the bay since 1973.  Measurements
from this program have been used to develop nutrient and plankton models  for
the system.  These models have been used to predict phytoplankton levels  in
the bay following enactment of various schemes to control the level of phos-
phorus input loading.  These calculations of phytoplankton levels are based
on model equations and parameter estimates derived by comparing the model
output with actual phytoplankton concentrations observed during 1974 and
1975.  It is assumed in these calculations that independent laboratory and
field data are not available to determine the model parameters.  However,
data in the literature are used to establish ranges for these parameters.

    Several interesting and practical questions arise concerning the reli-
ability of such procedures.  For example, how good are the estimates of the
predicted peak or average phytoplankton concentrations under the assumed
conditions?  It is known that uncertainties in the calculations are caused
by imperfect knowledge of each model parameter and each state variable ini-
tial condition.  It would be useful to know how well the model parameters
and initial conditions must be known in order to calculate the peak phyto-
plankton concentration to within a known or specified degree of certainty.
If it is impractical or impossible to quantify the model parameters and ini-
tial conditions sufficiently, it may be necessary to supplement model cal-
culations for future conditions with measurements to obtain more accurate
estimates of peak phytoplankton concentrations.  In these cases it is neces-
sary to determine the appropriate minimum number, frequency, and location of
measurements.
                                    34

-------
    Systematic and formal procedures  have  not  previously  been  used  to  ad-
dress and resolve these  kinds of questions  associated  with  plankton-nutrient
models.  However, the results in this  section  illustrate  a  promising  ap-
proach to these  issues using Kalman filtering  techniques.

    Another class of problems in Saginaw Bay concerns  the design  of cost ef-
ficient sampling programs that  can be  used  to  batter define  the value  of
model parameters and initial conditions.   For  example,  suppose it is  desired
to calculate the peak phytopjankton concentrations within a  specified  toler-
ance using the model without'supplemental  data.   Then,  the  techniques  can  be
used to estimate the required certainty associated with the  model parameters
and initial conditions.  This desire  to determine the  model  parameters  and
initial conditions is best fulfilled  by performing field measurements
directly in the  bay.  It is of  interest to  determine cost efficient sampling
schemes (number  and frequency of samples)  to accomplish these  goals.   This
section introduces a methodology involving  the  use of  Kalman filtering  tech-
niques that can  be used  to calculate  suboptimal  and, eventually,  the  least
expensive sampling program.


SAGINAW BAY NUTRIENT-PLANKTON MODEL

    Spatial and  temporal variations of limnological variables  in  Saginaw Bay
are determined by solving dynamic differential mass balance  equations  for
each of five finite segments or cells  as illustrated in Figure 3.   This fi-
gure also shows the location of the Saginaw River, which  is  the major  source
of nutrients to  the bay.  The bay is  coupled to  Lake Huron  by  advective and
dispersive hydraulic transport.  The  general flow pattern among various
model segments is represented by arrows in  Figure 3.   The major biological
and chemical kinetic reactions  and forcing  functions that affect  the plank-
ton and nutrient annual  dynamics are  illustrated  in Figure 4.  The  dynamic
behavior of the  limnological state variables in  each completely mixed  finite
segment of the bay is obtained  by numerically  integrating a  system  of  ordi-
nary differential equations of  the form


         V g£ - Ac + VR  + W                                            (6.1)


where V is an nxn diagonal matrix of  volumes, c  is an  nxl vector  of the
state variables, A is an nxn matrix of advective  and dispersive transport, R
is an nxl  vector of kinetic interactions, and W  is an  nxl vector  of input
forcing functions for c.  For m interacting variables  and n  finite  segments,
a total of mxn equations must be solved.  For this example,  the state  vari-
ables are phytoplankton  (expressed as  chlorophyll a),  herbivorous zooplank-
ton,  carnivorous zooplankton, organic  nitrogen,  nitrate nitrogen, ammonia
nitrogen,  organic phosphorus, and reactive  phosphorus.

    A simulation model  called LAKE-1 was developed by Thomann  e_t  a\_.,  (1975)
to describe the dynamics of similar phytoplankton-zooplankton-nutnent
interactions in Lake Ontario and the  effects of  nutrient loadings on those
interactions.  This model was subsequently  applied to  Saginaw  Bay by


                                     35

-------
     SAGINAW BAY
1974 Sampling Network
   Legend oBoat Station
         A Water Intake
                Segment
                                   Oi	110 KM
                                   0"	HOMI
        Saginaw River
 Figure 3.  Saginaw Bay model segments, sampling stations, and
          flow pattern.
                          36

-------
                             Material Sinks
                                 (m/T)
                                              Physical Transport
i
I
i
I
   Phytoplankton
   Concentration
  Nutrient
Concentration
        Time




Zooplonkton
(m/L3)
\
Prey
I
Gra*
1
'ing
Phytoplonkton
(m/L3)
Nutrient
Limitation
1


Nutr
(m/
i
Nut
\
tents
L3)



hentUse
\

i
Material Inputs
(m/T)


Input Flow Growth Rote Growth Rote Grazing Rate

S~~~
Temperature
^
Temperature
/^
Solar Radiation
jiA/n
— HI I!V*~^/N«-W«
Time
    Figure 4.  Saginaw  Bay model  biochemical reactions and
               forcing  functions.
                               37

-------
Richardson and Bierman (1976).  The Richardson and Bierman model was  adapted
for this project to run on MTS (the University of Michigan Terminal System).

    A general computer program was developed that implements models that  can
be described by a set of differential equations using an Adams predictor-
corrector numerical integration method called DVDQ (Krogh, 1969).  Solutions
for a particular model are obtained by calling a subroutine that evaluates
derivatives for the model using Equation (6.1) and any data files  necessary
to define time variable forcing functions.  Temperature and light  cycles  are
examples of time variable forcing functions used in the model.  The equa-
tions for the derivatives and input data for this project were taken
directly from a computer listing supplied by the EPA.


ERROR PROPAGATION WITHOUT MEASUREMENT

    The theoretical development in Section 4 has suggested how the Kalman
filter can be used to calculate the minimum-variance estimate of a state
variable based on both a linear stochastic difference model for the system
and actual field measurements.  This section examines state variable  error
propagation for cases where measurements are not employed.  Such applica-
tions might arise when the models are used to predict future water quality
conditions that result from alternative nutrient control procedures.

    Variance error propagation for linear systems is defined by Equation
(4.4).  This equation can be applied to nonlinear systems by constructing a
linear system that defines the deviations of the states from a nominal  solu-
tion generated from current best estimates of the parameters and state  ini-
tial conditions (see Appendix A).

    Estimates of the means and standard deviations for the model parameters
are listed in Table 7.  Standard deviations for the parameters were obtained
by defining a range of values for a given parameter from the literature
(Canale ejt _al_., 1976, Richardson and Bierman, 1976, Thomann et jil_., 1975)
and assuming a uniform distribution over this range.  For the purposes  of
this illustration and to limit computer costs, only seven parameters  were
chosen for study, while the other ten were assumed to be known perfectly
(see Table 7).  The initial values of the states are dependent on  observed
data and vary from model segment to segment.  For the purposes of  illustra-
tion, standard deviations associated with the states were arbitrarily
equated to 0.25 times their mean value.

    The computer generated nominal gives the time variable behavior of  the
states for each of the five model segments.  These states depend on the
model structure, parameter values, and forcing functions.  The nominal
states for model segment 5 are plotted in Figure 5.  A 19-week time interval
was used in the calculations to approximate the first phase of the phyto-
plankton growing season and to reduce computation costs.  Time zero in  this
figure corresponds to a date of April 1.  The calculated nominal states
agree reasonably well with measured values of the states for all model  cells
and variables (Richardson and Bierman, 1976).


                                    38

-------
   TABLE 7.  ESTIMATED MEANS AND STANDARD DEVIATIONS FOR MODEL PARAMETERS

Parameters
Nitrogen
Half-saturation constant
Organic nitrogen decompositon
rate
Ammonia to nitrate nitrifica-
tion rate
Nitrogen-chlorophyll ratio
Phosphorus
Half-saturation constant
Organic phosphorus decomposi-
tion rate
Phosphorus-chlorophyll ratio
Zooplankton
Conversion efficiency
Endogenous respiration rate
Herbivorous and carnivorous
zooplankton grazing rate
Phytoplankton
Chlorophyll half-saturation
constant
Endogenous respiration rate
(at 20 )
Settling velocity
Saturated growth rate
Optimum light intensity
Mean
value

0.025

0.007

0.015
0.01

0.005

0.007
0.00083

0.06
0.0026

0.04


20

0.0
0.05
0.20
350
Standard
deviation

0

0

0
0.00175

0

0
0.0003

0
0.0011

0.01


0

0.0225
0
0.091
0
Units

mg/£
-1 -1
day deg

day" deg"
mgN/ygCh

mg/£

day" ' deg"
mgP/ygCh

-i -I
day" 'deg"

£/mg C day deg


g/£
-i
day"1
m/day
day"!
langley/day
Carbon
    Carbon-chlorophyll ratio
0.025
0.02
mgC/ygCh
                                    39

-------
o  100

i
u
-006
o>
6
  0031-
CL
(S)
O


a
                                                0.8
             CHLOR A
                                   HERBIV

                                     ZOO
                                                     3
                                                      0>
                                                     O
                                                     s
0.4  >
                                                     CO
                                                     o:
                                                     LU
                                                     X
                                                0.02
             PHOSPHATE
                                                0.0\
                                                     CL

                                                     O
                                                     o:
                                                     o
    1.0
 Ld
 a:
   0.5
                                                0.1
                                                     ^


                                                     E
          NITRATE N
                            AMMONIA N
                                                0.05 §
        0         5          10         15

                       WEEKS

          Figure 5.  Nominal states for model segment 5.
                           40

-------
    Model application might require  information regarding  the  uncertainty
associated with a model caused by  imperfect knowledge of the parameters  and
initial states, assuming that the  model structure  is exact in  all  other
ways.  These standard deviations have been computed according  to  Equation
(4.4) assuming no measurements are made.  Column  (b) of Table  8  lists  the
calculated standard deviations at  the peak values  of chlorophyll  a,  herbiv-
orous zooplankton, and carnivorous zooplankton using the data  in  Table 7 for
the uncertainty in the seven selected parameters.  Column  (c)  from this
table lists the calculated standard  deviations at  the peak chlorophyll a and
zooplankton levels for a case where  the standard  deviations for the  seven
selected parameters are reduced by a factor of ten.  Column (d) gives  simi-
lar calculations for the case where  the original  uncertainties are used for
the seven parameters but the uncertainties associated with the initial
states are reduced by a factor of  ten.  These results are  an indication of
the magnitude of error propagation that would result when  an uncalibrated
model is run using uncertain initial values,  literature ranges for model
parameters, and accurate forcing functions.

    The results in Table 9 show that the  largest  calculated standard de-
viations occur in segment 1 for all  three states.  Therefore,  according to
these results, model predictions of  the peak populations without measure-
ments are the most unreliable in segment  1.  Furthermore,  the  calculated
standard deviations at the peaks in  segment 1 are  large compared  to  the mean
value of the peak populations (see column a).  This, of course, indicates
that the model has limited value for predictions  of peak states in segment 1
without measurements.  Uncertainty accumulates as  a function of time accord-
ing to Equation (4.4), which for this case depends only on the nature  of the
system equations and parameter values.  Relatively large errors occur  in
segment 1 because it has a small volume and depth, relatively warm tempera-
tures, and high nutrient levels.   On the  other hand, the model has rela-
tively good predictive capability  for peak plankton levels in  segment  5,
which has a large volume and depth and low temperatures and nutrient levels.

    The calculated standard deviations for the peak populations improve when
the standard deviations associated with the seven  parameters are  reduced by
a factor of ten.  Such a reduction might  be achieved by measurement  of these
parameters in laboratory or field  experiments.  The effectiveness  of these
improvements follows the flow pattern, which  is in a counterclockwise  direc-
tion.  Standard deviations are only  reduced about  1 percent in segment 4
whereas about 55 percent improvement is calculated for segment 5.  An  oppo-
site phenomenon occurs when the uncertainty of the initial  states  is reduced
and the uncertainty in the parameters remains the  same.  Reductions  in the
calculated standard deviations are most dramatic  upstream  in segment 4
(about 84 percent) and smaller in  segment 5 (about 11 percent).

    These calculated trends for the  standard deviations of the states  are
consistent with physical considerations.  Segments 4 and 2 are influenced
more by boundary forcing functions and initial conditions  and  less by  the
structure of the equations and the model  parameters.  The  changes  in the
states of segments 1, 3, and 5 are effected by input from  adjacent model
segments and the structure of the  model and are relatively insensitive to
the initial conditions.

                                     41

-------












CO
^r
o
HH
I—

	 i
ZD
CL.
O
Q-
LU
Q_

LU
h-

1—


GO
O GO
t— I 1 —
•=£ LU
. . «^-
l~~l ^~
>i i i
LU
LU o;
O ZD
GO
f ^ e^,
C£. LU
< s:
o
z o
1— LU
GO rvj
O rv
•sz o
< u-
•£.
LU
Q
LU
1—
	 1
ZD
o
co

LU
CD
^f
1—
















•o
cu
c >
T3 O O
^ re) -J3 Q.
T3 T3 res E
^— * c" »i — »i—
re) >
4-* CU S~
GO -O O
4-
T3
CU
C >
T3 0 0
S- -i- S-
O "O re) E
. 	 * j^ >r— »i—
res >
4-> CU 5-
GO TD O
<+-


r-~
c~ re)
-DOE
C »^— t
>— ™ ^_
— ~ res 4-> O
.a ~a res c
« 	 ^ ^- •[—
re) > s-
4-> O) O
GO T3 4_





^^
^,



































T3 CO
c~ ^.
i — re! i — cu
•i- co E CU
4-> 0) S- E
•r- 4-> O (O
c re> c s_
•r- 4-> re)
CO CX

CO ^~
S- res
cu E •—
4-> S- re) co
E C 4-> 4->
to -r- re)
S- -a c: 4->
re) c: -r- co
Q. rci


,_
co re!
M- i- -i-
O CU -t-J CO
1,3 •,— QJ
co eu c 4->
QJ £ -I- (O
3 res -t->
i — S- T3 CO
res res c
> ex res

4->
cu re)
i — cu re)
re) 4-> cu
> re) ex
c: co cu
res _c
CU C|_ 4J
2! 0

4— ^"^
O CO
cu rei cu
E CU CD
1 — ' — "
"CD •— •
T3 r- 0
o cu c:
s: u










CO
CU
1 *
re!
i *
GO

i —
CD

O
SI



CO f^. *s3~
en co «3-
CM CM en
r— tO







^1" i — LO
vo co ^d~
en en to
LO r—
n
CO





co en ^*
O CM CO
O O CM
i — > O O
-c: s- s-
CX 0 0
o > >
J_ •!- -r-
O -Q C
<— i- S-
^: cu re)
o x: o



co en CM
•*•—<-
r— CO IT)








•— CO CM
co CM r~~
tO LO LO
i— CM







CO CO LO
en LO i —
to LO to
i— CM







CM <* O
LO CM i—
CO 0 0






en «* r^



CM CM CM


c: c:
0 0
^ ^*
c c
re) re)
ex a.
o o
0 0
re) N N

i — CO CO
r— 3 ^
>> O O
-c s- s-
ex o o
0 > >
i- -1- -r-
O J3 C
i— S- S-
.c CD re)
t •> -r c i



OO r— LO
*3- 1 — 1^
i — CM CM
LO







to o oo
o «3- co
i— en i—
CO
A
CM





CM co r^-
CO LO LO
r— en to
oo
«s
CM





oo to ^f
to ^t- >j-
IO O O
OO





to co to



OO CO CO


c c
0 0
^. -^
c c:
res re)
'a. "a.
0 0
o o
re) N N

i — CO CO
i — 3 3
>> O O
.c s- s-
cx o o
o > >

O -Q C
i— S- S-
_C CU re)
C_> 31 C_3


oo en to
to o co
(^ O CM
^~








co en LO
to en oo
^••=3-1—








CM to r^.
r^ O co
^f LO i —








CM r^ «3-
LO i — i —
•=* O O






O LO 1^



«xj" *^- ^3"


C C
O O
-* JXL
$~_ £~
re) re)
"a. 'a.
O 0
o o
re) N N

i — co co
r— 3 3
>-> O O
-c i. s-
0. 0 0
0 > >
S_ -r- -r-
o *~* c
r— S- S-
jc cu re)
c_> x: o


o <=3- r~.
<^- i — oo
«3- o o








r— ^" t—
CM i — «^-
CM 0 0








O O LO
en CM LO
*d- o o









CM en to
O CM i —
^- o o






CO r— LO



LO LO LO


C C
0 0
^£ ^.
c. c
re) re)
ex ex
o o
0 0
re) N N

i — co co
i — 33
>5 O O
-c s- s-
ex. o o
o > >
i_ *i — •) —
O -Q C
•— S- i-
.c CD re)
o x: o
42

-------
    The results of  these few  simulations  are  provocative  and  raise several
 interesting questions.  However,  time  and  funding  limitations and  the origi-
 nal purpose of this  section  (to  investigate the  large-scale  problem poten-
 tial of the approach  described in Section  4)  did not  permit  additional
 analysis.  For example, it would  be  important to study the effect  of param-
 eter identification with parameters  other  than the  seven  selected,  to per-
 form a more extensive  study  of the trade-off  between  more accurate knowledge
 of initial conditions  as opposed  to  more  accurate  knowledge of parameters,
 and to perform Monte  Carlo type  simulations to determine  the  validity of
 Equation  (4.4) in predicting  standard  deviations for  this class of problems.


 SUBOPTIMAL MEASUREMENT  STRATEGY  GENERATION

    The previous section has  examined  the  propagation of  errors for the  peak
 plankton  populations  when models  alone were used to predict future condi-
 tions without supplemental measurements.   Because  the uncertainties asso-
 ciated with these predictions are unacceptably large, direct  measurements
 and Kalman filtering  techniques  have been  employed  to reduce  the error vari-
 ance associated with  the state variable estimates.  A major goal of this
 section is to apply these techniques to a  problem  that is so  large that
 automatic optimization  as applied to trend detection  in Lake  Michigan (Sec-
 tion 5) is either too expensive  or of  questionable  value  because of limited
 statistical knowledge of the  chemical  and  biological  mechanisms.

    The computer program that was utilized in Section 5 to determine optimal
 measurement strategies  is a  general  purpose program (DePalma,  1977).   The
 optimization algorithm  in the program  is very efficient for certain classes
 of problems, but it  involves  considerable  storage.  Automatic computation of
 full optimal sampling strategies  for Saginaw  Bay involves computer  storage
 requirements well above the  limit of the computing  facilities at the Univer-
 sity of Michigan.  This difficulty can  be  addressed by modifying the ap-
 proach to the problem to take advantage of the structure  of the governing
 equations or by employing only the simulation portion of  the  program to
 analyze various measurement  strategies.  In this case,  only the simulation
 portion of the program  has been employed to demonstrate how reasonable sub-
 optimal measurement strategies can be  developed with  a minimal  amount of
 computation.

    The mathematical model described earlier  in this  section  is a  nonlinear
 differential equation system  involving  eight  state  variables  in each  of  five
 bay segments.  Furthermore,  it has been assumed that  seven parameters in
 each segment have unknown values  that  are  to  be improved  during the measure-
 ment process.  Thus, seventy-five differential equations  define the state
model.   Only the growing season  (ti  =  1,2,...,19)  is  considered for  the
 range of admissible measurement times  because of the  large number  of  differ-
 ential  equations.  The variance of a measurement can  be improved by averag-
 ing a larger number of samples.   However,  practical considerations  dictate
 that measurements at any time involve  all  eight state  variables.   In  this
 section,  the minimum measurement  variance  in  a given  week corresponds to a
maximum practical sampling effort, while the  maximum  measurement variance
 corresponds to no measurement.  The  measurement strategy  designer  then

                                     43

-------
selects, with the aid of the Kalman filter, the set of weekly measurement
variances, times, and segments that results in a cost-effective  sampling
program with acceptable accuracy.

    The first step in this suboptimal measurement strategy selection  process
is the generation of the covariance matrix for the full measurement case
(that is, minimum variance measurement for each week in each segment) which
corresponds to a calibrated model with a full data set.  Next, the covari-
ance matrix is determined for the zero measurement case (that is, no  mea-
surements at any time) which corresponds to an uncalibrated model.  These
cases indicate the minimum and maximum variances attainable by the sampling
program.  The purpose of the optimal sampling strategy is to determine which
partial data set will be adequate for accurate state estimation.  Figures 6
to 9 show the nominal states and standard deviations for chlorophyll  a,
herbivorous zooplankton, phosphate3 and nitrate for the full and zero mea-
surement cases in each cell.  The dots represent the nominal values,  the
vertical lines to the left of the dots indicate a one standard deviation
range for the zero measurement strategy, and the vertical lines  to the right
give the one standard deviation range for the full measurement case.  In
some segments the variance grows so rapidly that the zero measurement case
gives little information, whereas some predictive capability exists (without
measurement) in segment 5.  This indicates that it is unacceptable to use
uncalibrated models for prediction without supplemental measurements.  These
measurements of course correspond to what are normally required  for model
calibration and subsequent verification.  The Kalman filter can  be used to
help define cost-efficient measurement programs.

    Figures 10 to 13 show how measurements can be used to improve estimates
of the model parameters.  The dots represent the best estimate of the param-
eter before the sampling program, the vertical bars to the left  of the dots
indicate the one standard deviation range without any measurements, and the
vertical bars to the right of the dots indicate the one standard deviation
range with a full measurement strategy.  Measurements can only improve
knowledge of a parameter.  Note that in most cases the improvements are bet-
ter in segment 5 than in segment 1.

    The full and zero measurement time history plots for all of  the state
variables and parameters were used to hypothesize a simple suboptimal mea-
surement strategy.  The following guidelines were useful in designing this
suboptimal strategy.

    (1)  Determine the segments in which measurements are the most
         useful (for example, measurements in segment 5 are preferred
         to measurements in segment 1).

    (2)  Determine the earliest time when most of the variance begins
         noticeable decreases in the parameter plots (for example,  in
         most cases this occurs between t = 3 and t = 6).

    (3)  Determine the time region where the maximum variance decrease
         occurs on most of the seventy-five plots (for example,  this
         occurs most frequently between t = 6 and t = 11.

                                    44

-------
t
,
1
u
II
M

0
, 	 '- 	 ,
, » ,
It
H
L
J
J
&-
1

in
^n
^r
jj
jj

in




o

o in
f^ ^^ -
f/Bn/jv aoiHO
a-  o
    m
                       9w
                          ^
                          UJ
                          Lu
    m
 0
     O
       UJ
       LJ
    IO
    O
                                                                          9£
                                                                             UJ
                                                                             UJ
                                                       U)
                                                                    -o
                                                                    E
                                                                    fO
                                                                    i-
                                                                    o
                                                                    q-

                                                                    ro
                                                                    o.
                                                                    o
                                                                    o
                                                                     S-
                                                                     o
                                                                    q-

                                                                     a
                                                                     o
                                                                     to
                                                                     O)
                                                                    -a
                                                                                        ro
                                                                                       -O   •
                                                                                        E  10
                                                                                        ro  0)
                                                                                       +->  in
                                                                                        00  ro
                                                                                           O
                                                                                       •a
                                                                                        E  -!->
                                                                                        (O  E
                                                                                             O)
                                                                                        (O  S-
                                                                                       4J  3
                                                                                        00  CO
                                                                                           ra
                                                                                       i—  d)
                                                                    •r-  O

                                                                     O  CU
                                                                    •ZL  N
                                                                     0)
                                                                     i-
                                                       O
                      45

-------
                         Oco
                           LU
                           UJ
                        in
                                                             UJ
                                                             LU
                                                          i-
                                                          o
                                                         tf-

                                                          E
                                                          O
- O
  00
(T/5uj) ooz
            o
  ro
  O
m

O
 (r/6uj/ OOZ AI9H3H
y
H
HI _
H
M
H
H
M -
1 *






1
in
C)
H
JL,
A
A
A
A
H
a-
i
m
CM
O

o
— CO
LU
LU
^
in



O
                                                                     H  O
                                                                _ '_.
                                                        -]
                                                                       m
                                                                         LU
                                                                         LU
                                                                     A
                                                                     A

                                                                     i
                                              o
                                                         m
                                                         ro
                                                         O
                                                                    (B

                                                                    "o.
                                                                    o
                                                                    o
                                                                    N
                                                                    O

                                                                    o
                                                                                S-
                                                                                o
                                                                                q-
                                                                                
                                                                    ^-> S
                                                                     10
                                                                       o

                                                                     fo a)
                                                                     c: N
                                                                    •i—
                                                                     E -a
                                                                     o c
                                                                    'z. to
                                                                    I—


                                                                    O)


                                                                    3
                                                                    C7>
        ooz Aiea3H
                                       46

-------
                     UJ
                     LU
                  -O
CM
O
      o

3lVHdSOHd
                                                                 O
                                                                 UJ
                                                                 UJ
                                                                 O
                                        O
     m
     O
BlVHdSOHd
                                                                            o
                                                                            s-
                                                                            OJ
                                                                            N
                                                                            c
                                                                            fO
                                                                            4-
                                                                            S-
                                                                            o
                                                                            (O
                                                                            .E
                                                                            O-
                                                                            to
                                                                            O
                                                                            o
                                                                            •i —
                                                                            -t->
                                                                            (O

                                                                            >
                                                                            
                                                                             (T3 tO

                                                                             O) O
                                                                             10 O)

                                                                            r— CU
                                                                             CO S-
                                                                             C 3
                                                                            •i— (/)
                                                                             E "3
                                                                             O 0)
                                                                            CO
                                                                             S-
                                                                             Z3
                                                                             cn
                                  47

-------
r
O CO

   UJ
   LU
   5
                                    o
in
                  m
                  s-
                                                                                                    LU
                                                                                                    LU
                                                                                                 O
                              q
                              rd
                                                                                                               o

                                                                                                               01
                                                                                                               c.
                                                                                                               fO
                                                                                                              q-

                                                                                                               s_
                                                                                                               o
                                                                                                              q-

                                                                                                               O)
                                                                                                              -i->
                                                                                                               re
                                                                                                               5-
                                                                                                               O
                                                                                                               fO

                                                                                                               >
                                                                                                               OJ
                                                                                                              -a

                                                                                                              -a
                                                                                                               S-
                                                                                                               ea
                                                                                                              •o
                                                                                                               E
                                                                                                               (O
                                                                                                              4->
                                                                                                               CO

                                                                                                              •a  to
                                                                                                               E  0)
                                                                                                               n3  co
                                                                                                                  (O
                                                                                                               a>  o
                                                                                                              •— cu
                                                                                                               fO S-
                                                                                                               C 13
                                                                                                              •i— CO
                                                                                                               ^~ fO
                                                                                                               o cu
                                                                                                              Z= E
                                                                                                              a-i
                                                                                                               cu
                                                                                                               s-
                                                                                                               en
                                                                             Ni
                                                     48

-------
CO
O
CO
C>
 I
ro
O
                         in
                         Oco

                           UJ
                           LL)
                        - O
                                                                    Oi
                                                                      UJ
                                                                     in
                                                                     O
(,/op) 31Vd H1MOH9
   NOiMNVldOlAHd
                                           to
                                           O
                                                                               c
                                                                               O
                                                                               rO
                                                                               a.
                                                                               o
                                                                               O.


                                                                               
                                                                               M-  (/)
                                                                                  rc3
                                                                               to  O
                                                                               c
                                                                               O  -4->
                                                                               •i-  C
                                                                               -i->  a>
                                                                               (O  E
                                                                               •i-  a>
                                                                               >  s-
                                                                               O)  3
                                                                               •O  co
                                                                                  fO
                                                                               -a  aj
                                                                               S-  £
                                                                               ta
                                                                               ^a  o
                                                                               c  s-
                                                                               (O  O)
                                                                               -M  N
                                                                               •a
                                                                               c:
                                                                               ro

                                                                               CO i
                                                                               d)
                                                                              E  s-
                                                                              •i-  o
                                                                              •4-> 4-
                                                                              co
                                                                              CU  0)
                                                                                -t->
                                                                              S-  fd
                                                                              a;  s-
                                                                              +->
                                                                              a> ^:
                                                                              E 4->
                                                                              n3  2
                                                                              i-  O
                                                                              ro  S-
                                                                              D-  CD
                                                                                a>
                                                                                s-
                                                           H1MOW9

                                                 N01»NVHd01AHd
                                        49

-------
    in
    O
    m
    O
    in
    o
                            in
                              LJ
                              UJ
                            m
                          - O
 (6ap

9NIZVH9 NOlMNVTdOOZ
                                                                        _ in
                                                                          m
                               U-'
                               LU

                               $
                                                                          O
     8
                                                                                   c
                                                                                   o
                                                                                   O-
                                                                                   o
                                                                                   o
                                                                                   CU
                                         in
                                         O)
                                                                                     fO
                                                                                     O
                                                                                   O C.
                                                                                  •r- CU
                                                                                  +J e
                                                                                   (O CD
                                                                                  •r- S_
                                                                                   > 3
                                                                                   CU 
                                                                                   a)  c
                                                                                   E T-
                                                                                   rci  NI
                                                                                   S--  ro
                                                                                   res  S-
                                                                                   Cu  CD
                                                                                   s~
                                                                                   13
ONIZVb'J NOl>iN\ndOOZ
                                          50

-------
               8
               o
                00
                O
                O
o
                              O en
                              "~ y:
                                LU
                                LU
                              IT)
                                                                             LO

                                                               LU
                                                               LU
                                                                             o
   O
   O
IMOI.LVaidS3H NOlMNVldOOZ
S
o
o
                                                                      c:
                                                                      (O

                                                                      "o.
                                                                      o
                                                                      o
                                                                      tsl   •
                                                                         IO
                                                                      CD  QJ
                                                                      _c  10
                                                                      -!->  (O
                                                                         o
                                                                      s_
                                                                      O  -M
                                                                      (4-  E
                                                                         OJ
                                                                      
                                                                      fO 
                                                                      •r- (t3
                                                                      •4-> S-
                                                                      co
                                                                       (O
                                                                      OJ S-
                                                                      E T-
                                                                      n3 CX
                                                                      S- CO
                                                                      fO Ol
                                                                      Ou S-
                                                                                      O)
                                             N01»NVldOOZ
                                            51

-------
         O
         O
         O
         O
                             O in
                             ~- ^
                               UJ
                               UJ
                            If)
                             O
9*

  UJ
  5


in
                                                                      - O
                                                                                Q.
                                                                                O
                                                                                S-
                                                                                o

                                                                               .C
                                                                                O
                                                                                S-
                                                                                o
                                                                                _c
                                                                                Q.
                                                                                l/l
                                                                                O
                                                                                ^:
                                                                                CL

                                                                                CU
                                                                                S-
                                                                                o
                                     O CO
                                    •r- OJ
                                    -P to
                                     rO rt3
                                    •r- O
                                     >
                                     O) 4->
                                    -O E
                                        O)
                                     00 £

                                    "O O
                                     c: s-
                                     03 CL)
                                       N
                                     to
                                     O) TD
                                     +-> c
                                     res fo
                                                                                tO
                                                                                O>
                                      M-
                                     O)
                                     E O
                                     (O -i-
                                     S- +->
                                     fO fO
                                     Q- S_
                                                                                O)
                                                                                S-
llAHdOdOlHD-SnHOHdSOHd
nAHdOdOlHD-SnaOHdSOHd
                                          52

-------
    Application of these guidelines suggests  a  suboptimal  sampling  program
that consists of full measurements only  in segments  2  and  5  at  t  =  3,  8,  9,
and 10.  The rationale for this strategy  is that most  of the variances do
not start decreasing until t = 3 and that the maximum  variance  decreases
occurred in the region t = 6 and t = 11.  Both  sides of the  bay are sampled
because of the segmented structure of the system.  Segment 5 is  chosen be-
cause  it gives the most improvements overall, and  segment  2  is  chosen  from
the west side because it is connected to  two  segments.  However,  estimates
from segments 2 and 4 have similar variance levels,  so the choice between
these  segments is somewhat arbitrary.  This suboptimal sampling  strategy  re-
quires approximately 8 percent of the effort  of the  full measurement
strategy.

    A  summary of some of the major characteristics of  the  suboptimal strat-
egy (as compared to the full measurement  case)  is  presented  in  Tables  9 and
10.  The current best estimates of the standard deviations of the parameters
are shown for all model segments in column (a) of  Table 9.   The  standard
deviations attainable with the full measurement strategy are shown  in  column
(b).   The standard deviations are reduced by  approximately an order of mag-
nitude in segments 3 and 5, whereas approximately  a  half order  of magnitude
reduction occurs in segments 1, 2, and 4.  Larger  reductions in  segments  3
and 5  are probably due to the counterclockwise flow  pattern  in  the  bay and
the structure of the system equations.  Thus, for  parameter  identification,
more information is gained from measurements  in segments 3 and  5  compared to
segments 1, 2, and 4 for the same measurement cost.

    The standard deviations for the parameters with  the suboptimal  measure-
ment strategy are shown in column (c) of  Table 9.  Substantial  reductions in
the standard deviations are obtained even though this  strategy  involves only
8 percent of the effort of the full measurement strategy.  Again, the  most
significant improvements are obtained in  segments  3  and 5.   It  is interest-
ing to note that estimates in segment 3  are more accurate  than  in segment 2
even though no measurements are performed in  segment 3 while measurements
are performed in segments 2 at t = 3, 8,  9, and 10.

    Important aspects of state variable  (as opposed  to parameter) estimation
are presented in Table 10.  Standard deviations of the estimates  of chloro-
phyll  a, herbivorous zooplankton, and carnivorous  zooplankton at  their peak
population concentrations are presented for a number of cases.   The calcu-
lated  standard deviation at the peak population in each segment  with the
full measurement case is given in column  (a).  Column  (b)  presents  the
standard deviations at the peak population with the  suboptimal  strategy.
Suboptimal standard deviations are approximately an  order  of magnitude
larger than the full measurement standard deviations,  except for  chlorophyll
a in segments 2 and 5.  Chlorophyll a is  estimated more accurately  in  these
segments because the suboptimal strategy  involves  measurements  in segments 2
and 5  during the critical peak growth period for chlorophyll a.

    Herbivorous and carnivorous zooplankton are not  estimated nearly as ac-
curately as chlorophyll a in segments 2  and 5.  The  reasons  for this are
twofold.  First, the suboptimal measurement strategy does  not contain  mea-
surements during the important growth periods for  herbivorous and carniv-

                                     53

-------



















o;
0
i -
UL_
oo
LU
	 1

>

ry
LU
1—
LU
CXL OO
CC LU
Q_ i — i
O
U_ LU
O (—

oo cc:
0 to
t — t
1— 1—
< z
1— < 1 1 1
^^ ^-
LU LU
o o:

Q OO
eC LU
Q 2:
t^.
«=c oo
1— ZD
oo o
1 — 1
LU CC
1— >
eC
<
c— '

.
cn

LU

m

1 —






















-a
CU CO
•— -M > c:
i— c o i— o
3 CU S-  •<-
• 	 S- S- •!- T- -O
CU 13 d C
4-> CO _E -i- O
14- ro +J O
ef CU -i—
E 5
-o
CU
i— -l-> > S-
i— c o a.'
3 CU S- 4-> CO
C|- E O- CU CU
— • cu E E 3
T3 S- S- -i- ro r—
~-- CU 3 S- rO
-1— J t/) C" f« ^
M- ro +J Q.
eC CU •<-
E 3

i
i— CU
S- ro S-
• — » a> i E 3 -M
O -M _Q •!- CO C
- — <4- 3 -(-> ro CU
0 E


1
CU
S- S-
•— * CU i — 3 +->
jO -Mi — CO C
- — 14- 3 ro CU

13 CO
O CU


s-
•— CU
O 3
c























^_
cu

cu
E

S-
rd
Q-



o
^t~ *3~ r^
LO i — OJ i — OO
O cn i — i — LO r-~ o
UD O OJ O OJ O O
r— i— O O O O O
o o o o o o o






o
UD OJ CO LO OO
i — o LO cvj cn *3- cvj
oo o f~~- r--. cn r— o
LO CVJ O O O O O
o o o o o o o
o o o o o o o



o
oo
cn «3- co co oj
oo oo cn UD LO co O
oo cn to o cn i — o
«* i— O O O O O
o o o o o o o



UD
en r~>. LO LO
1JD 1 — !*•» i — 1"-- CO O
CO <3- OJ O CO O O
CVJ r— O O O O O
o o o o o o o



LO i —
O LO O i — O r-- oo
i — OO CD i — O i — O
cn cvj i — o oj o o
o o o o o o o







cu
ro
S-
o
-C CU •!-
-4-> -M O -M
S ro CU T- »O
OS- -M O -M S-
S_ rd •!- rd
cn c s- -t-> S- i —
O CU n3 i —
-a -i- -M c s- r— >>
CU +-> rd O i — -C
4-> ro S- •!- i— >, Q.
> Q. S-
rs CL c s- .c: o o
+J CO •!- •!- O- S- 1 	
rd CD N Q- O O JZ.
CO S- rd CO S- i — CJ
S- CO O jc
c: c cn S- i — o o
o o -E -t->
-M -M c c: o o
^: .* O o -t-> oo
C C -M +J O 3
rO rd -i} >-> O O S.- -M O
^: -E O O  — O
<£> r— O O •— O O
O 0 O 0 O O O
o o o o o o o




OJ o
cn r-» LO ID
oo LO *3- «j- co oj i—
OJ ID VO O O i— O
«* •— O O r— O O
o o o o o o o


ID
OJ ^t" ^^
r^« f^* Ovj ^xj" oo
LO co r^. i — o oo O
cn i — oj o «*• o O
OJ ' — O O O O O
o o o o o o o



LO i—
O LO O •— O 1^ 00
r— OJ O 1— O 1— O
cn oj r— o cvj o o
o o o o o o o




CVJ CVJ OJ OJ OJ OJ OJ


cu
ro

o
-C CU -i-
+J -M O 4->
3: re CU -r— rd
OS- .M O -M S-
S_ rd -r- rd
cn c: s_ -M s- r—
O CU rd i —
-0 -i- -M C S- ,— >,
CU +-> rd O i — -C
+-> ro S- -i- i— >> Q-
rd S- +-> r— .c O
S- -i- cn ro >> CL i-
3 Q- C S- .C O O

rO CU M Q. O O -C
CO S- rO CO S- i — O
S- CU O _C
c c cn s- •— u o
O O _C -M
-M -M C C O O
jy: ^K; o o -t-> co
C C -M +-> O 3
rO ro * c J-
>— •— c c: cu o
D. O. ro ro c Cn_c
O O i — i — O o O.
-P -M CL CL.a S- co
>> >^> O O S- 4-> O
£1 .c O O rd •.- ^:
D_ a. M r-J c_3 2: Q.

r^ co
cn [••«• i~^ o r~»
oo LO cn LO i^^ ID o
<— oo •* o i^ r— o
CO OJ O O CD O O
o o o o o o o
o o o o o o o






LO i—
CO LO CO O CO
cn to co LO cvj i — o
"JD r—. CO O ID r— O
OJ r— O O O O O
o o o o o o o
o o o o o o o



1^,
cn cvj o
CVJ i — CVJ «3- •*
r^ cn o «3~ i — cn O
CO CVJ OO O LO O O
•— •— o o o o o
o o o o o o o



OO I^~ CO
r*^ *^" r*^. co cn t-D <~~
^j~ ^d~ *JD o cn cvj o
ID LO O O O O O
O O O O CD O CD
O O O O O O O



LO i —
O LO O i — O f**. CO
r— CVJ O r- O i— O
en cvj i — o oj o o
o o o o o o o




CO OO CO OO CO CO CO


cu
ro
S-
o
-c: cu •[-
-t-> +-> O -M
S ro  O -M S-
S_ rd -i- ro
cn c s- +-) s- i—
O CU ro i —
T3 -i- -M C S- i — >)
CU 4-> ra O i — .C
-M ro S- -r- r— >> O.
rO S- -M i— -C: o
S- •<- cn ro >i Q. S-
3 a. c s- -c o o

rO CU M CL o O -E
co S- ro co S- >— O
S- CU O .E
E E Cn S- r- O O
O O -E -M
-M 4-> E E O O
-*: .* O O +-> CO
E E 4-J +-> O 3
ro rO ^^ -^ -M E S-
i— r— E E CU O
Q. CL rO rO E Cn_E
O O r— i— O O Q-
-M -P Q. CL r*t j_ co
>i >> O O S- •»-> O
J= .E O O (d -i- JC
Q- Q- M IVI C-3 ^ D_

^-^
-^
eu
3

>r_
-M
O
CJ













































































54

-------


































. — *.
~o
cu
•r-
O
C_)


•
cn
UJ
1 J
CO
^£
1—






































T3
CU CO
r— 4-> > C
i— c o i— o
3 o> S- res -r-

Ol CU E 4-> •'-

cu 3 c c
4_ res 4-i o
ea: cu -i-
E S
-a
cu
i— 4-> > S_
i— C O CU
3 O) S- 4-> CO
c&_ g= Q_ CU CU
'— > O) E E 3
ID J_ S- -r- res i—
- — cu 3 S- res
•4—5 t/1 -^* fO ^
<4- reS 4-J Q-
=i: CU •<—
E 2

1
r— CU
S_ res s-
'— - CU 1 E 3 4->
O 4-) J3 -i- CO C
— M- 3 4-> res a>
cC CO O- CU E
0 E
,
cu
s- $-
-— • O) i— 3 4->

1 — " <4— 13 res cu
<^ cu. E
cu
[ * 4-3
c res
^ CU E
res s_ -r-
• 	 S- 4->
13 CO
c_> cu


S-
r— O>
'aJ'E
C_> 3























J_
cu
4_>
OJ
E
res
S-
res
Q-






•^i- cn CD
•* -vi- cn i — co
o i — «3- i — cn cn o
cn o CM o *d- o O
CM •— O O O O O
o o o o o o o






O 00
O I^ CM r—
co ^J- cn co in cn CM
co cn r-~ o *3~ i — o
00 i— O O •— O O
o o o o o o o
O O O 0 O O O




O CM CO *d"
^J- CO O CD CD *3" i —
cn r^> r^^ o co i — o
CD i — O O i — O O
o o o o o o o

o cn
CM CO ^f i — "3-
cn o CM i — co o o
r^ CM co o Ln i — O
CO i— O O O O O
o o o o o o o


cn r—
o cn o i — o r«~ co
i — CM O i — O i — O
cn CM i — o CM o o
o o o o o o o





*3- <* ^t- «3- «=j- ** ^±

cu
4->
res
S-
0
^: a> •<-
4-) 4-> O 4->
S res cu -i— res
OS- 4-> O 4-> S-
S- res T- res
en c: S- 4-> S- i —
O cu res i —
T3 •<- 4-> C S- i— >,
a; 4-> res o i — .c:
4-> res s- -i- r— >> Q.
res s_ 4-> i— jc o
s- -i- en res >i Q. s-
3 o. c S- -c o 0
4-3 CO -r- -f- Q. S- 1 	
res cu N Q. o o -C
co S- res co S- i — o
s_ cu o .C
c c: cn s_ r— o o
O O -C 4->
4-> 4-> C C U O
^* ^£ O C> 4-» CO
C C 4J 4-> O 3
res res ^^ NX 4-3 c S—
•— r— C C 0) 0
o_ Q- res res cz cn ^.
O O i— r— O O O_
4-> 4-> Q. Q.JD S- tO (
>> >> O O S_ 4-> O i
jn .c o o res -i- ^i
D- D- M rxi o z: a. |
55

in cn
co t — oo ^)- LO
tn CD co co r-» i — o
co t— CM o cn CM o
CD CM O O O O O
o o o o o o o
O 0 O O O O O






CO ^J"
i — co r~- i — CD
rv. o co CM i — i— o
cn «=j- CM o en i — o
CO i— O O O O O
o o o o o o o
o o o o o o o


cn
cn co CM cn
cn o i — CD en CM
r-. co r-. CM co cn O
i— CO O O r— O O
CM O O O O O O
O O O O O O 0

cn in co o
 4-3 O 4-3
2 reS CU •' — reS
OS- 4-> O 4-> S-
S- reS -i— res
cn c S- 4-> S- r-
O cu res i—
"D -r- 4-> C S- 1— >,
cu 4-> res o i — -c
4-> res s- -i- i— >, o.
res s- 4-J " — .c o
s_ -i- cn res >> Q. s-
3 Q- C S- -C O O
4-3 CO •!-•!- Q. S- 1 	
res cu N O- o O x:
co s- res co S- i — o
S- CU O .C
c c: cn s- i— u o
O 0 -C 4->
4-3 4-> c c: u o
^ ^t O O 4-> CO
C C 4-> 4-> O 3
res res _*: _i*: 4-3 c: i-
i— i— C C QJ O
O- O- res res c: cnx:
O O i — i — O o Q.
4-> 4-J D- CL_D $_ co
>> >i 0 0 S- 4J O
-C ^: o o res -r- x:
Q- D- M r~>l C_3 Z Q-


-------













Q<
o
u_

oo

1— H
1—
i

O-
O
D_

^
LU LU
CL. l— I
03
LL. LU
O H-
^^
OO O£
z I—
o oo
i — i
& l~~
i — i iii
^* ^~
LU LU
Q o:
§ <:
<=C LU
r"*t ^"

oo o
O O£
LU 
<=C

^3
C_)
O
O
LU
	 1
CO
^


















I — -f_>
I 	 C~
3 CL)
— x 4— E
-O CD
• — S-. S-
CU 3
M— ro
•=£. 0)
E

•— -M
i — C
3 CU
•+- E
-^ CU
O S- S-
• — 0) 3
_L j ,/ JO
<=C 00





S-
~-, OJ
rO +J
• — t|_
^






































0) 00
> C
0 r— 0
S_ ro •<-
Q.-I- +J
E •*-* '("~
•r- -r- T3
C C
J^ •! O
-(-> 0

2
CU
> S-
O  Q-
•1 —
2

|
i— 0)
ro S-
E =3 4J

Q. CD £
0 E


1
CU
s_
i— 3 4->
r— 00 C
=3 ro CU
H_ r^


ro oo
O> CU ^^
E Q- CU
f^ M- 2
0 --^


S-
i— CU
O) i — J3

O Q) 3
s: o c







00
0)

f~i
rO
»i—
S_
ro
^»

CU

rO
j S
OO




oo o
F-^. 00
LO ro i—
ro o o
CM 0 0
o o o





CM 00

ro oo i —
OO O O
CM O O
o o o




CT) VO
o LO r^.
sf ro i —
CM O O
CM O O




r— O
00 CO
r^ oo i —
OO O O
CM O O
O O 0




oo sj- r>-







r—' i — r—




c: c
0 0
"^ S
c c;
ro ro
Q. Q.
0 O
O 0
ro N N

i — OO OO
' — Z3 ""^
>5 O 0
JT s- s-
Q. O O
0 > >
S_ -r- T-
O J3 C
•— ±- s-
JZ CU ro
C_5 3Z O



•— CO
LO CO
GS LO r—
O O O
00 O O
o o o





f^v. ^—
r— OO
i — LO i —
00 O O
CM O O
o o o




LO Sf
i — f^, r^.
Sf Sf r^
00 O O
O 0 O




LO O
co cr>
O LO i—
i— O O
OO O O
o o o




CTi sf r*.







CM CM CM




5- j-
0 0
+2 t^
!Z C
ro ro
'Q.'Q.
0 0
o o
ro N N

i — oo oo
•— =3 3
>i 0 0
-c s- s-
Q- O O
0 > >

0 JD c
•— s- s-
-CT Ol ro
C_3 HI C_)



O sf
r— VO
i— OO r—
r-^ o o
sf O O
o o o





sf oo
O to
OO OO r—
CM O O
sf O O
o o o




r*-* o
LO CT) IO
OO CM .—
OO O O
oo o o




i— LO
n— to
sf OO i—
en o o
sf O O
O O 0




to oo to







oo oo oo




c: c
0 0
"*"* ^
c c
ro ro
'Q.'Q.
0 0
0 0
rO NJ tsl

I — 00 00
i — ^3 ^3
>> 0 0
JT s- s_
CL O O
o > >

o Jo c
r- i- S-
JT OJ ro
O 31 O

r^
LO Sf
O to
CM O
CM O O
CTi O O
r- O O
O 0 O



r--
sf sf

CM O
LO O O
oo o o
•— o o
o o o


ro
oo sf
o to
i — CM O
CM O O
sf O O
r- 0 0


00
to sf
o to
CM O
sf O O
O O O
CM O O
O 0 O




O LO r-^







sf sf sf




c c:
0 0
4-> 4->
c c
ro ro
Q. CL
O 0
0 0
ro tsl N

i — 00 00
i — 13 3
>> 0 0
J^ ^. i-
Q. O O
0 > >
i- -r- •<-
O JO C
r- S- S-
jz rjj ro
0 3= C_>


sf
00 >—
OO CM
oo r^ o
en o o
.— 00
O O 0




r— CM
CM r—
r^ CM
CT> O O
to o o
r- 0 0
000



CTi CO
LO O
r^ LO CM
CM O O
CM O O
000



CM LO
Sf r—
r^ CM
to o o
CTi O O
.— 00
o o o




OO r— LO







LO LO LO




c c
0 0
-(-> 4->
c c
rO ro
Q. CL
o o
0 0
(T3 hsj N

i — 00 C/l
r— 33 ^?
>» 0 0
JT s- s_
Q. O 0
0 > >
S_ •,- -r-
O r^ c~
•— S- S-
JZ CU ro
c_> re o
56

-------
orous zooplankton.  Second, the  variances  for  these  two  state  variables
change most rapidly for t > 10,  but the suboptimal measurement  strategy  does
not include measurements after t  = 10.  If  peak  population  prediction  of
zooplankton is desired, the measurements at t  =  8  and  10 should  probably be
eliminated and replaced by measurements t  = 11 and t = 17.   In  such  a  case,
the suboptimal strategy would consist of full  measurements  at  t  =  3, 9,  11,
and 17.

    It is also of  interest to determine how the  uncertainty of model states
changes when knowledge of the model initial states and parameters  improves
when a full measurement strategy  is employed.  The first case  involves full
measurements with  an order of magnitude reduction  in the initial standard
deviations of the  parameters.  The second  case considers full  measurements
with an order of magnitude reduction in the initial  standard deviations  of
the state variables.  Physically, the first case corresponds to  a  systematic
improvement of the parameter values.  This might occur during  the  season
year of a sampling program if the first year of  monitoring  resulted  in an
order of magnitude reduction of the parameter  standard deviations, or  it
could occur as a consequence of  laboratory  or  field  experiments.   Case 2
could correspond physically to a  situation where an  extensive  initial
sampling effort allows the standard deviations of  the  state variables  to be
reduced by an order of magnitude.

    The results of these cases are summarized  in columns (d) and (e) of
Table 9 and columns (c) and (d) of Table 10.   A  comparison  of  the  results in
columns (d) and (e) with column  (b) in Table 9 indicates that  standard devi-
ations are reduced by approximately a factor of  two  to four and  that im-
provements of parameters gives somewhat better results than improvements of
initial conditions for parameter estimation accuracy.  It is interesting to
note that the peak concentration  standard  deviations are approximately the
same for all three full-measurement cases  (see Table 10).   Thus, an  order of
magnitude reduction in the initial parameter or  state  variable standard  de-
viations does not  increase the capability of making  more accurate  peak value
predictions.  This is probably because of  large  increases in the standard
deviations of chlorophyll a, carnivorous zooplankton,  and herbivorous  phyto-
plankton during critical growth periods, and only measurements  in  these  re-
gions can insure reasonably accurate values.   This is  in sharp contrast  to
the results presented earlier with no measurements where improvements  in the
state initial conditions and parameter values  significantly reduce the cal-
culated standard deviation at the peak populations (see  Table  8).


SUMMARY

    This section contains a preliminary investigation  of the potential ap-
plicability of Kalman filtering techniques for concentration and parameter
identification associated with a phytoplankton and nutrient cycling model
for Saginaw Bay.  It has been determined that  the  accuracy  of uncalibrated
model predictions is limited.  Therefore, field  monitoring  and laboratory
research programs are necessary to improve model predictive capabilities.
Cost-effective monitoring programs must be designed  using trail-and-error
optimization rather than automatic optimization  because  of  the relatively

                                   57

-------
large number of differential equations required to describe the annual
nutrient and plankton cycles.

    A suboptimal, but effective, sampling program has been developed to  im-
prove the accuracy of the model calculations.  An example monitoring program
for Saginaw Bay involves approximately 8 percent of the maximum number of
samples,  it has been determined that this sampling program applied in the
context of the Kalman filter and the system model can significantly reduce
the uncertainty associated with predicted values of the state variables  such
as chlorophyll a.  Several tables are presented showing these results.   Al-
though the limited analyses in this study are preliminary and incomplete, it
is concluded that Kalman filtering techniques can be applied to determine
the accuracy of limnological model predictions as a function of the uncer-
tainties associated with the model parameters or coefficients and initial
conditions.  If the accuracy of such models is unacceptable, even with good
knowledge of parameters and initial conditions, it may be necessary to
supplement model predictions with field measurements of the concentrations.
In these cases, Kalman filtering and optimization techniques can be used to
determine efficient and cost-effective monitoring programs.
                                     58

-------
                                 REFERENCES


Beck, B., and P.C. Young.  1976.  Systematic Identification of DO-BOD.
    Journal of the Environmental Engineering Division, Vol. 102, No. EE5,
    pp. 909-927.

Canale, R.P., L.M. DePalma, and A.H. Vogel.  1976.  A Plankton-Based Food
    Web Model for Lake Michigan.  In:  Modeling Biochemical Processes in
    Aquatic Ecosystems, R.P. Canale, ed., Ann Arbor Science, Ann Arbor,
    Michigan.

Chapra, S.C.  1975.  Comment on an Empirical Method of Estimating the Reten-
    tion of Phosphorus in Lakes by W.B. Kirchner and P.O. Dillon.  Water
    Resources Res., 11: 1033-1034.

Chapra, S.C.  1977.  Total Phosphorus Model for the Great Lakes.  ASCE J.
    Environmental Engr. Division, Vol. 103, No. EE2, pp. 147-161.

Davenport, W.B., Jr.  1970.  Probability and Random Processes.  McGraw-Hill,
    New York.  542 pp.

DePalma, L.M.  1977.  A Class of Measurement Strategy Optimization Prob-
    lems—with an Application to Lake Michigan Surveillance.  Ph.D. dissert-
    ation, University of Michigan, Ann Arbor, Michigan.   122 pp.

Dillon, P.J. and W.B.  Kirchner.  1975.  Reply to Chapra (1975).  Water Re-
    sources Res., 11:  1035-1036.

Gelb, A.  1974.   Applied Optimal Estimation.  MIT Press, Cambridge,
    Massachusetts.  374 pp.

Great Lakes Water Quality Board.  1976.  International Surveillance Program
    for the Great Lakes.  In:  1975 Annual  Report, Great Lakes Water
    Quality, Appendix  B, Windsor, Ontario,  Canada.  255 pp.

Harrison, M.J.,  R.E. Pacha, and R.Y. Morita.  1972.  Solubilization of In-
    organic Phosphates by Bacteria Isolated from Upper Klamath Lake Sedi-
    ment.  Limnology and Oceanography., 17: 50-57.

Hutchinson, G.E.  1957.  A Treatise on Limnology,  Vol. 1.   John Wiley and
    Son, New York.  1015 pp.

Jazwinski, A.H.   1970.  Stochastic Processes and Filtering Theory.   Aca-
    demic Press, New York.  376 pp.


                                     59

-------
Junge, C.E. and R.T. Werby.  1958.  The Concentration of Chloride, Sodium,
    Potassium, Calcium and Sulfate in Rain Water over the United States.  J.
    Meterology., 15: 417-425.

Kalman, R.E.  1960.  A New Approach to Linear Filtering and Prediction Prob-
    lems.  Trans. ASME, J. Basic Engr., 82D: 34-35.

Kalman, R.E. and R. Bucy.  1961.  New Results in Linear Filtering and Pre-
    diction Theory.  Trans. ASME, J.  Basic Engr., 83D: 95-108.

Kirchner, W.B. and P.J. Dillon.  1975.  An Empirical Method of Estimating
    the Retention of Phosphorus in Lakes.  Water Resources Res., 11:
    182-183.

Krogh, F.T.  1969.  VODQ/SVDQ/DVDQ -  Variable Order Intergrators for the
    Numerical Solution of Ordinary Differential  Equations.  Jet Propulsion
    Laboratory TU Document No.  CP-2308, NPO-11643.   22 pp.

Larsen, D,P. and H.J.  Mercier.   1976.  Phosphorus Retention Capacity of
    Lakes.  J. Fisheries Res. Board of Canada.,  33: 1742-1750.

Lorenzen, M.W., D.J. Smith, and L.V.  Kimrnel.  1976.  A Long-Term Phosphorus
    Model for Lakes:  Application to  Lake Washington.  In:  Modeling Bio-
    chemical Processes in Aquatic Ecosystems, R.P.  Canale, ed., Ann Arbor
    Science, Ann Arbor, Michigan.

Michalski, M.F.P., M.G. Johnson, and  D.M. Veal.   1973.  Muskoka Lakes Water
    Quality Evaluation, Report  No. 3:  Eutrophication of the Muskoka Lakes.
    Ontario Ministry of the Environment, Toronto, Ontario, Canada.  57 pp.

Moore, S.F.  1971.  The Application of Linear Filter Theory to the Design
    and Improvement of Measurement Systems for Aquatic Environment.  Ph.D.
    Dissertation, University of California at Davis, California.  218 pp.

Murphy, T.J. and P.V.  Doskey.  1975.   Inputs of Phosphorus from Precipita-
    tion to Lake Michigan.  EPA-600/3-75-005, U.S.  Environmental Protection
    Agency, Duluth, Minnesota.   27 pp.

O'Connor, D.J. and J.A. Mueller.  1970.  A Water Quality Model of Chloride
    in Great Lakes.  ASCE J. Sanitary Engr. Div., 4: 955-975.

Quinn, F.H.  1975.  Lake Michigan Beginning-of-Month Water Levels and
    Monthly Rates of Change of  Storage.  National Oceanic and Atmospheric
    Administration Report No. TRERL326-GLERL2, Boulder, Colorado.  32 pp.

Quinn, F.H.  1977.  Annual and  Seasonal Flow Variations through the Straits
    of Mackinac.  Water Resources Res., Vol. 13, 137-144.

Richardson, W.L. and V.J. Bierman, Jr.  1976.  A Mathematical Model of Pol-
    lutant Cause and Effect in  Saginaw Bay, Lake Huron.  In:  Water Quality
    Criteria Research  of the USEPA, EPA-600/3-76-079, Corvallis, Oregon.
    cp. 138-158.

                                     60

-------
Sage, A.P. and J.L. Melsa.  1971.  Estimation Theory with Applications to
    Communications and Control.  McGraw-Hill, New York.  529 pp.

Saylor, J.H. and P.W. Sloss.  1976.  Water Volume Transport and Oscillatory
    Current Flow through the Straits of Mackinac.  J. Physical Oceanography,
    Vol. 6, 229-237.

Sonzogni, W.C.  1974.  Effect of Nutrient Input Reduction on the Eutrophica-
    tion of the Madison Lakes.  Ph.D. Dissertation, University of Wisconsin,
    Madison, Wisconsin.  352 pp.

Stewart, K.M.  and S.J. Markella.  1974.  Seasonal Variations in Concentra-
    tions of Nitrate and Total Phosphorus and Calculated Nutrient Loading
    for Six Lakes in Western New York.  Hydrobiologia., 44: 61-89.

Thomann, R.E., D.M. DiToro, R.P. Winfield, and D.J. O'Connor.  1975.  Mathe-
    matical Modeling of Phytoplankton in Lake Ontario - 1.  Model Develop-
    ment and Verification.  EPA-660/3-75-005, U.S. Environmental Protection
    Agency, Corvallis, Oregon.  177 pp.

U.S. Army Corps of Engineers.  1975.  Lake Erie Wastewater Management Study:
    Preliminary Feasibility Report, Buffalo, New York.  167 pp.
                                     61

-------
                                 APPENDIX A

                 DISCRETE KALMAN FILTER AND OPTIMIZATION OF
                           MEASUREMENT STRATEGIES
INTRODUCTION

    The Kalman filter equations for a scalar differential equation process
model with discrete, full state measurements were presented in Section 4 to
develop insight into the basic procedure.  In this appendix, the Kalman fil-
ter equations for a discrete process model (for example, the discretization
of a system of differential equations by numerical integration formulas) and
discrete measurements will be presented along with the method employed to
minimize the cost subject to constraints on the state estimate variance.
DISCRETE KALMAN FILTER

    In this section, the process model is assumed to be of the form

         x(k+l) = <|>k x(k) + w(k), (k = 0,1,...,k)                      (A.I)

where x(k) is an n-vector describing the state at index k (for example, time
tk)» ^k is an nxn matrix describing the known relation between the state at
x(k) and x(k+l), and w(k) is the model equation error at k (for example,
model inadequacy, lack of knowledge of parameters in 4^, etc).  (Since the
index k is usually associated with time, k will usually be referred to as
time instead of index.)  Equation (A.I) is a linear difference equation;
however, most water quantity models are nonlinear.  The Kalman filter  and
optimization equations will be presented for the linear system of Equation
(A.I).  It will be shown how to transform nonlinear water quality process
models into the form of Equation (A.I) later in this section.

    Equation (A.I) may result naturally (for example, in Section 5 a dis-
crete model is employed to describe year-to-year changes of phosphorus,
chloride, and sinking rate) or from a discretization of a differential equa-
tion model (for example, in Section 6 the LAKE-1 differential equation model
is discretized to form a system of difference equations).

    As with the development of Section 4, to apply the Kalman filtering ap-
proach to Equation (A.I), statistics associated with the initial state,
x(o), and the model error, w(k), must be supplied.  Again it  is assumed that
the model error is zero-mean, sequentially uncorrelated (white) noise, that
is,

                                    62

-------
                EJw(k)] = 0   ,  E[w(k)w(j)T]= Q(k)5kj                 (A. 2)


              ( 1 , k=j
where  5, . =      ,  , .  , and the initial state statistics  are
         KJ    (. U , KrJ



              E[X(O)] = xQ   ,   PQ -  E[(x(o)-xo)(x(o)-xo)T].            (A-3)


    In Section 4 it was assumed that the full state was measured at each
measurement time.   In this section, the general situation  in which measure-
ments of  linear combinations of the states are made at certain times  is con-
sidered.  A linear  combination of states at time k is a weighted sum  of the
individual states.   If a row vector Cm is introduced to write concisely this
weighted  sum, then  a measurement of type m at time n may  be denoted
             = cm xCO + vmk>                                          (A. 4)

where ym^ is the scalar measurement of the  linear combination Cm x(k)  and
vmk ^s the associated measurement error with


                                Rmk  •                                   (A.5)

If at any time k all states are measured, then the Cm set  is the identity
matrix.  If M types of measurements are made at each time  k = 1, ...,  N,
then M times N measurements are made in total.  As an example of this  nota-
tion, consider an eutrophication model with states x-j,  ..., X3 corresponding
to the eight water quality variables:  phytoplankton, zooplankton, dissolved
phosphorus, particulate phosphorus, nitrate, ammonia, organic nitrogen, and
silicon, respectively.  The measurement type corresponding to a measurement
of only zooplankton biomass would be

         C-] = [0 1 0 0 0 0 0 0],                                       (A. 6)

and the type corresponding to a Kjeldahl nitrogen measurement (ammonia plus
organic nitrogen would be

         C2 = [0 0 0 0 0 1 1 01.                                       (A. 7)

The error variance for a measurement of type m at time  k,  Rm|<, depends on
the number of vertical hauls made at the station.  The  variance of a total
phosphorus measurement would depend on the number of subsample digested and
analyzed.

    The Kalman filter combines the information provided by the model in
Equation (A.I - A. 3) and the measurements in Equations  (A. 4 and A.5) to com-
pute the conditional mean estimate of the system state, denoted by x(k),
which is the estimate based on all measurements taken prior to and at time


                                    63

-------
kJ  The filter is recursive in that it permits x(k) to be computed from
x(k-l) and the measurement vector at time k.  Recursive estimation is  a com-
mon procedure, for example, least squares curve fitting is commonly formu-
lated recursively so that each data point is processed as it is received
rather than the data being batch processed when all are available.

    As in Section 4, the Kalman filter equations are applied recursively  in
two stages.  Given x(k-l), P(k-l), the first involves the propagation  of  x
and P to the next measurement time, k.  The state prediction is simply the
difference equation without the model error, tha+ is,
         x(k-) = k.-| x(k-l),

and the covariance propagation is
                                  Q(k-D,
                                                             (A.8)
                                                             (A.9)
which has an interpretation analogous to Equation (4.8) in Section 4, that
is, the covariance is influenced by the dynamics through the first term and
the model inadequacy through the second term of Equation (A.9).

    The second stage involves processing the measurement at time k, and the
equations are essentially the same as Equations (4.6 - 4.8) in Section 4
since discrete measurements were assumed there.  In particular
                       M
          P(k)  =
                           r   r  "
                           m  m
                        -1
                            R
                            'mk
                          P(k")
                                           and
                                                             (A.10)
x(k) - x(k") + P(k)
                             m=l
ymk-Cmx(k"
                                                                      (A.11)
The difference between these equations and Equations (4.6 - 4.8) result from
the following:  the special way that the measurements are treated at each k
in Equation (A.4) (that is, as a set of scalar measurements); the fact that
a full state measurement is not assumed; and, because this form is more con-
venient for the measurement strategy optimization problem.  Also, because
the measurements are treated as a set of scalar measurements at a given k
(which is advantageous for optimization of measurement strategy purposes),
the Kalman gain provides little insight for this case and is not introduced.
As with the case of Section 4, note that if Rm|< is relatively large (indi-
cating a relatively bad measurement), then the coefficient of P(k~) in Equa-
tion (A.10) is approximately I, which implies P(k) * P(k"), that is, the
  n this section, x(k) is analogous to x(tk) of Section 4.  The + sign  is
 not employed because it becomes awkward for indices of the form k-1, k+1,
 etc.
                                    64

-------
measurement has  little  influence.   Conversely,  if  the  R^ are  relatively
small, the inverse of         j
                        I +  Y.  CT C /R ,
                            m = 1  m  m  mk

will also be relatively small, which results  in a  reduction  of P(k~)  in
forming P(k).


MEASUREMENT STRATEGY OPTIMIZATION

    It is evident from  Equation  (A.11)  that the estimate  for x(k)  for all
k = 1, ..., K cannot be computed until  the measurements  have been  taken.
that is, until ymk are  available for each m = 1,  ...,  M  and  k  = 1,  ..., K.
However, the accuracy of the estimates, which is manifested  in the  covari-
ance matrix P(k) for k  = 1,  ..., K, can be precomputed from  Equations (A.9)
and (A.10) once  the necessary  matrices, vectors,  and scalars have  been
specified.  This point  is vital to  the  technique  for definition of  optimal
measurement strategies.  The estimation error covariance  matrices  P(k) can
be computed for  every k = 1,  ..., K once the  measurement  strategy  has been
specified for k  = 1,  ..., K.   Computation need  not wait  until  the  actual
measurements have been  taken.  The  Kalman filter  exhibits this property be-
cause the statistics of model  uncertainty and measurement uncertainty are
not computed during the processing  but  rather are  assumed known prior to
time zero.  The manner  in which the statistics  might be  quantified  are de-
monstrated in the applications in Sections 5  and  6.

    The vector estimates, x(k), for k - 1,  ..., K  are  comprised of  n  indi-
vidual state estimates  x-j(k) for i  = 1, ...,  n.   The accuracy  of x-j(k) is
reflected in the estimation error variance of x-j(k), tnat is the itn  dia-
gonal entry of the estimation  error covariance  matrix  P(k).   It is  of con-
siderable interest to investigate the effect  of the measurement strategy on
the variances P-j-j(k).   In particular, it may  be of interest  to calculate  the
minimum-cost measurement strategy that  satisfies  constraints

         P-j-jtk^Pmax-jk  for  i = 1, ...,  n and  k  =  1,  ...,  K.            (A. 12)

In Equation (A.12), P-j-j(k)  is  a scalar, and the Pmaxik are bounds  that are
imposed to guarantee that the  resulting estimates  of x-j(k) are of  sufficient
accuracy to serve some  purpose.  In Section 5,  for example,  values  of Pmax-j^
are chosen to guarantee that the state  estimates will  be  accurate  enough  for
long-term pollutant trend detection.

    The optimization problem of this section  will  be formulated to  be com-
patible with DePalma (1977).   In order  to accomplish this, assume  that a
measurement ym|/  is actually the arithmetic average of  a  number of multiple
measurements.  In particular,  suppose L^ multiple  measurements are  made of
the form

         y1' = Cmx(k) =  r1'  for  i = 1,  ..., l_k                         (A. 13}
                                     65

-------
where ri is a measurement error (associated with y1) and zero mean  and  vari-
ance Rm.  The resultant measurement is the arithmetic average of these  L|<
measurements, or
If the correspondence

                     Lk

          rmk = IT   £  r'                                            (A-
                 K  1 — I

is made, then Equations (A.4) and (A.14) are equivalent.  Furthermore,  if r1'
are independent random variables, then the variance of the resultant mea-
surement error is inversely proportional to Lk,

         Rmk = Rm/Lk-                                                  (A.16)

    For many types of measurements, the number of multiple measurements  is
limited by such practical considerations as the number of stations that  can
be sampled in one day or the number of analyses a laboratory can schedule.
For such situations in which each measurement must be computed from no more
than Lmax multiple measurements,

         0 £ L < Lmax for each k = 1, ..., K.                          (A.17)

Furthermore, the cost of multiple measurements is usually proportional to
the number of multiple measurements.  Therefore, the total cost of a mea-
surement strategy would be

            K

           k^1 a  Lk                                                    (A.18)


where a is the cost of taking one measurement of each type m = 1, ..., M.
The unit cost a and the maximum number of multiple measurements Lmax are as-
sumed not to be functions of time k.

    In the optimization problem, Lk is interpreted as a variable that  can be
chosen from the interval  defined by Equation (A.17).  Besides having control
over the measurement strategy, it may be desirable to control the model
error covariance matrices defined by Equation (A.9).  The motivation for the
formulation that follows  is the trend detection problem examined in Section
5.  In that problem, one component of model error is caused by tributary
load uncertainty.  The variance of this error, however, depends upon the
number of days for which  the tributaries are sampled.  In this optimization
problem, both a measurement strategy (in the sense defined in the preceding

                                    66

-------
paragraph)  and  a  tributary  sampling  program will  be sought that minimize
cost  and  satisfy  Equation  (A.12).
    First, define  an  nxn matrix  at  each  time  k

         n(k) = Q(k)
                                       (A.19)
and consider  a  situation  in which  fi(k) may  be  decomposed  into  the  sum of an
uncontrollable  matrix £2u(k) ^nd  a  controllable matrix  ^c(k).   Furthermore,
assume that only the diagonal  elements of fic(k)  can  be controlled  and that
other elements  are  zero.   DePalma  (1977) has demonstrated  that the error
variancescof  tributary  load estimates can be interpreted  as  the diagonal
entries ^-j-j(k)  of a controllable matrix ^c(k).   Control over these elements
is feasible if  it is possible  to control the number  of days, D(<, for  which
tributary water samples are, collected.  A cost of  Dk will  be assigned to the
tributary sampling  program.  The total cost, over  K  years,  is  then
             K
            E
                                                                       (A.20)
    The optimization problem considered  by  DePalma  (1977)  requests  the  mea-
surement strategy  (Lk for k =  1,  ..., K)  and  the  number  of days  (Dk for
k = 0,  ..., K-l) which minimizes  the total  cost  (sum  of  Equations  (A.18)  and
(A.20)) subject to constraints on estimation  error  variances  defined by
Equation (A.12).  An algorithm is described by DePalma  (1977)  to solve  this
problem, and  it is applied to  the trend  detection problem  of  Section 5.   The
theory behind the algorithm utilizes the  theory of  mathematical  programming
and will not  be presented here.
APPLICATION TO NONLINEAR MODELS

    In the previous sections of this appendix, only  linear models  were  em-
ployed as a source of information to estimate the  state  of a  physical
system.  Most physical systems,
mined in this report, cannot be
additive modeling uncertainties
X(k+l), is a nonlinear function
tainty in this transition cannot
including the water resource systems exa-
described adequately by a linear model with
  In general, the system state at time k+1,
of the state at time k, X(k), and the uncer-
 be interpreted as additive.  A very general
nonlinear stochastic difference equations of the form

                = fk[X(k), W(k)l                                       (A.21)

will be considered where f is a vector of n nonlinear functions,  each  de-
pendent on some or all of the states X(k) and the uncertainties W(k).  The
initial__state_and the random uncertainties have means and covariance ma-
trices X(0), W(k), P(0), and Q(k),  (k - 1, ..., K), respectively.

    The Kalman filter is a scheme for computing the conditional mean esti-
mate of X(k), based on a linear model and measurements.  Exact schemes for

                                     67

-------
computing the conditional mean of X(k) for a nonlinear model have  not yet
been developed.  Approximate schemes have been derived and are presented  in
Sage and Melsa (1971).  Most of these methods solve^two couple difference
equations, one for an approximate conditional mean X(k) and the other; for  an
approximate estimation error covariance matrix P(k) associated with X(k).
Because the equations are coupled and X(k) depends upond the actual measure-
ment values, the covariance matrix P(k) also depends upon the actual mea-
surement values, the covariance matrix P(k) also depends on the measure-
ments.  Therefore, P(k) for k = 1, ..., K cannot be precomputed, even if the
measurement strategy has been specified prior to time zero.  Because of this
limitation, these methods for approximating the conditional mean will not  be
considered further in this report.

    However, a method has been proposed (the linearized Kalman filter) for
which the approximate estimation error covariance matrix can be precomputed
and is not dependent on the actual measurements.  Rather than X(k) being
estimated directly, the deviation of X(k) from a nominal state Xnom(k)  is
estimated.  The nominal state is that which one would expect if random
model uncertainties W(k) were known (and were equal to their means) and the
initial state was equal to its mean.  The nominal is computed by

         Xnom(k + 1) = fk[Xnom(k), W(k)],                             (A. 22)

which employs the full nonlinear model with Xnom(O) = X(0).  The deviation
of the state from the nominal is defined by

         x(k) = X(k) - Xnom(k).                                       (A. 23)

    A linear approximation to the dynamics of x(k) can be derived  by expand-
ing fk in a Taylor series and ignoring terms of order two or higher.  In
order to simplify notation, the vectors of partial derivatives are denoted
by
          f  x.  =  3fk/ax.(k)
          k n      k    n
f w. = 3f./3w.(k)
 k n     k   1
                            Xnom(k)

                            W(k)
                                                                       (A. 24)


                            Xnom(k)

                            W(k)
The dynamics of the deviation x(k)  are approximately  governed  by


                     X] fkx2...fkxn]x(k)  +w(k)                       (A.25)

where the initial deviation x(0)  has zero  mean_and  covariance  matrix P(0)
and where w(k) =  [f|
-------
variance matrix Q(k).  Note that the bracketed quantities are matrices be-
cause f^x-j and f^w-j are vectors.  By making the correspondence


        $k E Kxl fkx2 '•• VnJ                                      (A.26)
         N   |_ N  I  l\ t.       l\  M J                                      »     /

it is seen that Equation (A.25) is a linear difference equation of the form
of Equation (A.I).

    The measurement equation  (assuming that the measurements are  linear com-
binations of the state variables) is of the form
         Ymk = V(k) + rmk.                                           (A.27)

Upon substitution of Equation (A. 23) into (A.27),

         Ymk = Cm[x(k) + Xnom(k)] + rmk                                (A. 28)

or,

             * Ymk - CmXnom(k) = Cmx(k) + rmk.                         (A. 29)
Note that once the measurement Ym|< is available, the pseudo-measurement
can be computed and is of the form (A. 4).  In the linearized Kalman filter,
Equation (A. 25) is interpreted as the model of the physical system and the
conditional mean of the deviation x(k), and its error covariance matrix is
computed as earlier in this section.  Once x(k) has been computed, Equation
(A. 23) is employed to compute the state estimate

         X(k) = Xnom(k) + x(k).                                       (A. 30)
SUMMARY OF THE KALMAN FILTER AND MEASUREMENT STRATEGY OPTIMIZATION

    The procedures described in this appendix can be employed to estimate
the state of any physical system for which an uncertain nonlinear model and
noisy measurements are available.  The steps in this procedure are as
follows:

    (1)  Model the system and determine the necessary statistics
         of the initial state and model uncertainty.

    (2)  Compute the nominal state.

    (3)  Linearize the nonlinear model about the nominal.

    (4)  Optimize the measurement strategy and controllable model
         error covariance matrix.

    (5)  Compute the estimation error covariance matrix.
                                     69

-------
(6)   Collect  and  process  the  measurements  to  compute  the  estv
     mate of  the  deviation.

(7)   Compute  the  estimate of  the  state.
                                 70

-------
                                 APPENDIX  B

        OPTIMAL MEASUREMENT STRATEGY:   OVERVIEW  OF COMPUTER PROGRAM
    This program computes the minimum dollar  cost measurement strategy sub-
ject to satisfaction of specified  limits  on the  diagonal elements of the
state covariance matrix at each  event time.   The theoretical details are
presented in DePalma (1977).   An example  of the  usage of this program is
presented in Section 5.
DESCRIPTION

    This program optimizes measurement  strategy for the discrete time prob-
lem using linearized Kalman filter  techniques.  The method is described in
Appendix A.  Briefly, given the  following  system


         x  = ViVi  + Bk-iVi + Bk-iVi                          (BJ)

         zk = Hkxk + vk                                              (B.2)


where xk is an n-dimensional  state  vector, wk  is  an n-vector of process
noise, f|< is an m-dimensional  input vector, zk is  an  ^-dimensional measure-
ment vector, and vk is  an I vector  of measurement  noise.  $k_-| (the state
transition matrix) and  B(<-i  are  known n  x  n matrices, and Hk_-j is a known
£xn matrix.  In addition, the  statistics of wk and vk are known
0,              ,                                   (B3)
          EJwk]=



          E[Vk]=0' E[vkvJ]=Rk '
Then, the covariance matrix  P|< for  the  above system processed by a Kalman
filter can be calculated as  follows:
         P/MI  -KkHk)Pk-                                          (B.6)

where
                                     71

-------
This is the  Kalman  gain.  This program determines r^,  rmi-n •<  r^,  such that
the economic cost


        V   Ck     ,
        2-i   —  ,   (c.  a known measurement cost factor)
        k=l   rk       k                                             (B-8)


is minimized subject to  inequality constraints on the  diagonal  elements of
the covariance  matrix.
PROBLEM FORMULATION

    The first  step  in  optimizing a measurement strategy with  this  program  is
to convert the system  under consideration into a form useable  by the pro-
gram.  For a general continuous time problem with X = F(t,X,f), X  an n-vec-
tor:

    1.   Form the  linearized problem using a Taylor series,  that is,
        let x(t)  =  X(t) - Xnom(t) where Xnom(t) is the nominal tra-
        jectory.
Then
«*>:M.
                              x(t) = A(t)x(t)                         ,
                                                                     (B'9)
        For the  interval of interest [0,T], form k subdivisions  of
        length At,  that  is, At = i/k, where k is selected  by  accu-
        racy considerations.  Then the k state transition  matrices
        $1, ...,  $[< are  defined by the series

                        99          "I     "I
         .  A(0)At .  A(0)^ +     + A_'(0)At_
           ~T!        2!                n.
           Af(k-l)*At]At  ,  Ar(k-l)*AtrAt^ x     + A[(k-l)*At]
         + 	fl        +        2!          ••'           n.!
                                                             "•-  "k
        where n-j  is  selected such that
              n.
             ^     ~.     <^e     (a prespecified small  number)
                                     72

-------
2.  Determine Q(t) = E[w w^l where w is an n-vector of process
    noise.  Then form k nxn matrices (diagonal)

     W0(n) = Q(n - At); n = 1, ..., k.                            (B.12)

3.  The program permits the control of the diagonal matrix WQ by
    measurements such that WQ = Wmax for no measurements, and
    Wo = Wmin for maximum permissible measurements.  If the prob-
    lem being considered will  use M such measurements, then form
    the two M-vectors Wmin and Wmax for the M measurements, where
    the Wmin and Wmax vectors  define the diagonal elements of the
    Wm-jn and Wmax matrices.

4.  Determine the £ vector Rm-jn,  which is the diagonal of E[v VT]
    for maximum permissible measurements.

5.  Determine the cost vectors 6w and BR such that is maximum per-
    missible measurements are  made, then the total cost is
k
E
i = 1
m
L
p . ' -i

£
L
n = 1
gR
n
                                              minn
    where 6W and 6^ are the same for each time k.
                                                                      (B.13)
                 .E[(VXO>
-------
         N = 3, £ = 2, M = 2.
    In the program coding, the following integers are defined:
         NT = K
         NSEG
         NSS = N/NSEG
         NSR = £/NSEG
         NSW = M/NSEG

         NS = N
         NR = £
         NW = M
             (the number of time steps)
             (the number of segments)
             (the number of state variables per segment)
             (the number of state measurements per segment)
             (the number of forcing function measurements per
              segment)
             (the total number of state variables)
             (the total number of state measurements)
             (the total number of forcing function measurements)
         (Note:  NSS*NSEG = NS, NSR*NSEG = NR, and NSW*NSEG = NW.)

The program must be dimensioned for each new problem since the amount of
core storage required is a function of N^ • NT.  The matrices should be
dimensioned as follows:
               MATRIX

         STNUMW, BETAW, WMIN, WMAX
         STNUMR, RMIN, BETAR
         PMAX, PMAXO
         PO, TEN, D, PINVM, PM, PINV, P, PHIP
         WO, PHI, E, SAVE
                                        DIMENSION

                                        (NW)
                                        (NR)
                                        (NT, NS)
                                        (NS, NS)
                                        (NS, NS, NT)
FORMAT
SYMBOL
              DESCRIPTION
75E12.5

75E12.5
PHI(I,J,N)

WO(I,J,N)
 The  NT  state  transition  matrices  starting
     from  time = 0 and  read  row by row
 The  NT  process  noise matrices  (diagonal)
     read  in same order as  PHI(I,J,N)	
4015
STNUMW(I)
 The  NW  locations  of  the  measured  forcing
     functions  within the state vector.   For
     example,  if NS = 4  and  NW = 2,  and  the
     forcing  functions are measured  on the
     first  and  third  states,  then
           STNUMW(i) -  1
	STNUMW(2) =  3	
75E12.5
75E12.5
75E12.5
WMIN(I)
WMAX(I)
BETAW(I)
 The  NW  minimum variances  on  WO
 The  NW  maximum variances  on  WO
 The  NW  cost  values  for  full  measurements
                                     74

-------
FORMAT
SYMBOL
             DESCRIPTION
4015
STNUMR(I)
The NR locations of the measured state
    ables within the state vector.  For
    example, if NS = 4 and second, third,
    and fourth states are measured, then
    STNUMR = 2,3,4
75E12.5
RMIN(I)
The NR minimum covariances for a maximum
    measurement
75E12.5
75E12.5
NOTE:
75E12.5
75E12.5
BETAR(I)
PO(N,N)
if NW = 0 Skip
if NR = 0 Skip
u(D
PMAXO(NS,NT)
The NR cost values for full measurements
The initial covariance matrix read row by
row.
STNUMW, WMIN, WMAX, BETAW
STNUMR, RMIN, BETAR
The (NR + NW)*NT controls if a control is to
be tested; otherwise zeros
The previous N vectors of standard deviation
                                    constraints on the  states  if  using re-
                                    start feature  (see  program operation);
                                    otherwise, large, e.g.,  1.00E + 05
PROGRAM OPERATION

    The program is interactive and must be run from a terminal.  The flow
chart for operation is presented below.  The program prints out a number of
options for the user.  The user either selects this option  (by typing  1) or
ignores it (by typing zero).  During the optimization the user can termi-
nate, continue, or dump (for restart with new bo"nds).  He  also has a  choice
of outputting as the optimization proceeds.  Briefly, the method starts with
the zero control and proceeds to the optimal control (that  is, minimum cost
subject to the covariance constraints).  When the cost does not change for
several iterates, it is assumed that the optimum has been found.  The  pro-
gram has five iterations per stage, and the output can be controlled for an
entire stage or for a single interation.
                                     75

-------
STANDARD DEVIATIONS FOR THE CONTROL
READ FROM DEVICE 1 ARE PRINTED
ON DEVICE 6
STOP
STANDARD DEVIATIONS FOR ZERO CONTROL AND FULL
CONTROL PRINTED, IF BOUNDS ARE MET WITH ZERO CONTROL
THE TRIVIAL SOLUTION (ZERO CONTROL) IS THE OPTIMUM.  IF THE
BOUNDS ARE NOT MET WITH FULL CONTROL (u=l) THEN THE
SOLUTION DOES NOT EXIST,
                    STOP 4
OLD AND NEW BOUNDS USED TO COMPUTE
NEW LINEARIZATION FROM DATA ON DEVICE 3
DATA READ FROM DEVICE 3,  SAME BOUNDS,
                        »~FIVE ITERATIONS  COMPLETED
                           TIME AND COST PRINTED FOR
                           TYPE-W AND  TYPE-R  CONTROLS
                           PRINTED IN  TEMPORAL ORDER
                     I_J^_CONSRAINT INFORMATION CORRESSPONDING
                           TO NON-BASIC AND BASIC SLACKS  PRINTED
                           STANDARD DEVIATIONS PRINTED  FOR
                                               DUMP  DATA  ON
                                                 DEVICE  2
                                                                 -STOP
                                           *-STOP
                76

-------
                                   TECHNICAL REPORT DATA
                            (Please read Instructions on the reverse before completing)
 1 REPORT NO.
   EPA-600/3-80-055
4 TITLE AND SUBTITLE
  SAMPLING STRATEGIES  FOR WATER QUALITY IN  THE  GREAT
  LAKES
                                             6. PERFORMING ORGANIZATION CODE
                                                           3. RECIPIENT'S ACCESSIOr*NO.
                                                           5. REPORT DATE
                                                              JUNE 1980 ISSUING DATE.
  AUTHOR(S)
  Raymond P.
  William F.
                                                           8. PERFORMING ORGANIZATION REPORT NO.
Canale, Leon
Powers
M. DePalma, and
9 PERFORMING ORGANIZATION NAME AND ADDRESS
  University of Michigan
  Depts. of Civil  and Aerospace Engineering
  Ann Arbor, Michigan  48109
                                                           10. PROGRAM ELEMENT NO.
                                               1BA608
                                             11. CONTRACT/GRANT NO.

                                               R803754
 12. SPONSORING AGENCY NAME AND ADDRESS
  Environmental  Research Laboratory
  Office  of  Research and Development
  U.S. Environmental Protection Agency
  Duluth,  Minnesota 55804
                                             13. TYPE OF REPORT AND PERIOD COVERED
                                               Final
                                             14. SPONSORING AGENCY CODE
                                                    EPA/600/03
 15. SUPPLEMENTARY NOTES
16. ABSTRACT
  The major goal  of  this  project was to investigate  the  potential  applications of
  Kalman filtering and  modern optimization techniques  to the design of sampling
  strategies for  water  quality in the Great Lakes.   Two  representative problems of
  general limnological  interest with considerably  different characteristics were
  studied.  The first problem concerns sampling  requirements for long-term trend de-
  tection of chloride and total phosphorus concentrations in Lake Michigan.  A mini-
  mum-cost sampling  strategy has been determined in  association with the Kalman filter
  and mathematical models that can detect changes  of 1.0 mg Cl/£ and 1.0 yg P/£ in
  Lake Michigan with  90 percent probability.  The  cost of this sampling program is
  considerably less  than  monitoring programs designed  without benefit of mass balance
  models and Kalman  filter data processing.  An  investigation was conducted to deter-
  mine the sensitivity  of the optimal cost to the  accuracy of the mass balance model
  parameters.  It is  concluded that the calculated sampling cost is most sensitive  to
  the variance in the apparent phosphorus settling velocity, not as sensitive to the
  accuracy of laboratory  analyses, and rather insensitive to the accuracy of the
  effective flow  at  the Straits of Mackinac and  atmospheric loads.   The second problem
  concerns the potential  applicability of Kalman filtering techniques for concentration
  estimation and  parameter identification for a  phytoplankton and nutrient cycling
  model for Saginaw  Bay.	
17.
                               KEY WORDS AND DOCUMENT ANALYSIS
                  DESCRIPTORS
  Lakes
  Sampling
  Mathematical models
  Minimization
                                             b.IDENTIFIERS/OPEN ENDED TERMS
                                 Sampling strategy
                                 Large lakes
                                 Lake Michigan
                                 Saginaw Bay
                                                             COSATI Field/Group
                                               08/H
                                               12/B
13. DISTRIBUTION STATEMENT

  Release Unlimited
                                19 SECURITY CLASS (This Report)
                                  Unclassified
                                             21. NO. OF PAGES

                                                   87
                                             20. SECURITY CLASS (This page)
                                               Unclassified
                                                                        22. PRICE
EPA Form 2220-1 (9-73)
                                            77
                                                         •A- U.S. G"VERflHFMT PniNTINr. OFFICE- 198H--657-1 FS/nniQ

-------