United States Environmental Protection Agency
Research and Development
EPA-600/3-81-011
April 1981

Lake Data Analysis and Nutrient Budget Modeling

Prepared for
Office of Water Regulations and Standards
Criteria and Standards Division

Prepared by
Environmental Research Laboratory
Corvallis, OR 97330
EPA-600/3-81-011
April 1981
LAKE DATA ANALYSIS AND
NUTRIENT BUDGET MODELING
By
Kenneth H. Reckhow
School of Forestry and Environmental Studies
Duke University
Durham, North Carolina 27706
Project Officer
Spencer A. Peterson
Freshwater Division
Corvallis Environmental Research Laboratory
Corvallis, Oregon 97330
CORVALLIS ENVIRONMENTAL RESEARCH LABORATORY
OFFICE OF RESEARCH AND DEVELOPMENT
U.S. ENVIRONMENTAL PROTECTION AGENCY
CORVALLIS, OREGON 97330
DISCLAIMER
This report has been reviewed by the Corvallis Environmental Research
Laboratory, U.S. Environmental Protection Agency, and approved for publication.
Approval does not signify that the contents necessarily reflect the views and
policies of the U.S. Environmental Protection Agency, nor does mention of
trade names or commercial products constitute endorsement or recommendation
for use.
ABSTRACT
Several quantitative methods that may be useful for lake trophic quality
management planning are discussed and illustrated. An emphasis is placed on
scientific methods in research, data analysis, and modeling. Proper use of
statistical methods is also stressed, along with considerations of uncertainty
in data analysis and modeling.
Following an introductory discussion of scientific methods, limnological
variables important to lake quality management are reviewed. Methods of data
acquisition, or sampling design, are then presented, along with techniques for
analyzing, summarizing, and presenting data (with an emphasis on robust
methods). The concept of summary statistics forms a logical introduction to
the next section on lake water quality indices. This is followed by methods
for acquiring nutrient budget data which are of prime importance to the suc-
ceeding section on lake trophic quality modeling. Included in this section is
a step-by-step procedure for the prediction of phosphorus concentration, and
the estimation of the prediction uncertainty, from land use information and
certain lake characteristics. At the end, some thoughts are offered on the
use and limitations of the methods presented herein for lake trophic quality
management planning.
CONTENTS

Abstract
Acknowledgements
1. Introduction
2. Acquisition of Lake Data
3. Analysis of Lake Data
4. Indices of Lake Water Quality
5. Acquisition of Nutrient Budget Data
6. Lake Trophic Quality Modeling
7. Concluding Comments
References
ACKNOWLEDGMENTS
A number of people assisted in the development and preparation of this
document. Appreciation is extended to Dennis Cooke of Kent State University,
and to Phil Larsen and Spencer Peterson of CERL, for providing the opportunity
to prepare this report. Thanks are also due to Janine Niemer and Sue Watt for
typing the document, to Paul Schneider for graphics work, and to David Lee,
Ralph Ancil, Michael Beaulac, and Robert Montgomery for editorial assistance
and proofreading. This document was prepared while the author was a member of
the faculty in the Department of Resource Development at Michigan State
University.
1. Introduction
Many useful quantitative methods exist that can be of assistance in lake
quality management. Most of these methods fall under the general heading of
"statistics" or "mathematical models." In this document we present some
techniques from each area, but our emphasis is on methods that are applicable
under the very realistic conditions of limited financial resources available
for planning and of non-normal distributions of data. The methods presented
are empirical and, whenever possible, nonparametric or robust. Procedures
that require few assumptions, and/or that are carried out with little
investment of time and money, are stressed.
Of course, it is critical to recognize that there are often trade-offs
between cost of analysis and risk associated with the resultant management
decision. This is illustrated when we consider two extremes:
1. No analysis is undertaken and a decision is made based upon
intuition.
2. A complete analysis is made so that outcomes associated with
management options are known with certainty.
In between lie virtually all planning and management exercises. Thus, the
cost of data acquisition, data analysis, and modeling must be justified in
terms of benefit to the planning process. This means that our previously
expressed desire for simple, low cost, methods of analysis must be tempered by
the needs of the particular problem at hand.
This brief discussion of cost versus risk underscores the responsibility
of the modeler, or data analyst, to the planning process. Since it is
unreasonable to assume that planners are familiar with all of the tools of the
modeler/analyst, the planner must, to some degree, accept the modeler/
analyst's statements concerning reliability and utility of results. For that
reason, quantitative analyses, or more generally, scientific research, should
proceed according to some well established rules. When followed, these rules,
collectively called the scientific method (Ackoff, 1962), ensure that
scientific studies yield credible, reliable results.
While it is not the purpose of this presentation to discuss the
scientific method at length (see Reckhow and Chapra, 1980, Chapter 1), some
thoughts are presented below. These represent scientific method issues that
the author has found to be of concern in lake data analysis and modeling.
1. Definition: Many terms are used in limnological studies that, in
part because of everyday usage, are vaguely defined. Planning depends
upon useful, valuable information, and information value is a function of
error. Since error can result from uncertainty in models and data, as
well as from faulty communication due to confusing terminology, it is
important that definitions be frequently provided. For example, what is
average lake phosphorus concentration? The answer to that question
depends upon the location statistic employed (see Section 3), the methods
(sampling design) used to acquire the data, and the phosphorus chemical
fraction (total, ortho, ...) of concern. These should be specified when
a vague term like "average" is used. As another example, consider the
term, "phosphorus loading." Black box lake modelers have considered this
term to mean annual areal phosphorus mass input to a lake. However, in
the absence of this definition, statements made about "lake sensitivity
to phosphorus loading" may be confusing or misleading (see Figure 1-1 in
Reckhow and Chapra, 1980).
2. Assumptions: Implicit in all mathematical models and many statistics
summarizing data are assumptions about the behavior of the system
described or about limitations in the particular method or statistic
employed. In order that the planner may properly weigh the quantitative
information provided, the modeler/analyst should clearly specify all
relevant assumptions necessary for the study conducted. For example,
application of certain statistical tests is based upon an assumption of
normality (see Section 3). When conducting these tests, the data analyst
should identify the required assumptions and document tests for
compliance (and discuss the implications of violations, if necessary).
3. Uncertainty: Uncertainty is present in all model studies because of
errors in the model, in the parameters, and in the variables.
Uncertainty is also present in most statistical analyses because of
variability and bias. As we suggested above, uncertainty may also be
introduced into an analysis because of poor communication. Uncertainty
is a good measure of the value of information; as uncertainty is reduced,
information becomes more precise, and hence more useful or valuable. The
modeler/analyst can greatly assist the planner by specifying, whenever
possible, the uncertainty in results. The planner may then use this
estimate of uncertainty as indicative of the value of these results to
the planning process.
4. Representativeness: In the absence of a complete census, statistics
selected or calculated to represent some attribute of a system may be
variable or biased. It may seem all too obvious that representativeness
should be a criterion for the selection of a statistic. However,
convention often interferes. For example, it is common to represent the
center and spread for a data set by the mean and standard deviation.
However, many "real" data sets in limnology are non-normal and highly
skewed. When this situation occurs, the conventional mean and standard
deviation are less representative than certain robust
statistics (see Section 3). Often, one may face a trade-off between
representativeness and some other issue, like cost of analysis. For
example, in Section 5, we discuss nutrient budgets, and compare direct
sampling versus nutrient export coefficients as sources of nutrient
loading estimates. Export coefficients are less costly to acquire but
probably less representative than the alternative. This choice,
involving cost and risk, must be made according to the merits of the
issue of concern. However, the modeler/analyst should consider
representativeness when selecting statistics (or designing sampling
programs), and he/she should justify representativeness if statistics may
be in question.
5. Causality: If models and other quantitative analyses are to be useful,
there must be a causal linkage between decision variables and control
variables. From an understanding of theory and through sensitivity
testing of the model, cause-effect relationships may be established.
Without corroboration of causality, one cannot assert with confidence
that selected management strategies will have the desired effect.
6. Appropriate Variable(s): There are two considerations when we think
about the appropriate variable(s). First, the variable(s) for which
information is gathered must coincide (or be causally-linked) with the
variable(s) that impute value to the water body. Second, when a model is
employed, the model variables and the decision and control variables may
not always be the same. If this occurs, the modeler should strive to
modify his/her analysis so that the variables of concern are included.
Otherwise, the modeling will be incomplete and errors associated with a
decision may be underestimated (see Reckhow and Chapra, 1980).
7. Corroboration: Models must be tested before they are applied, and this
testing process has traditionally been called validation or verification.
However, those terms imply truth, an attribute that a mathematical model
can never achieve. Therefore, the term, corroboration (Popper, 1968), is
adopted instead. Popper states that a model is corroborated when it has
passed rigorous independent tests. A model that is useful for water
quality management planning must be able to predict changes in water
quality associated with changes in input conditions. A planning model
must be adaptable. Therefore a candidate model must first be tested
under conditions different from those used to calibrate the model, and a
statistical goodness of fit criterion should be applied to assess the
degree of corroboration. The modeler has this responsibility and the
model user or planner should request documentation of these tests.
Without this, there is no assurance that the model can be depended upon
for accurate predictions under new conditions.
8. Cost/Risk: To briefly reiterate an important issue in policy analysis
previously stated, quantitative analyses and planning studies are not
without cost (in money and time). This cost is justified only if the
perceived benefit (or correspondingly, the perceived reduction in risk)
from the information obtained outweighs the cost. This decision to
undertake certain analyses also has a dimension of degree or thorough-
ness. As an analysis becomes more thorough, presumably it becomes more
precise. Eventually, however, the increased level of precision may not
justify the cost necessary to achieve it. This should be considered in
selecting and designing planning studies.
In this brief treatment of scientific method issues, we have made some
rather strong demands of modelers and data analysts in the documentation of
their work. Unfortunately, some of these requirements must be tempered by the
limitations in the state of the art. For example, it may not be possible to
accurately assess the trade-off between cost of analysis and risk in decision
making. However, the concept still holds. As long as the planner or policy
analyst realizes this trade-off is part (either explicitly or implicitly) of
the design of policy studies, then he/she may at least intuitively consider
the trade-off. In conducting water quality management planning, we should
strive toward the conduct of analyses according to the scientific method.
When this is not possible, an understanding of the concepts of the scientific
method can still aid the planner by serving as an "ideal" against which to
evaluate scientific studies.
One of the eight issues listed above that we should address immediately
concerns the variable(s) to be studied. While this issue is problem-specific,
the U.S. Environmental Protection Agency (Larsen, 1980) has developed a list
of variables that is likely to contain the variables of concern in most lakes.
This list, presented in Exhibit 1, is broken into two parts. The methods
presented in the remainder of this chapter are more applicable to the "General
Lake Quality" variables in part A of Exhibit 1. In particular, the problem of
eutrophication is emphasized herein, so techniques oriented to the study of
trophic variables predominate. However, many of the methods can be useful for
other limnological variables. In addition, an effort has been made to stress
concepts, so that the reader may understand scientific and statistical
inference independent of the direct utility of the specific techniques.
Exhibit 1. Limnological variables of importance in
lake management (Larsen, 1980).

A. General Lake Quality Variables

phosphorus
nitrogen
dissolved oxygen
turbidity (Secchi disk)
chlorophyll a
macrophyte coverage
bacteria and viruses
toxic substances

B. Use-Specific Variables

1. Swimming
temperature (air/water)
turbidity
algal abundance
macrophytes
odor (dissolved oxygen)
disease-causing organisms
parasites and insects
toxic substances
oil
trash
facilities
beach and bottom type

2. Fishing
planting/stocking programs
fish type and abundance
dissolved oxygen
toxic substances
algae
macrophytes
spawning grounds
temperature

3. Boating
macrophytes
algae
obstructions
trash
facilities
lake size/depth
2. Acquisition of Lake Data
Lake data are acquired because there is a need for the data. For lake
quality management, this need is reflected in the value of the information
provided by the data. The purpose, therefore, of this section is to provide
guidance in the establishment of cost-effective data gathering programs.
Although much of this section is devoted to statistical sampling design,
"data acquisition" is purposely used in the section title to underscore the
notion that data may be obtained by means other than sampling. For example,
many limnological issues may be completely or partially addressed using
existing data. Alternatively, existing data on surrogate variables may prove
useful after statistical analysis is used to quantify the relationship between
the surrogate and the quality variable of concern. As we have stressed in the
last section, however, the decision to use existing data must be made with
some understanding of the cost/risk trade-offs. Acquisition of existing data
is almost always less costly than sampling to obtain new data. However,
existing data may be less representative of the issue of concern than new
data, and this non-representativeness translates into greater risk in decision
making. The planner must consider these trade-offs when designing data
acquisition programs.
Most likely, some or all of the data needed for lake quality management
planning will be obtained under a sampling program that should be designed
using statistical methods. Before we survey these methods, it is instructive
to discuss some concepts inherent in statistical sampling design. Consider
the words used to identify this topic: "statistical sampling design." This
task is called "sampling" because only a limited amount of information is
obtained. The entirety of the characteristic sampled is called the popula-
tion. Statistics obtained through sampling are called sample statistics and
they are intended to represent the population, or true, values. Sampling is
undertaken because it is often infeasible to survey the total population. For
example, it is clearly impossible to survey an entire lake throughout time and
space for a population value for algal biomass. Instead we turn to sampling
and undertake a program to obtain a representative sample statistic. This is
where the other terms in "statistical sampling design" become important.
Sampling is a problem in "statistical design" because statistical methods help
us design a program that yields representative data.
Statistical sampling design has, as a basic consideration, the trade-off
between uncertainty and cost. Uncertainty results from variability, error,
and bias. Variability exists because of natural fluctuations inherent in a
characteristic (e.g., natural variations in stream or lake phosphorus con-
centration), or because of uncertainty inherent in a statistic used to sum-
marize a set of data. Errors may arise in any of the individual steps of
sampling, measurement, analysis, and estimation. Bias may result from a
number of causes, all associated with the fact that a sample may not be
representative of the population from which it was drawn. For example, a
survey of a stratified lake consisting of fifty concentration samples, with
only one taken from the hypolimnion, probably will yield a biased statistic
for mean concentration.
When sampling programs are designed, variability, error, and bias should
be estimated for all candidate designs. In this manner, the trade-off between
uncertainty and cost of sampling can be as explicit as possible. The trade-
off can be evaluated in terms of financial constraints and needs for data
reliability for the selection of an appropriate design.
In order to understand the statistical relationships that are used to
design sampling programs, there are some statistical terms that must first be
defined. The terms result from expected value theory and are most useful with
well-behaved symmetric probability density functions, like the normal distri-
bution.
1. Mean: The mean is a measure of location, or central tendency, for a
distribution or set of data. The mean, $\bar{x}$, is:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \qquad (1)$$

where: $x_i$ = data point i
n = total number of data points.
For symmetric distributions of data, the mean is a reliable statistic to
use to represent the average or central tendency. However, it must be
emphasized that, when representing a set of data with descriptive
statistics, our true objective is to select a statistic that best
indicates the distribution center, and not simply to calculate the mean.
Sometimes the mean is the appropriate statistic, and sometimes it is not.
Other candidate statistics for location include the median, mode, geo-
metric mean, trimmed mean, tri-mean, and biweight (see Reckhow and
Chapra, 1980, or Mosteller and Tukey, 1977 for discussion and analysis).
Some of these are presented in the next section.
2. Variance and Standard Deviation: The variance and standard deviation are
measures of spread or scale for a distribution or set of data. The
variance, $s^2$, is:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{(\sum x_i)^2}{n}}{n-1} \qquad (2)$$

The standard deviation, s, is simply the square root of the variance:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} \qquad (3)$$
Here, too, we should recognize that the standard deviation or variance is
not, by definition, the spread in a distribution. Rather it is a
statistic chosen to represent spread. For certain symmetric distribu-
tions, the standard deviation is a good choice. For other distributions,
notably skewed distributions, alternative measures of spread, such as the
average deviation, the range, or the interquartile range may be more
appropriate (see Reckhow and Chapra, 1980, and Mosteller and Tukey,
1977). Some of these alternatives are presented in the next section.
3. Standard Error of the Estimate: The standard error of the estimate is
the mean square error for a statistic or estimate. Often the statistic
of concern is the mean, so we use the symbol $s_{\bar{x}}$ and calculate the
standard error as:

$$s_{\bar{x}} = \frac{s}{\sqrt{n}} \qquad (4)$$

The standard error of the estimate is a measure of precision of a
statistic. To the extent that precision and uncertainty are equivalent,
$s_{\bar{x}}$ is also a measure of uncertainty. However, we mentioned earlier that
uncertainty includes variability and bias. In the absence of
supplemental uncertainty¹ (see Reckhow and Chapra, 1980 and Mosteller and
Tukey, 1977), precision accounts for variability but not bias.
Therefore, while we use the standard error of the estimate extensively in
sampling design, we must be wary of its limitations. $s_{\bar{x}}$ is a measure of
variability in a statistic, such as the mean. This may not be equivalent
to the uncertainty in central tendency for a population. Generally, our
true concern is with the latter, not with the former.
¹ Supplemental uncertainty is uncertainty that is not measured by the
statistic employed (in this case, the standard error of the estimate). For
example, supplemental uncertainty exists when data are not truly representa-
tive of a characteristic. Since $s_{\bar{x}}$ is data-derived, there must be
additional uncertainty associated with nonrepresentativeness.
4. Coefficient of Variation: The coefficient of variation, cv, is:
$$cv = \frac{s}{\bar{x}} \qquad (5)$$
This statistic is a useful measure of relative variability. It is a
dimensionless quantity that facilitates comparison among dispersion
statistics by expressing the standard deviation as a fraction of the
mean.
The design of a sampling program is often expressed in terms of random
sampling. In theory, random sampling refers to data acquisition when
individual points are selected by chance. Under random sampling, all members
of a population are equally likely to be chosen in the sample. In practice,
however, limnological sampling is rarely random. It is usually systematic in
space (i.e., sampling occurs at pre-specified sites) and systematic, or
systematic with a random start (i.e., begun on a randomly chosen day and
continued on systematically pre-specified days thereafter) in time.
Statistical relationships used in the design of sampling programs are
generally aimed toward random sampling or a variation thereof. However, with
an understanding of limnological relationships, the rudiments of sampling
design, and possible sources of supplemental uncertainty, we can often apply
random sampling design relationships to the systematic sampling programs that
we often adopt in limnology. In particular, random sampling design equations
may be used for systematic sampling if there is no bias introduced by
incomplete design, and if there is no periodic variation in the population
measured. Use is further justified if the systematic sampling begins with a
random start.
There are certain quantities that are common to most sampling design
relationships. These include the number of samples, the desired precision (or
error) of the estimate, and the inherent variability in the characteristic
measured. The quantities are all present in the relationship for the standard
error of the estimate, Equation 4. When we invoke the common assumption that
the data are normally distributed,¹ we can use the t-statistic (see the next
section) to specify the confidence level desired in our sample. Thus, for
simple random sampling, Equation 4 is modified to yield:
$$n = \frac{t^2 s^2}{d^2} \qquad (6)$$
¹ The central limit theorem states that the distribution of $\bar{x}$, for sufficiently
large samples and for any population with a finite variance, will be normal.
("Sufficiently large" is determined by the degree of normality of the
population and the acceptable error; 30 to 100 samples may be required
depending upon these issues (Blalock, 1972).) This justifies the use of the
t-statistic with the standard error of the mean. However, when the
distribution of concern is severely non-normal, robust statistics (see
Section 3) should be employed, and sampling design may be conducted on a
somewhat ad hoc basis.
where: n = number of samples
t = Student's t-statistic
s2 = population variance estimate
d = desired precision.
Equation 6 may be used to estimate, for random (or "effectively" random)
sampling, the number of samples necessary to achieve a desired level of
precision, given an estimate of population variability. Desired precision is
selected after consideration of the acceptable error, the inherent variability
of the characteristic sampled, and the sampling cost. The sampling design
decision can be expressed as a trade-off between desired precision and cost,
if the number of samples is re-expressed in terms of sample cost. For
example, one common cost function is:
$$C(n) = c_0 + c_1 n \qquad (7)$$

where: C(n) = total cost of sampling
       $c_0$ = initial fixed cost
       $c_1$ = cost per sample.
When Equations 6 and 7 are combined, a random sampling design may be specified
according to either desired precision or a cost constraint.
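As a rough illustration of how Equations 6 and 7 interact, the following
sketch (in Python; the numerical values for t, s, d, and the cost
coefficients are hypothetical) computes a required sample size and the
associated cost:

    import math

    def sample_size(t, s, d):
        """Number of samples for simple random sampling (Equation 6)."""
        return (t ** 2 * s ** 2) / d ** 2

    def sampling_cost(n, c0, c1):
        """Total cost of a sampling program (Equation 7)."""
        return c0 + c1 * n

    # Hypothetical values: t = 2 (roughly the 95% level), s = 7.5 ug/l,
    # desired precision d = 5 ug/l
    n = math.ceil(sample_size(t=2.0, s=7.5, d=5.0))    # 9 samples
    print(n, sampling_cost(n, c0=100.0, c1=25.0))      # cost in arbitrary units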
Now, to use Equations 6 and 7 for sampling design, an estimate of the
population variance is needed. In theory, we want to estimate the variance
using Equation 2 on normal-like data. In practice, we are really interested
in the "vague concept" (Hosteller and Tukey, 1977) distribution spread, which
may or may not be best estimated by a sample variance. Further, we rarely
have sufficient data on the characteristic to be sampled to reliably calculate
a variance. (If we did, this might call into question the decision to
sample.) Therefore, we must depend upon a variety of methods for a measure of
distribution spread that can be used in Equation 6 for population variance.
These methods include (Cochran, 1963):
1. Use existing information on the population to be sampled, or
existing information on a similar population.
2. Rely on informed judgment, or on an educated guess.
3. Undertake a two-step sampling procedure. Use the results from the
first step to estimate the required terms in Equation 6. Then use
this to design the second step. Data from both steps may be
employed in the final estimate of the characteristic of interest.
4. Conduct a pilot study on a convenient or particularly meaningful
subsample. Use the results to estimate the required terms in
Equation 6. Unlike for two-step sampling, the pilot study results
are generally not used in the final estimate of the characteristic
of interest. This happens because the pilot sample is often not
representative of the population as a whole. This possible non-
representativeness must be taken into account when the pilot survey
results are used to estimate variance. A modification might be
necessary if it is thought that the pilot survey provided an
overestimate or underestimate of the population values.
For many types of problems, sampling can be more efficient when the
design is based on the fact that a population often contains strata that are
homogeneous within and heterogeneous with respect to other strata. For
example, stratified lakes generally exhibit homogeneous conditions within the
epilimnion and the hypolimnion, while at the same time these two strata are
heterogeneous with respect to each other. As another example, the nutrient
flux to a lake can vary significantly from tributary to tributary. In these
situations, sampling is more efficient when sample numbers are allocated
according to stratified random sampling design. Then within each stratum,
sampling is random or systematic with a random start. Sampling is allocated
in stratified random sampling design according to:
$$\frac{n_i}{n} = \frac{w_i s_i}{\sum (w_i s_i)} \qquad (8)$$

where: $n_i$ = number of samples in stratum i
       n = total number of samples
       $w_i$ = a weight reflecting the size (number of units, for
           example) of stratum i
       $s_i$ = standard deviation of sampled characteristic within
           stratum i.
If sampling cost may be estimated by:
$$C = c_0 + \sum (c_i n_i) \qquad (9)$$

then:

$$\frac{n_i}{n} = \frac{w_i s_i / \sqrt{c_i}}{\sum (w_i s_i / \sqrt{c_i})} \qquad (10)$$
In order to apply Equation 8 or 10, a relationship is needed for the
total number of samples, n. Two equations are available, depending upon
whether precision or cost if fixed beforehand. If precision is fixed (at d),
and cost may be estimated according to Equation 9, then (Cochran, 1963):
$$n = \frac{\left[\sum (w_i s_i \sqrt{c_i})\right]\left[\sum (w_i s_i / \sqrt{c_i})\right]}{d^2/t^2} \qquad (11)$$
If cost is fixed, then (Cochran, 1963):
$$n = \frac{(C - c_0) \sum (w_i s_i / \sqrt{c_i})}{\sum (w_i s_i \sqrt{c_i})} \qquad (12)$$
In summary, the composition of the stratified random sampling design
equations leads to the following general conclusions concerning stratified
sampling. A larger sample should be taken in a stratum if the stratum is:
1. more variable ($s_i$)
2. larger ($w_i$)
3. less costly to sample ($c_i$).
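A minimal sketch of these allocation rules follows, assuming the
fixed-precision case of Equation 11 with cost constant across strata, and
optional per-stratum costs for Equation 10; the stratum weights and standard
deviations below anticipate the values of Example 2:

    import math

    def total_samples(w, s, d, t):
        """Total sample size for fixed precision d (Equation 11, with
        cost constant across strata)."""
        return sum(wi * si for wi, si in zip(w, s)) ** 2 / (d ** 2 / t ** 2)

    def allocate(n, w, s, c=None):
        """Allocate n samples among strata (Equation 8, or Equation 10
        when per-stratum sampling costs c are supplied)."""
        if c is None:
            terms = [wi * si for wi, si in zip(w, s)]
        else:
            terms = [wi * si / math.sqrt(ci) for wi, si, ci in zip(w, s, c)]
        return [n * term / sum(terms) for term in terms]

    w = [9/22, 6/22, 7/22]      # stratum size weights
    s = [7.5, 6.0, 9.0]         # stratum standard deviations (ug/l)
    n = total_samples(w, s, d=5.0, t=2.0)
    print(round(n, 2), [round(x, 2) for x in allocate(n, w, s)])
    # 9.16 [3.71, 1.98, 3.47]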
Example 1
To illustrate how samples that are acquired without concern for
statistical design may be quite misleading, a hypothetical example is
constructed. For ease of explanation, assume that Exhibit 2 is a complete
description of the population of phosphorus concentration values (in
micrograms per liter) in a stratified lake. The values in the exhibit were
randomly generated from three lognormal distributions.¹ Using µ and σ as
symbols for the population mean and standard deviation, the distribution
parameters are:

                     Epilimnion   Metalimnion   Hypolimnion
µ (log transform)    1.301        1.544         1.700
µ (µg/l)             20           35            50
σ (log transform)    0.146        0.089         0.093
µ + σ (µg/l)         28           43            62
Number of cells      61           35            41
The "population" statistics that best represent the center of the population
of lake phosphorus concentration values are probably the weighted geometric
mean2 and the median. These statistics are:
weighted geometric mean = 30.4 jjg/1
median = 34 pg/1
¹ After the values were generated, they were placed in the lake diagram in
order to best approximate realistic concentration contours and gradients.
² The geometric mean is the antilog of the mean of a lognormal distribution.
In this example, the geometric mean of each stratum is weighted according to
the stratum's percentage of cells, for the calculation of the weighted
geometric mean.
Exhibit 2. Phosphorus concentration data for a hypothetical example lake (figure).
The results of sampling should be compared to these statistics for a measure
of the success of the sampling program.
Now, suppose that we undertake a brief sampling program in order to
estimate the average phosphorus concentration in the lake. Consider the
following examples illustrating how this might be done.
A. Take a single depth profile in a deep section of the lake. Randomly
selecting three profiles, we find:

1. measurements (µg/l): 11, 14, 19, 28, 37, 39, 43, 54
   mean = 30.6 µg/l
   median = 32.5 µg/l

2. measurements (µg/l): 19, 20, 26, 32, 42, 47, 62, 76
   mean = 40.5 µg/l
   median = 37 µg/l

3. measurements (µg/l): 14, 15, 22, 30, 35, 40, 50, 53
   mean = 32.4 µg/l
   median = 32.5 µg/l

B. Take surface samples only. Randomly selecting eight samples, we find:

measurements (µg/l): 30, 26, 24, 20, 19, 11, 14, 18
mean = 20.3 µg/l
median = 19.5 µg/l

C. Take surface-to-bottom samples at three randomly selected sites:

measurements (µg/l): 20, 20, 28, 40, 41, 51, 57, 13, 20, 24, 30,
36, 43, 49, 53, 14, 13, 18, 34, 42, 53, 61
mean = 34.5 µg/l
median = 35 µg/l

D. Take four samples from any site and depth in the lake. Randomly
selecting three sampling programs, we find:

1. measurements (µg/l): 26, 50, 14, 34
   mean = 31 µg/l
   median = 30 µg/l

2. measurements (µg/l): 13, 18, 20, 28
   mean = 19.8 µg/l
   median = 19 µg/l
3. measurements (µg/l): 20, 26, 43, 61
   mean = 37.5 µg/l
   median = 34.5 µg/l
While we must be careful in drawing conclusions from a small sample of
sampling programs, there are a few results in the examples presented above
that are consistent with the findings of many lake sampling experiences.
1. Surface sampling can lead to biased estimates of average conditions
in a stratified lake. Underestimation is often the result.
2. Depth profile sampling is preferred to single layer (stratum)
sampling, particularly if samples taken are roughly proportional to
the stratum volume.
3. A small number of samples (example D) is more apt to result in a
biased estimate of average conditions than is a large number of
samples (example C).
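These tendencies can also be explored by simulation. A sketch along the
lines of Example 1 follows, assuming the lognormal stratum parameters of
Exhibit 2 (base-10 logs) and drawing a synthetic population of cell values;
the seed and sample sizes are arbitrary:

    import numpy as np

    rng = np.random.default_rng(1)

    # Stratum parameters from Exhibit 2: (mu log10, sigma log10, cells)
    strata = {"epilimnion":  (1.301, 0.146, 61),
              "metalimnion": (1.544, 0.089, 35),
              "hypolimnion": (1.700, 0.093, 41)}

    # Synthetic population of phosphorus concentrations (ug/l)
    population = {name: 10 ** rng.normal(mu, sigma, cells)
                  for name, (mu, sigma, cells) in strata.items()}

    # Surface-only sampling draws from the epilimnion alone and tends
    # to underestimate; whole-lake sampling mixes all strata
    surface = rng.choice(population["epilimnion"], 8, replace=False)
    whole = rng.choice(np.concatenate(list(population.values())), 8,
                       replace=False)
    print(round(surface.mean(), 1), round(whole.mean(), 1))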
Example 2
Let us use stratified sampling design to develop a sampling program for
the lake in Exhibit 2. Assume that:
1. The samples taken in Example 1C above represent existing data from
which the sampling program will be designed. Because the number of
samples is small, the standard deviation (s) will be estimated as
one-half the range of data points within each stratum. In an actual
lake sampling program, measurements could be assigned to a stratum
on the basis of a temperature profile.
2. The size (w) of each stratum will be estimated by the relative
number of stratum measurements in Example 1C. In an actual lake
sampling program, the size of a stratum would be determined by its
volume.
3. It is desired that a sampling program be designed to provide an
estimate of mean phosphorus concentration that is within ± .005 mg/l (5 µg/l)
of the true mean at the 95% level.
From the samples in Example 1C, we have the following breakdown.
Measurements (µg/l)

epilimnion: 20, 20, 28, 13, 20, 24, 14, 13, 18
metalimnion: 40, 41, 30, 36, 34, 42
hypolimnion: 51, 57, 43, 49, 53, 53, 61
The necessary statistics are:

                Epilimnion   Metalimnion   Hypolimnion
Range (µg/l)    15           12            18
s (µg/l)        7.5          6             9
w               9/22         6/22          7/22
To design the sampling program, first solve for the total number of samples to
be taken, using Equation 11 with cost ($c_i$) constant across all sampling sites:

$$n = \frac{\left[\sum (w_i s_i)\right]^2}{d^2/t^2}$$

For the sample sizes under consideration here, t ≈ 2. Therefore:

$$n = \frac{[(9/22)(7.5) + (6/22)(6) + (7/22)(9)]^2}{5^2/2^2} = \frac{57.28}{6.25} = 9.16$$

n ≈ 9 samples
Equation 10 may be used to allocate the samples among the strata (again with
cost constant across sites):

$$\frac{n_i}{n} = \frac{w_i s_i}{\sum (w_i s_i)}$$

For the epilimnion:

$$\frac{n_e}{n} = \frac{(9/22)(7.5)}{(9/22)(7.5) + (6/22)(6) + (7/22)(9)}$$

$$n_e = 3.71$$
For the metalimnion:

$$\frac{n_m}{n} = \frac{(6/22)(6)}{(9/22)(7.5) + (6/22)(6) + (7/22)(9)}$$

$$n_m = 1.98$$

For the hypolimnion:

$$\frac{n_h}{n} = \frac{(7/22)(9)}{(9/22)(7.5) + (6/22)(6) + (7/22)(9)}$$

$$n_h = 3.47$$
Since samples can be taken only in integer units, and given the nature of the
results calculated above, we might recommend that 10 samples be taken, and
that they be distributed 4, 2, and 4 in the epilimnion, metalimnion, and
hypolimnion, respectively. As an approximate check, the following samples are
chosen randomly.

epilimnion (µg/l): 30, 13, 19, 24
metalimnion (µg/l): 40, 28
hypolimnion (µg/l): 52, 41, 50, 57
From this sample, the following statistics may be calculated.

1. A volume-weighted mean:

$$\bar{x}_w = \sum_s w_s \left(\frac{1}{n_s} \sum_i x_{si}\right) \qquad (13)$$

where: subscript s refers to strata
       subscript i refers to samples

$$\bar{x}_w = (1/4)(9/22)(30 + 13 + 19 + 24) + (1/2)(6/22)(40 + 28) + (1/4)(7/22)(52 + 41 + 50 + 57)$$

$$\bar{x}_w = 34.0\ \mu g/l$$
2. A volume-weighted standard deviation, which is estimated from one-half
the range within each stratum, because of the small sample size:

$$s_w^2 \approx [(9/22)(8.5)]^2 + [(6/22)(6)]^2 + [(7/22)(8)]^2$$

$$s_w \approx 4.6\ \mu g/l$$

3. A volume-weighted standard error:

$$s_{\bar{x}_w}^2 \approx \tfrac{1}{4}[(9/22)(8.5)]^2 + \tfrac{1}{2}[(6/22)(6)]^2 + \tfrac{1}{4}[(7/22)(8)]^2$$

$$s_{\bar{x}_w} \approx 2.45\ \mu g/l$$
It is shown in the next section that the precision of the estimate of the mean
at the 95% level is:

$$\bar{x} \pm t_{.05} s_{\bar{x}}$$

For this problem, the precision is approximately:

$$\bar{x}_w \pm 2 s_{\bar{x}_w}$$

or:

34.0 µg/l ± 4.9 µg/l

This means that the 95% confidence interval for the volume-weighted mean is:

29.1 µg/l < $\mu_w$ < 38.9 µg/l
A couple of final observations are in order. First, note that the true
median (34 µg/l) is well within the 95% confidence limits but that the true
geometric mean (30.4 µg/l) is just slightly inside. Also note that both true
values are within the pre-specified confidence interval (± 5 µg/l). Our
actual interval at the 95% level (± 4.9 µg/l) is narrower than the pre-
specified value because we chose to take 10 samples (versus 9 or 9.16) and
because our sample turned out to be relatively homogeneous.
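The arithmetic of this example is compact enough to script. A sketch
reproducing the volume-weighted statistics above (the half-range spread
estimates follow the small-sample convention used in the example):

    import math

    samples = {"epilimnion": [30, 13, 19, 24],
               "metalimnion": [40, 28],
               "hypolimnion": [52, 41, 50, 57]}
    w = {"epilimnion": 9/22, "metalimnion": 6/22, "hypolimnion": 7/22}

    # Volume-weighted mean (Equation 13)
    xw = sum(w[k] * sum(v) / len(v) for k, v in samples.items())

    # Spread per stratum estimated as one-half the range
    half = {k: (max(v) - min(v)) / 2 for k, v in samples.items()}

    # Volume-weighted standard deviation and standard error
    sw = math.sqrt(sum((w[k] * half[k]) ** 2 for k in samples))
    se = math.sqrt(sum((w[k] * half[k]) ** 2 / len(samples[k])
                       for k in samples))

    print(round(xw, 1), round(sw, 1), round(se, 2))   # 34.0 4.6 2.45
    print(round(xw - 2 * se, 1), round(xw + 2 * se, 1))   # ~95% interval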
In concluding this section it is worthwhile to mention useful references
for sampling design. Many excellent books and monographs have been written
about sampling design, and the reader should consult one or more of them if
additional details on this topic are desired. Among the recommended
references are Cochran (1963); Hansen, Hurwitz and Madow (1953); Jessen
(1978); Williams (1978); and Freese (1962). Noteworthy among these are
Williams, as an introduction to sampling design, and Cochran, as a more
advanced text and as an excellent reference. In addition, some statistics
books contain sections on sampling design; Snedecor and Cochran (1967) is one
recommended example.
3. Analysis of Lake Data
Once data have been acquired, either through a sampling program or from
existing sources, it is usually necessary to summarize the data in a few
well-chosen statistics to make the results most useful for planning. Trad-
itionally, these chosen statistics are those statistics important in expected
value theory and normal distribution theory (e.g., the mean and standard
deviation). Often real data sets are misrepresented by these "traditional"
statistics, so we adopt a different approach in this section. First, we
present the "vague concept" (Mosteller and Tukey, 1977) for which a particular
statistic (such as the mean, or median) is selected. Then we offer a few
options for statistics to represent the vague concept, mentioning some pros
and cons for each. Throughout this section, in fact, we try to present more
than one option for a statistical exercise. This should foster the correct
notion that use of the traditional methods should represent a choice. The
other options, which we call robust statistics, robust methods, or non-
parametric techniques, may in many instances be the superior choice, however.
The material presented in this section, and the references cited, should help
the reader make this choice.
The first exercise one should conduct with a set of data is to plot the
data on a graph. For data on a single variable, the frequency plot or
histogram is useful. A modification of the traditional bar histogram which we
present here is the stem and leaf plot (Tukey, 1977). Unlike the histogram,
however, the stem and leaf diagram retains the numbers (i.e., the individual
data points) in the display, and their relative abundance yields the
distribution shape.
Example 3
(From Reckhow and Chapra, 1980)
To illustrate an alternative to the bar histogram, let us take the data
in Exhibit 3 and create two stem and leaf diagrams. A stem and leaf diagram
(Tukey, 1977; Mosteller and Tukey, 1977) is constructed from a set of data
with the higher digits (the "tens" and "hundreds" digits in Exhibit 3) forming
the left side of a column as in Exhibit 4. On the right side of the column,
the lowest ("units") digit for each data point is placed in a row opposite the
Exhibit 3. Phosphorus and chlorophyll a data.

Total Phosphorus (µg/l)   Chlorophyll a (µg/l)
 5                          1.4
 7                          3.0
 8                          1.7
10                          2.1
10                          2.0
15                          6.0
18                          4.9
24                         22
29                          8.2
30                         12
32                         25
33                         14
38                         12
41                         20
42                         24
43                         30
48                         20
68                         42
84                         84
92                        103
96                        120
Exhibit 4. Stem and leaf diagrams.

A) Phosphorus Concentration     B) Chlorophyll a Concentration

 0 | 578                         0 | 13222658
 1 | 0058                        1 | 242
 2 | 49                          2 | 25040
 3 | 0238                        3 | 0
 4 | 1238                        4 | 2
 5 |                             5 |
 6 | 8                           6 |
 7 |                             7 |
 8 | 4                           8 | 4
 9 | 26                          9 |
10 |                            10 | 3
11 |                            11 |
12 |                            12 | 0
appropriate higher digit. Thus, in Exhibit 4A, the entries in the 0-row
represent 5, 7, and 8 µg/l of phosphorus, and the entries in the 1-row
represent 10, 10, 15, and 18 µg/l of phosphorus. In Exhibit 4B, concentra-
tions are rounded off to the nearest integer.
The advantage of a stem and leaf diagram is that it provides most of the
features of a histogram while retaining the numerical values of a table of
data. Like a histogram, the stem and leaf display can be constructed using
different data groupings (e.g., the right-side digit could be the tens digit,
or any other digit, if appropriate). However, the stem and leaf diagram is
not as flexible as the histogram, in that stem and leaf diagrams are
constrained to order-of-magnitude changes in groupings (e.g., histogram data
can be grouped: 1-4, 5-8, 9-12, ..., whereas stem and leaf data are always
grouped in some multiple of ten: 0-9, 10-19, 20-29, ...).
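A minimal sketch of this construction, assuming integer-rounded data and
tens-digit stems:

    from collections import defaultdict

    def stem_and_leaf(data):
        """Print a stem and leaf display with tens-digit stems."""
        rows = defaultdict(list)
        for x in sorted(round(v) for v in data):
            rows[x // 10].append(x % 10)
        for stem in range(min(rows), max(rows) + 1):
            leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
            print(f"{stem:3d} | {leaves}")

    # Exhibit 3 phosphorus data (ug/l)
    stem_and_leaf([5, 7, 8, 10, 10, 15, 18, 24, 29, 30, 32, 33, 38,
                   41, 42, 43, 48, 68, 84, 92, 96])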
Another useful graphical procedure for univariate data is the box plot
(Tukey, 1977; McGill et al., 1978). The box plot is constructed largely from
the order statistics, and it provides information on the median, spread or
variability, skew, size of data set, and statistical significance of the
median. All of this information may be conveyed on a graph in essentially the
same space used to plot the mean and standard deviation.
Box plots for the phosphorus data in Exhibit 5 for five lakes are drawn
in Exhibit 6 with median chlorophyll a on the x-axis. To construct a box plot
for a set of data on a single variable, the steps listed below may be fol-
lowed.
1. Order the data from lowest to highest.
2. Plot the lowest and highest values on the graph as short horizontal
lines. These represent the extreme values for each box plot, and
they identify the range.
3. Determine the upper and lower quartiles (the data points at the 25th
and 75th percentiles) for the data set. These values bound the
interquartile range (I), which is the "distance" between quartiles.
The quartiles define the upper and lower box edges, and they are
connected to the respective range values.
4. Plot the median as a dashed horizontal line within the box.
5. Select a scale so that the width of the box represents the sample
size, or the size of the data set used to construct each box. For
example, the width of the boxes may be set as proportional to the
square root of the sample size (n). Then, if n = 10 is represented
by one centimeter of width, the width of all the boxes may be
calculated based on their sample size.
6. Determine the height of the notch (in the box at the median) based
on the statistical significance of the median. The standard
deviation (s) of the median may be estimated by:
$$s = \frac{1.25 I}{1.35\sqrt{n}} \qquad (14)$$

for a range of distributions with normal-like centers (McGill et
al., 1978). The height of the notch above and below the median is
± Cs:

$$\text{Notch Limits} = \text{Median} \pm Cs \qquad (15)$$
Exhibit 5. Phosphorus and chlorophyll data for five lakes.

Phosphorus (µg/l)

Lake A   Lake B   Lake C   Lake D   Lake E
  5       18      180       54      115
  8       28      116       23       97
 11       15      176       49       84
 12       37      117       20      161
 15       25      118       34      116
 16       13      113       52      121
  7       93      115       27      174
  7       47      132       20      102
  7       25      125       46       91
  4       20      110       22      110
  6       22      115       25       88
 10       50      145       44      144
 11       40      140       38      153

Chlorophyll a (µg/l)

Lake A   Lake B   Lake C   Lake D   Lake E
 2.6      8.5     65.7     39.0     31.1
 4.1      4.2     31.0     16.2     20.4
 3.5      4.7     42.1     42.0     21.6
 9.0     35.3     30.2     14.4      1.5
 5.6      6.5     30.0     23.5      2.1
 7.4     12.1     14.2     20.4      2.8
 1.9     20.4      9.6     31.5     14.4
 2.3     20.4     25.9     28.9     12.0
 2.6      7.3     19.6     20.9     17.1
 2.8      8.2     21.2     18.2      7.3
 1.7      5.1     23.0     23.0      6.1
 6.1     15.0     51.3     35.4     25.4
 7.7     10.2     47.1     31.8     26.8
Exhibit 6. Box plots for phosphorus concentration (µg/l) versus chlorophyll a
concentration (µg/l) for the five lakes (figure).
C is a constant that lies between 1.96 (appropriate if the standard
deviations for the data sets are quite different) and 1.39 (preferable
when the standard deviations are nearly identical). McGill et
al. chose a compromise value of 1.7 for their example, and that
value was also used in Exhibit 6. Thus the notch heights are:

$$\text{Median} \pm 1.7 \left(\frac{1.25 I}{1.35\sqrt{n}}\right)$$

With this mathematical definition of the notch heights, the notch in
the box provides an approximate 95% confidence interval for
comparison of box medians. Therefore, when the notches for any two
boxes overlap in a vertical sense, these medians are not sig-
nificantly different at about the 95% level.
The box plots present the following information:
1. the median
2. the interquartile range, which is a measure of spread or variability
3. the range (maximum value minus minimum value), and an impression of skew
through a visual comparison of the symmetry above and below the median
4. the size of the data set, which is an indication of the robustness of the
statistics
5. the statistical significance of the median.
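Notched box plots of this kind are available in standard plotting
libraries. A sketch using matplotlib and the Exhibit 5 phosphorus data
(matplotlib's default notch height, roughly ±1.57 I/√n, is numerically
equivalent to Equation 15 with C = 1.7, since 1.7 × 1.25/1.35 ≈ 1.57):

    import matplotlib.pyplot as plt

    # Phosphorus concentrations (ug/l) for the five lakes in Exhibit 5
    lakes = {
        "A": [5, 8, 11, 12, 15, 16, 7, 7, 7, 4, 6, 10, 11],
        "B": [18, 28, 15, 37, 25, 13, 93, 47, 25, 20, 22, 50, 40],
        "C": [180, 116, 176, 117, 118, 113, 115, 132, 125, 110, 115, 145, 140],
        "D": [54, 23, 49, 20, 34, 52, 27, 20, 46, 22, 25, 44, 38],
        "E": [115, 97, 84, 161, 116, 121, 174, 102, 91, 110, 88, 144, 153],
    }

    fig, ax = plt.subplots()
    ax.boxplot(list(lakes.values()), notch=True)
    ax.set_xticklabels(lakes.keys())
    ax.set_xlabel("Lake")
    ax.set_ylabel("Phosphorus concentration (ug/l)")
    plt.show()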
Box plots may be used for a variety of purposes both in the display of
data and in the examination of data. For example, Reckhow (1980) adds two
symbols to the box plot in Exhibit 6, representing average influent phosphorus
concentration and lake phosphorus concentration predicted to coincide with
significant hypolimnetic oxygen depletion. Since these modifications, coupled
with the box plot, probably represent a unique view of the data, new empirical
insights may be likely. A second addition to the box plot proposed by Reckhow
is an overlay of the prediction and prediction interval for a proposed
phosphorus lake model. This might represent another form of residuals
analysis in the model development process. Although the box (i.e., I) and the
prediction interval do not represent the same "level" of spread statistic, a
comparison of these two regions should enhance the traditional residuals
comparison of two points (predicted and observed location statistics).
Another use for the box plot has been recently proposed by Simpson and Reckhow
(1980) in their work on discriminant analysis of algal dominance in lakes.
They found the box plot extremely useful for the identification of variables
that may be used to discriminate between two pre-selected groups of cases.
These discriminating variables were identified by the degree of overlap of the
boxes and notches, when the box plots - one for each group and variable - are
compared (the greater the degree of overlap, the less discriminating the
variable). Undoubtedly other applications of the box plot will be proposed,
but even in its unmodified form, the box plot should become a standard method
for the presentation of data.
After data have been plotted and the shape and/or trend of the distribu-
tion of data have been ascertained from the graph(s), it is often desirable to
summarize the data in a few well-chosen statistics. These statistics should
be selected to represent certain "vague concepts" (Mosteller and Tukey, 1977)
concerning a set of data. The most important of the vague concepts are
"central tendency" and "spread." The central tendency, or center, of a set of
data can be represented by the mean, median, mode, geometric mean, and other
similar location statistics. The spread of a distribution of data is
indicated by the standard deviation, interquartile range, mean absolute
deviation, median absolute deviation, range, and other statistics representing
scale.
Since most scientists and engineers learn statistics from a basis of
normal distribution theory, there is a tendency to always summarize a set of
data with the mean and the standard deviation (or variance). This tendency
developed because the mean and standard deviation are "sufficient statistics"
for the normal distribution. In other words, the mean and the standard
deviation completely describe a distribution when it is normal. Unfor-
tunately, many sets of data representing actual limnological characteristics
exhibit highly non-normal distributions. In those situations, the vague
concepts become important, and sample statistics should be chosen to represent
central tendency, spread, and other relevant characteristics of the distri-
bution.
Candidate statistics are presented below for central tendency and spread.
Certain of these statistics are called "robust" because they represent the
appropriate vague concept well for a variety of distribution shapes.
Selection of the best statistic to quantify a particular vague concept is
dependent upon the distribution of the sample data, the need for statistic
robustness, and mathematical convenience. As a general rule, the normal
theory statistics (mean and standard deviation) are favored in situations when
sample data are roughly normal or uniform in distribution and/or when
mathematical tractability is important. Robust statistics (e.g., the median
and interquartile range) are generally preferred when the data describe a
skewed or irregularly shaped distribution, or when insufficient information is
available to characterize the shape of a distribution. See Reckhow and Chapra
(1980) for additional discussion concerning the choice of appropriate
statistics.
1. Measures of Central Tendency
a. Mean, $\bar{x}$:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \qquad (16)$$
The mean is the most commonly used location statistic. Note that
since the mean is an equally-weighted sum of the observations,
extreme values of $x_i$ can have a strong influence on $\bar{x}$. For this
reason, the mean is not robust under conditions of distribution
skew.
b. Median. The median is the middle value in a set of data when the
data are ordered from low to high. Since the median is unaffected
by the particular values assumed by the ordered data points, it is
robust in situations with extreme data (i.e., skewed distributions).
c. Mode. The mode is the single value most frequently observed. For a
probability density function or histogram, it corresponds with the
peak, or most likely value.
d. Geometric Mean. The geometric mean is equivalent to the antilog of
the mean of a set of log-transformed data. This is an important
statistic for many hydrologic and water quality variables that are
approximately characterized by a lognormal distribution. For log-
normally-distributed data, the geometric mean is probably the best
central tendency statistic.
2. Measures of Spread
a. Standard Deviation, s:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} \qquad (17)$$
Like the mean, the standard deviation is an often employed
statistic. Also like the mean, the standard deviation is not robust
under conditions of distribution skew. In particular, since the
deviations (from the mean) are squared, data points with large
deviations (outliers) have a strong impact on the magnitude of the
standard deviation.
b. Interquartile Range, I. When data are ordered from low value to
high value, the interquartile range is the difference between the
value at the 75% level and the value at the 25% level. Since the
interquartile range, like the median, is based upon order
statistics, it is robust in situations with extreme data.
c. Mean Absolute Deviation and Median Absolute Deviation. The absolute
deviation is defined as:

$$\text{Absolute Deviation} = |x_i - \bar{x}| \qquad (18)$$
The mean absolute deviation is the mean value among the absolute
deviation data points, while the median absolute deviation is the
median value among the absolute deviation data points. The choice
between these absolute deviation statistics is equivalent to the
choice between the mean and median as summarized above.
d. Range. The range is the difference between the highest value and
the lowest value. While it is an easy statistic to calculate, it is
obviously sensitive to extreme data. Nevertheless, the range is an
important indicator of distribution spread.
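For reference, a sketch computing each of the location and spread
statistics above is given below; note that, following Equation 18, both
absolute-deviation statistics here measure deviations from the mean:

    import numpy as np
    from scipy import stats

    def summarize(data):
        """Location and spread statistics discussed in this section."""
        x = np.asarray(data, dtype=float)
        q75, q25 = np.percentile(x, [75, 25])
        dev = np.abs(x - x.mean())           # absolute deviations (Eq. 18)
        return {"mean": x.mean(),
                "median": np.median(x),
                "geometric mean": stats.gmean(x),   # positive data only
                "standard deviation": x.std(ddof=1),
                "interquartile range": q75 - q25,
                "mean absolute deviation": dev.mean(),
                "median absolute deviation": np.median(dev),
                "range": x.max() - x.min()}

    # Lake B chlorophyll a data (ug/l) from Exhibit 5
    print(summarize([8.5, 4.2, 4.7, 35.3, 6.5, 12.1, 20.4, 20.4,
                     7.3, 8.2, 5.1, 15.0, 10.2]))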
Following the selection and calculation of sample statistics, there is
frequently a need to test or quantify certain relationships about the
population(s) of concern. This exercise could take the form of an hypothesis
test, confidence intervals, or perhaps a goodness-of-fit test. In this brief
discussion, the presentation is limited to two methods for hypothesis testing.
However, it is important to realize that sometimes confidence intervals are
more appropriate when comparing statistics or data sets. Reckhow and Chapra
(1980), Wonnacott and Wonnacott (1972), and other statistics texts (identified
at the end of this section) examine the pros and cons of hypothesis testing,
and suggest appropriate uses for confidence limits and hypothesis tests.
The tests presented below are the t-test, the standard statistical test
associated with normal distribution theory, and the Mann-Whitney or Wilcoxon
test, the most commonly applied nonparametric, or distribution-free test. For
either procedure, the test is begun with the establishment of a "null"
hypothesis. This null hypothesis is often proposed as a "straw man," based on
a suspicion that it is false. Competing with the null hypothesis for
acceptance is the alternative hypothesis. Under this scheme, then, there are
four possible outcomes associated with the fundamental truth or falsity of the
hypotheses, and the success or failure of the hypothesis testing.
The t-test is based on assumptions of sampling from normal distributions,
homogeneity of variances, and independent errors. The Mann-Whitney test is
based on an assumption of independent, identically-distributed errors. In the
discussion following the examples, we examine the degree to which one must
comply with these assumptions, and we comment on the proper interpretation of
the results of hypothesis testing.
Example 4
Use the t-test to test the null hypothesis, at the 95% level, that the
true mean chlorophyll a concentration in Lake B ($\mu_B$) is identical to that for
Lake C ($\mu_C$) in Exhibit 5.

$$H_0: \mu_B - \mu_C = 0$$
$$H_1: \mu_B - \mu_C \neq 0$$
For this problem, Student's t is calculated from:

$$t = \frac{\bar{x}_B - \bar{x}_C}{\sqrt{s^2 \left(\frac{n_B + n_C}{n_B n_C}\right)}} \qquad (19)$$

where:

$\bar{x}_B$, $\bar{x}_C$ = mean chlorophyll a concentrations (µg/l) for lakes B and
C (estimated from sample data)

$n_B$, $n_C$ = the number of chlorophyll a observations for lakes B and C

$s^2$ = the pooled within-group variance (sample statistic).

To determine the pooled within-group variance, we must first calculate the
sums of squares (ss) within each group.

$$ss_B = \sum x_B^2 - \frac{(\sum x_B)^2}{n_B} = (8.5)^2 + (4.2)^2 + ... + (10.2)^2 - \frac{(157.9)^2}{13} = 936.75$$

$$ss_C = \sum x_C^2 - \frac{(\sum x_C)^2}{n_C} = (65.6)^2 + (31.0)^2 + ... + (47.1)^2 - \frac{(410.8)^2}{13} = 3044.84$$

The pooled within-group variance is:

$$s^2 = \frac{ss_B + ss_C}{(n_B - 1) + (n_C - 1)} = \frac{936.75 + 3044.84}{12 + 12}$$

$$s^2 = 165.9$$
Thus:

$$t = \frac{12.1 - 31.6}{\sqrt{165.9 \left(\frac{13 + 13}{(13)(13)}\right)}}$$

$$t = -3.85$$

This value of t has $(n_B - 1) + (n_C - 1)$, or 24, degrees of freedom.
Consulting a t-table (two-tailed), we find that for 24 degrees of freedom,
this value of t is significant at the 99%+ level. This test supports
rejection of the null hypothesis that the means are equal.
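The same test is available in standard statistical libraries. A sketch
checking the computation with scipy (the Lake C value 65.6 follows the
worked sums of squares above):

    from scipy import stats

    lake_b = [8.5, 4.2, 4.7, 35.3, 6.5, 12.1, 20.4, 20.4, 7.3, 8.2,
              5.1, 15.0, 10.2]
    lake_c = [65.6, 31.0, 42.1, 30.2, 30.0, 14.2, 9.6, 25.9, 19.6,
              21.2, 23.0, 51.3, 47.1]

    # Pooled-variance two-sample t-test (Equation 19)
    t, p = stats.ttest_ind(lake_b, lake_c, equal_var=True)
    print(round(t, 2), round(p, 4))   # t = -3.85; reject H0 at the 95% level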
Example 5
Use the Mann-Whitney test to test the null hypothesis, at the 95% level,
that the mean chlorophyll a concentration in Lake B is identical to that for
Lake C in Exhibit 5.
$$H_0: \mu_B - \mu_C = 0$$
$$H_1: \mu_B - \mu_C \neq 0$$
The Mann-Whitney test is based on the W-statistic, which is the sum of the
combined ranks occupied by the data points from one of the samples. The
chlorophyll a observations are combined and ranked in Exhibit 7. To test the
hypothesis, the ranks, $R_i$, associated with Lake B are summed:

$$W = \sum_{i=1}^{n_B} R_i \qquad (20)$$

W (Lake B) = 110
At this point, the W-statistic may be compared to tabulated values to
determine its significance. Alternatively, for moderate to large samples
(n > 10), W is approximately normal (if $H_0$ is true). This means that the
W-statistic may be evaluated using a standard normal table and (Hollander and
Wolfe, 1973):

$$W^* = \frac{W - E(W)}{[\text{Var}(W)]^{0.5}} \qquad (21)$$

or:

$$W^* = \frac{W - [n_B(n_A + n_B + 1)/2]}{[n_A n_B (n_A + n_B + 1)/12]^{0.5}} \qquad (22)$$
Exhibit 7. Chlorophyll a observation ranks for the Mann-Whitney test.

Observation (µg/l)   Lake   Combined Rank
 4.2                 B       1
 4.7                 B       2
 5.1                 B       3
 6.5                 B       4
 7.3                 B       5
 8.2                 B       6
 8.5                 B       7
 9.6                 C       8
10.2                 B       9
12.1                 B      10
14.2                 C      11
15.0                 B      12
19.6                 C      13
20.4                 B      14
20.4                 B      15
21.2                 C      16
23.0                 C      17
25.9                 C      18
30.0                 C      19
30.2                 C      20
31.0                 C      21
35.3                 B      22
42.1                 C      23
47.1                 C      24
51.3                 C      25
65.6                 C      26
where:

$W^*$ is N(0,1) when $H_0$ is true
W is calculated using Equation 20
E(W) is the expected value for the W-statistic
Var(W) is the variance for the W-statistic
$n_A$, $n_B$ are the number of observations in samples A and B.

Since $n_B$, $n_C$ > 10 for the problem posed, the significance of the W-statistic
is determined using Equation 22.
$$W^* = \frac{110 - [13(13 + 13 + 1)/2]}{[(13)(13)(13 + 13 + 1)/12]^{0.5}} = -3.36$$
Consulting a standard normal distribution table (for a two-tailed test), it is
found that this value of W* is significant at the 99%+ level. The null
hypothesis is therefore rejected. Note the similarity in test statistic
values for the t-test and the W-test.
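A sketch of the same test with scipy follows; the library reports the
Mann-Whitney U statistic, from which the rank sum W of Equation 20 can be
recovered:

    from scipy import stats

    lake_b = [8.5, 4.2, 4.7, 35.3, 6.5, 12.1, 20.4, 20.4, 7.3, 8.2,
              5.1, 15.0, 10.2]
    lake_c = [65.6, 31.0, 42.1, 30.2, 30.0, 14.2, 9.6, 25.9, 19.6,
              21.2, 23.0, 51.3, 47.1]

    u, p = stats.mannwhitneyu(lake_b, lake_c, alternative="two-sided")
    n_b = len(lake_b)
    w = u + n_b * (n_b + 1) / 2        # W = U + n_B(n_B + 1)/2
    print(w, round(p, 4))              # W = 110.0; H0 rejected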
The assumptions inherent in the hypothesis tests, particularly in the
t-tests, are cause for possible concern because they may be difficult to
achieve. Fortunately, studies have been undertaken on the impact of violation
of the assumptions. For example, Box et al. (1978) note that it is the act of
"randomization" in experimental design, and not the use of a non-parametric
technique, that makes a procedure insensitive to distribution assumptions.
With randomization, Box and associates illustrate that both the t-test and the
Wilcoxon test are relatively insensitive to the shape of the parent
distribution, but they are both sensitive to serial correlation among the
observations. In addition, Boneau (1962) found that the t-test is quite
robust to violations of the assumptions of a normal parent distribution and of
equal variances. Boneau concluded his study by noting that while the t-test
should not be rejected because of concern over the aforementioned assumptions,
neither should the Wilcoxon test be rejected because it is supposedly less
powerful than the t-test. Both claims are sometimes false. The recommenda-
tion advanced here is proposed by Blalock (1972): apply both tests when in
doubt about the assumptions. If the study is well-documented and the results
of both a t-test and a Wilcoxon-Mann-Whitney test are reported, then the
reader is provided with sufficient information for the analysis of the
hypothesis.
In addition to concern over the assumptions, the user of an hypothesis
test must be careful in the interpretation of the results. Specifically, an
hypothesis test can be incorrect if we reject H0 when it is true (type I
error), or if we accept H0 when it is false (type II error). The "signifi-
cance level" (95% for the two examples) sets the probability of making a type
I error. Since the significance level is known approximately, we know how
often we are likely to reject H0 when it is true. However, type II error,
evaluated by a test's "power," is dependent upon the true, but unknown,
solution to the issue being tested. Therefore one cannot be certain of the
likelihood of committing a type II error. There are power curve methods for
estimating the probability of the type II error associated with true values
for the issue being tested (see Wonnacott and Wonnacott, 1972). However, in
the absence of these power determinations, the following recommendations are
made. When the designated significance level is exceeded, the null hypothesis
may be termed "rejected," and the significance level reported. Acceptance of
H0 is another matter, however. When the alternative hypothesis covers a range
of values (as in Examples 3 and 4), and the test statistic is not significant,
then it is probably best to state that "H0 cannot be rejected." The
alternative, "H0 is accepted," is too strong in the absence of power deter-
minations. Additional testing would then be required if a more definitive
conclusion is needed.
Hypothesis testing is a confirmatory method in data analysis. The study
of variable relationships may also occur in an exploratory mode as in certain
graphical and statistical techniques for the analysis of bivariate data.
Among these techniques are correlation analysis, regression analysis, and
bivariate plotting. Extension of the bivariate form of these techniques to
multivariate data is straightforward but is not discussed here.
Correlation and regression analyses are frequently used in limnology for
the examination of bivariate data. The correlation coefficient is a measure
of the strength of a linear association, and it is an indicator of the
predictive effectiveness of a regression equation. Regression analysis may be
used to quantify the functional relationship (either linear or nonlinear)
between two variables.
Most correlation and regression analyses are conducted with the aid of a
calculator or digital computer. It is unnecessary, therefore, to dwell on
the mathematics of these techniques. The analyst of limnological data using
one of these methods would be wise to devote some effort to understanding the
assumptions inherent in regression and correlation analyses which may guide
him/her in the interpretation of the results. For example, both regression
and correlation are sensitive to trend outliers. As a result, robust methods
have been proposed, in the form of rank-order correlation (Snedecor and
Cochran, 1967) and robust regression (Reckhow and Chapra, 1980). Adherence to
methodological assumptions is an important topic yet it is beyond the scope of
this limited treatment. Therefore it is recommended that the analyst consult
Reckhow and Chapra (1980), Kleinbaum and Kupper (1978), Wonnacott and
Wonnacott (1972), Mosteller and Tukey (1977), or some other text that addresses
the interpretation of correlation and regression relationships. Reckhow and
Chapra (1980) provide an example illustrating how regression analyses can be
quite misleading when the relationships are interpreted and applied, unless
attention is paid to the assumptions.
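The robustness argument is easily demonstrated. In the sketch below (with
invented illustrative data), a single outlier depresses Pearson's r, while the
rank-order (Spearman) coefficient, which is Pearson's r computed on the ranks,
is unaffected:

    # Pearson vs. Spearman correlation when a single outlier distorts a trend.
    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        syy = sum((b - my) ** 2 for b in y)
        return sxy / (sxx * syy) ** 0.5

    def ranks(x):
        order = sorted(range(len(x)), key=lambda i: x[i])
        r = [0.0] * len(x)
        for rank, i in enumerate(order, start=1):
            r[i] = float(rank)
        return r

    def spearman(x, y):
        # Spearman's rank-order coefficient: Pearson's r on the ranks.
        return pearson(ranks(x), ranks(y))

    phosphorus  = [5, 10, 15, 20, 25, 30, 35, 40]   # hypothetical TP (ug/l)
    chlorophyll = [2, 4, 7, 9, 12, 14, 17, 150]     # last value is an outlier
    print(f"Pearson r  = {pearson(phosphorus, chlorophyll):.2f}")    # ~0.66
    print(f"Spearman r = {spearman(phosphorus, chlorophyll):.2f}")   # 1.00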
In this brief presentation of data analysis, with an obvious emphasis
toward concepts and robust methods, it seems appropriate to devote most of the
bivariate relationships subsection to a discussion of bivariate plots. The
analysis of bivariate relationships is quite common in limnological studies.
For example, the trophic state index described in the next section is based
upon three bivariate relationships among phosphorus concentration, chlorophyll
a level, and Secchi disc depth. In the lake modeling field, modelers have
debated the relationships between phosphorus concentration and mean depth,
phosphorus concentration and areal water loading, and phosphorus concentration
and hydraulic detention time. Often in these studies, correlation coef-
ficients or regression equations are used in support of a bivariate
relationship. The bivariate plot is also sometimes used, and it can be quite
effective both in exploratory work to uncover relationships and in diagnostic
work to study and check identified relationships. In fact it is recommended
here that bivariate plotting be a standard feature of bivariate or multi-
variate data analysis. Reliance on statistics alone (e.g., on correlation
coefficients only) can result in inaccurate analyses, as statistics can mask
unusual data set characteristics that are quite evident when graphed (see
Reckhow and Chapra, 1980).
Limnological data analysis has somewhat unusual features that might be
studied using bivariate plots. Specifically, limnological data are often
collected on a cross-section of lakes and then used to analyze relationships
in a single lake longitudinally, or over time. In the original cross-
sectional analysis, each data point is not a single observation but rather a
summary statistic (for location) representing several observations. So, there
are two issues hidden in many bivariate limnological studies:
1. Is limnological behavior that is identified in a cross-sectional
(multi-lake) analysis meaningful when applied to a single lake over
time?
2. Is information lost when only summary statistics (for location) are
used in (multi-lake) cross-sectional studies? If so, are there
methods for recovering and examining this information while
preserving the basic features of the cross-sectional study?
While we cannot provide a definitive answer to these questions (in part,
because they are somewhat application-specific), an exploratory method related
to the box plot yields some insight. It is based on a graphical analysis of
the five order statistics (median, quartiles, and extreme values) employed in
the box plots. As an example, the medians, quartiles, and extreme values are
determined for the phosphorus data and for the chlorophyll a data presented in
Exhibit 5. These statistics are then paired for each lake and plotted in
Exhibits 8 and 9. In Exhibit 8, the five order statistics are connected for
single lakes, while in Exhibit 9 the points are connected on the basis of
matching statistics (medians with medians, etc.), across lakes. (Not all of
the points in Exhibit 9 are connected by lines. A visual smoothing technique
was employed to produce convex sections around the central tendency line. See
Tukey, 1977, for simple mathematical methods for smoothing curves.)
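The construction of such plots is straightforward. The sketch below (with
hypothetical single-lake data) computes the five order statistics for each
variable and pairs them; for one lake, the five pairs form one of the
connected lines of Exhibit 8:

    # Pair the five order statistics of two separately ordered variables.
    def five_statistics(data):
        x = sorted(data)
        n = len(x)
        def quantile(p):
            # simple linear interpolation between order statistics
            h = p * (n - 1)
            lo = int(h)
            return x[lo] + (h - lo) * (x[min(lo + 1, n - 1)] - x[lo])
        return [x[0], quantile(0.25), quantile(0.5), quantile(0.75), x[-1]]

    phosphorus  = [14, 18, 22, 25, 27, 30, 33, 38, 44, 60]    # hypothetical (ug/l)
    chlorophyll = [3.1, 4.0, 5.2, 6.0, 6.8, 7.5, 9.1, 11.0, 14.2, 21.5]

    pairs = zip(five_statistics(phosphorus), five_statistics(chlorophyll))
    for p, ca in pairs:   # minimum, lower quartile, median, upper quartile, maximum
        print(f"TP = {p:5.1f}  CA = {ca:5.1f}")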
There are a number of attributes of these plots worth exploring. First,
the central tendency line (the line connecting the medians) in Exhibit 9 is
equivalent to the standard trend line for cross-sectional regressions of
chlorophyll a and phosphorus. The convex quartile and range lines surrounding
the median provide an indication of variability to be expected within a single
lake. Note that this is different from the scatter of data found in a cross-
sectional regression, which is a function of variability among lakes. In
Exhibit 8, the slopes of the lines suggest the chlorophyll-phosphorus
relationship within lakes. (It must be remembered, however, that the data
points do not actually represent paired observations. Rather, the phosphorus
and chlorophyll a data were ordered separately and then paired as order
statistics, i.e., median with median.) A comparison of the slopes for single-
lake relationships (as in Exhibit 8) with the slope of the multi-lake cross-
sectional central trend is important.
equivalent, the multi-lake relationship is informative for single lake trend
analysis. When the slopes are different, the multi-lake trend is misleading.
In either case, the multi-lake variability (which represents cross-sectional
differences, in part) and multi-lake prediction error are probably not too
indicative of single lake variability. Thus predictive equations for
bivariate relationships within lakes should probably be developed, when
possible, from single lakes or highly homogeneous data for unbiased, minimum
uncertainty predictions.
Exhibit 8. Chlorophyll a versus total phosphorus: five order statistics
(median, quartiles, and extreme values) connected within single lakes (figure).
Exhibit 9. Chlorophyll a versus total phosphorus: five order statistics
connected across lakes by matching statistics (medians with medians, etc.),
with a smoothed central tendency line (figure).
Before ending this brief treatment of data analysis, some statistical
references should be mentioned and briefly annotated. Reckhow and Chapra
(1980) contain several chapters on data analysis and empirical modeling
presented in a style and philosophy similar to the approach employed in this
section. Tukey (1977) and Hosteller and Tukey (1977) are excellent references
on exploratory data analysis, while Hosteller and Rourke (1973) and Hollander
and Wolfe (1973) present nonparametric methods. Chatterjee and Price (1977)
and Kleinbaum and Kupper (1978) are excellent in their treatment of applied
regression analysis. Experimental design and other topics are covered in Box
et aJL (1978). Finally, Snedecor and Cochran (1967) and Wonnacott and
Wonnacott (1972) are good, general references for several topics in statistics
and data analysis.
In conclusion, three recommendations for data analysis should be apparent
from the placement of the emphasis in this section.
1. Select summary statistics according to the vague concept criterion.
That is, the statistic chosen to represent a data set should be the
best choice because it represents the concept (e.g., location) best,
and not because it is the natural choice in traditional statistical
analyses (i.e., normal distribution theory).
2. When in doubt about the underlying distribution of a set of data,
use robust statistics and methods.
3. The plotting of univariate, bivariate, and multivariate data is an
essential step in statistical analysis.
4. Indices of Lake Water Quality
Considerable attention in the previous section was devoted to methods and
statistics for summarizing data. Probably the most common of the summary
statistics are the various measures of location, such as the mean, median, and
mode. From an information perspective, we would call these location
statistics univariate indices.
An index is a summary statistic. Since it is rarely a sufficient
statistic, it contains less information than is available in the data set that
it summarizes. A univariate index is a location statistic for a single
variable, such as mean phosphorus concentration. A multivariate index is a
single number chosen to summarize data on two or more variables. It is the
multivariate index that is the focus of this section, although the univariate
analogy is sometimes useful for discussion purposes.
Indices are used presumably because the convenience of summarizing
information in a single number outweighs the disadvantage of information lost
due to the act of summarization. It was pointed out in Section 1 that lakes
provide for multiple uses which makes lake water quality a use-specific, or
perhaps a problem-specific, attribute. A true water quality index, therefore,
is multidimensional. The naturally subjective decisions as to which variables
should be part of a water quality index and what schemes should be used to
combine the variables have been largely responsible for the dearth of widely
used indices.
In this section, as in other sections of this discussion, we consider a
specific lake quality problem: eutrophication. Now, the water quality index
may be renamed a trophic state index (TSI). In addition, the index is reduced
to essentially a single dimension associated with trophic state. Within this
single dimension the index may still be multivariate, which means that
variables are highly intercorrelated, representing the same basic concept
(eutrophication).
A number of attempts have been made to establish a trophic state index as
a function of commonly measured water quality variables. The EPA National
Eutrophication Survey (1974) has compared the work of some investigators
(Sakamoto, 1966; National Academy of Sciences, 1972; and Dobson et al., 1974)
on chlorophyll a levels versus trophic state. This comparison is presented in
Exhibit 10a.
The EPA's own estimates of values of chlorophyll a, total phosphorus, and
Secchi disc depth indicative of trophic states are presented in Exhibit 10b.
Exhibit 10a. Trophic state vs. chlorophyll a (from EPA-NES, 1974).

                                 Chlorophyll a (μg/l)
    Trophic
    Condition        Sakamoto     Academy     Dobson      EPA-NES

    Oligotrophic     0.3-2.5      0-4         0-4.3       <7
    Mesotrophic      1-15         4-10        4.3-8.8     7-12
    Eutrophic        5-140        >10         >8.8        >12
Exhibit 10b. EPA-NES trophic state delineation (from EPA-NES, 1974).

                     Chlorophyll a    Total Phosphorus    Secchi Disc
    Trophic State    (μg/l)           (μg/l)              Depth (m)

    Oligotrophic     <7               <10                 >3.7
    Mesotrophic      7-12             10-20               2.0-3.7
    Eutrophic        >12              >20                 <2.0
While there have been other attempts at single variable trophic state
criteria (or indices), all are relatively similar in approach (see Exhibit
10). More importantly, they represent subjective judgment, and possibly
limited geographic regions, so it is unlikely that universal agreement will
rest on one approach. Therefore, the selection of a univariate trophic state
criterion should be based primarily on personal acceptance and credibility.
More robust trophic state criteria or indices may be developed with a
multivariate approach. Shannon and Brezonik (1972) constructed a trophic
index for Florida lakes composed of the variables: primary production (PP, in
mg of carbon per cubic meter-hour), chlorophyll a (CA, in mg/m3), total
organic nitrogen (TON, in mg/l as N), total phosphorus (TP, in mg/l as P),
Secchi disc transparency (SD, in meters), specific conductance (COND, in
μmho/cm), and a cation ratio (CR, a dimensionless ratio of (Na + K)/(Ca +
Mg)). For lakes without appreciable organic color, the trophic state index
(TSI) was estimated as:

    TSI = 0.936 (1/SD) + 0.827 (COND) + 0.907 (TON)
          + 0.748 (TP) + 0.938 (PP) + 0.892 (CA)                        (23)
          + 0.579 (1/CR) + 4.76
A TSI of about 3 to 5 defines the transition zone between eutrophy and
mesotrophy, and a TSI of 1.2 to 1.3 separates the mesotrophic and oligotrophic
classes.
The index was developed using principal component analysis, and the TSI
is the first principal component. This technique may be used to identify
"common elements" among variables, and the first principal component is a
linear combination of the variables that best describes the most common
element. When all of the variables in an analysis are thought to be good
indicators of a concept called trophic state, then it is reasonable to assume
that the most common element extracted from this set of variables (the first
principal component) would be a good index of trophic state. In fact, this
component is more "robust" than any one variable as an indicator of trophic
state. This means that it is less likely than a single-variable index to
misclassify a lake based on an erroneous measurement. Incorrect data on one
variable can lead to misclassification based on that variable, but it may not
lead to misclassification if the classification criterion is based on other
variables (correctly measured) as well.
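A minimal sketch of this construction follows, assuming a small hypothetical
data matrix; the first principal component of the standardized trophic
variables supplies the index weights:

    # First principal component as a multivariate trophic index (sketch).
    import numpy as np

    # rows = lakes; columns = chlorophyll a, total P, 1/Secchi depth (hypothetical)
    data = np.array([[ 2.0,  8.0, 0.2],
                     [ 5.0, 15.0, 0.4],
                     [ 9.0, 25.0, 0.7],
                     [20.0, 60.0, 1.2],
                     [45.0, 95.0, 2.0]])

    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)  # standardize
    corr = np.cov(z, rowvar=False)            # correlation matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    first_pc = eigvecs[:, -1]                 # eigenvector of the largest eigenvalue
    if first_pc.sum() < 0:                    # orient so larger scores = more eutrophic
        first_pc = -first_pc
    tsi = z @ first_pc                        # index score for each lake
    print("weights:   ", np.round(first_pc, 3))
    print("TSI scores:", np.round(tsi, 2))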
Despite the fact that a principal component trophic state index has this
desirable feature of robustness, the TSI proposed by Shannon and Brezonik
cannot be recommended for use on north temperate lakes. The TSI was developed
from a data base of Florida lakes only, and the significant climatic (and thus
thermal) difference between that area and the north temperate region is likely
to affect the index. Since this effect is unclear, we are unable to interpret
the TSI in north temperate lakes. Equally important, most of the trophic
variables are log-normally distributed, which means that the best estimate for
the TSI should be made under a logarithmic transformation for these variables.
Without this transformation (as in the case of Shannon's and Brezonik's TSI),
the index may be biased and may appear misleadingly precise.
A trophic state index has been proposed by Carlson (1977) that may also
be considered multivariate. Carlson's index may be estimated from summer
values of Secchi disc depth (SD, in meters), summer total phosphorus con-
centration (TP, in mg/m3) or summer chlorophyll a concentration (CA, in
mg/m3), or a weighted combination of all three. Carlson used regression
analysis to relate Secchi disc depth to total phosphorus concentration and to
chlorophyll a concentration. He then reasoned that a doubling of biomass
levels, or a halving of the Secchi disc depth, corresponds to a change in
trophic state. Carlson assigned a TSI scale of 0-100 to the three trophic
variables, such that a change of 10 units in TSI corresponds to a halving of
the Secchi disc depth and a change in trophic state. The regression equations
presented below were then used to relate the TSI to phosphorus and
chlorophyll.
    TSI = 60 - 14.41 ln SD = XSD                                        (24)

    TSI = 9.81 ln CA + 30.6 = XCA                                       (25)

    TSI = 14.42 ln TP + 4.15 = XTP                                      (26)
Exhibit 11 contains the index values and variable relationships.
Exhibit 11. Carlson's trophic state index.

            Secchi        Surface         Surface
    TSI     Disc (m)      Phosphorus      Chlorophyll
                          (mg/m3)         (mg/m3)

      0     64            0.75            0.04
     10     32            1.5             0.12
     20     16            3               0.34
     30     8             6               0.94
     40     4             12              2.61
     50     2             24              7.23
     60     1             48              20
     70     0.5           96              55.5
     80     0.25          192             154
     90     0.125         384             426
    100     0.0625        768             1,180
Carlson's TSI may be estimated from any of the three variables, using
Exhibit 11. Carlson felt that this was important as:
1. Secchi disc readings may be misleading as a trophic state indicator
in colored lakes or highly turbid (non-algal) lakes.
2. Chlorophyll a may be the best indicator during the growing season.
3. Phosphorus may not be a good indicator in non-phosphorus limited
lakes.
Thus different variables may be used depending upon the season, lake, and
availability and quality of data. While Carlson suggests that the variable
that the index is based on be selected on a pragmatic basis, he recommends
that consideration be given to chlorophyll in the summer and to phosphorus in
the fall, winter, and spring.
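Equations 24-26 are simple enough to evaluate directly, as the following
sketch illustrates for a hypothetical lake whose three trophic variables
happen to be mutually consistent (SD = 2 m, CA = 7.2 mg/m3, TP = 24 mg/m3):

    # Carlson's TSI from each of the three trophic variables (Equations 24-26).
    from math import log

    def tsi_from_secchi(sd):       # Equation 24; SD in meters
        return 60.0 - 14.41 * log(sd)

    def tsi_from_chlorophyll(ca):  # Equation 25; CA in mg/m3
        return 9.81 * log(ca) + 30.6

    def tsi_from_phosphorus(tp):   # Equation 26; TP in mg/m3
        return 14.42 * log(tp) + 4.15

    print(f"TSI(SD) = {tsi_from_secchi(2.0):.1f}")        # ~50
    print(f"TSI(CA) = {tsi_from_chlorophyll(7.2):.1f}")   # ~50
    print(f"TSI(TP) = {tsi_from_phosphorus(24.0):.1f}")   # ~50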
Recently, Porcella et al. (1980) have proposed a "Lake Evaluation Index"
(LEI), based in part on Carlson's trophic state index, to be used to describe
the effectiveness of lake restoration programs. The LEI, which Porcella et
al. admit is still under development, is composed of 5-6 variables (all
measured, preferably, between 1000 and 1400 standard time). They are:
1. Secchi Depth (SD). The LEI value (XSD), calculated in Equation 24,
is based on the mean SD measured during the months of July and
August. Color may be important (see below), so when present, it
should be documented.
2. Total Phosphorus (TP). The LEI value (XTP) is calculated from the
mean TP measured during July and August in the epilimnion. Equation
26 provides the index value.
3. Total Nitrogen (TN). At present nitrogen is not part of the LEI.
However, a nitrogen index statistic has been determined from the
mean TN measured during July and August in the epilimnion (in
mg/m3). This statistic is:
    XTN = 14.427 ln TN - 23.8                                           (27)
4. Chlorophyll a (CA). The LEI value (XCA) is calculated from the mean
CA measured during July and August in the epilimnion. Equation 25
provides the index value.
5. Dissolved Oxygen (DO). The LEI value (XDO) is equal to ten times
net DO (in mg/1), which is calculated from July-August data.
                 zmax
    net DO  =     Σ    (EDO - CDO)i / ΔVi                               (28)
                 i=0
where:
    zmax = maximum depth
    i = index of depth contours
    ΔVi = volume of depth contour i
    EDO = equilibrium DO, calculated from atmospheric pressure and
          temperature-depth profiles (kg/lake)
    CDO = total lake DO (kg/lake).
Volume sections should be selected so that supersaturation and
undersaturation do not cancel, if present, since they both are often
indicative of quality deterioration. This can be accomplished by
placing these "quantities" in different volume sections.
6. Macrophytes (MAC). The LEI value for macrophytes (XMAC) is defined
as the percent area deemed "available" for macrophyte growth that is
actually occupied by macrophytes. This available area is considered
to be "the area encompassed by the lake margin and either the 10
meter line or the depth at which light becomes limiting to vascular
plant distribution and growth (2 times SD), whichever is shallower"
(Porcella et al_., 1980).
The six "x-values" presented above convert the LEI variables to a 0-100
scale. These relationships are presented in Exhibit 11 for Carlson's
variables and in Exhibit 12 for the other three variables. The actual LEI
proposed by Porcella et al. is a composite variable, also on a 0-100 scale.
Exhibit 12. Rating scale for certain LEI variables
(from Porcella et al., 1980).

                                                       Macrophytes
    Rating                  Total N       Net DO       (% Available Lake
    (X)                     (mg/m3)       (mg/l)       Area Covered)

      0 (minimally
         impacted)          5.2            0.0            0
     10                     10.4           1.0           10
     20                     20.8           2.0           20
     30                     41.6           3.0           30
     40                     83.2           4.0           40
     50                     167.           5.0           50
     60                     333.           6.0           60
     70                     666.           7.0           70
     80                     1330.          8.0           80
     90                     2670.          9.0           90
    100 (maximally
         impacted)          >5330.        >10.0         100
    LEI = 0.25 [0.5 (XCA + XMAC) + XDO + XSD + XTP]                     (29)
For both the LEI and the individual X-values specified above, an index value
of less than 40-45 represents oligotrophy and an index value of greater than
50 represents eutrophy. Porcella et al. emphasize that the LEI is meant more
as a measure of lake restoration effectiveness than as a trophic state index
per se. Its usefulness, and with it its acceptance, in either role remains
to be seen.
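As an illustration of Equation 29, the sketch below combines a set of
hypothetical component index values into the LEI:

    def lei(x_ca, x_mac, x_do, x_sd, x_tp):
        # Equation 29: chlorophyll and macrophytes share a weight, since both
        # express the biological response of the lake.
        return 0.25 * (0.5 * (x_ca + x_mac) + x_do + x_sd + x_tp)

    # hypothetical component index values, each on the 0-100 scale
    score = lei(x_ca=60.0, x_mac=50.0, x_do=55.0, x_sd=58.0, x_tp=62.0)
    print(f"LEI = {score:.1f}")   # 57.5 here, which would suggest eutrophy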
Carlson's index is based around Secchi disc transparency. Recently, some
investigators (Lorenzen, 1980; Megard et al., 1980; and Edmondson, 1980) have
suggested that a bias may exist in Secchi disc-based cross-sectional trophic
studies due to non-algal turbidity and color. Single lake longitudinal
relationships (see Section 3) are recommended instead.
An alternative index system has been proposed by Walker (1979), who also
recognized the problems inherent in Carlson's TSI due to non-algal light
attenuating factors. Walker's index is based around chlorophyll a, which is
probably less influenced by non-biomass factors than is Secchi disc depth.
The components of Walker's index are:

    ICA = 20.0 + 14.42 ln CA                                            (30)

    ITP = -15.6 + 20.02 ln TP                                           (31)

    ISD = 75.3 + 19.46 ln (1/SD - a)                                    (32)

where a is a term (m^-1) representing the non-algal influence on transparency.
Walker's index (IW) is then simply:

    IW =                                                                (33)
Much of the advantage of a continuous index scale is removed, however, by the
tendency of researchers to mentally convert these index units back to one of
the three standard trophic states for ease of interpretation. In summary,
then, the selection of a trophic state index from
among those discussed in this section should probably be made on the basis of
familiarity by the users, since no single index conveys appreciably more
information than any of the others.
5. Acquisition of Nutrient Budget Data
A necessary step in lake quality management planning is an analysis of
how present, and projected, watershed characteristics and activities affect
water quality. Given the construction of most trophic state assessment
schemes and the seasonal variability of nutrient sources, information on
nutrient flux is most useful when it is acquired in yearly increments. When
the issue of concern relates to present land use, the acquisition and
examination of existing nutrient flux data or the sampling of nutrient sources
on an annual basis is appropriate. When the latter course of action is
chosen, the methods described in Section 2 and at the end of this section are
useful.
Alternatively, water quality management planning for projected land uses
necessitates an "indirect" assessment of the annual nutrient budget. Since
measurements cannot be made for these nonexistent land uses, nutrient flux
estimates must be determined from the literature reporting measurements taken
at another location and/or time. Actually, the literature may be consulted on
nutrient export coefficients for all nutrient budget assessments (present or
projected). It must be noted, however, that use of non-application-specific
data has an associated risk. That is, if the literature values are not
representative of the application case, then bias is introduced into the
analysis. This creates risk. When the analyst has a choice (e.g., when
studying the impact of present land uses), the increased risk due to use of
literature export coefficients must be evaluated against the increased cost of
nutrient flux sampling. This is simply one of many situations in planning
when expected outcomes need to be examined so that the trade-off between cost
and risk may be analyzed. Clearly it is difficult to introduce much rigor or
precision into this trade-off study. However, even some rough calculations of
cost versus risk associated with alternative sources of nutrient budget data
may greatly improve planning. See Reckhow and Chapra (1980), Chapter 1, for a
discussion of this and other issues important in water quality modeling and
planning studies.
The selection of appropriate nutrient export coefficients is a difficult
task. Proper choice of export coefficients is a function of knowledge of the
application lake watershed and knowledge of the watersheds of candidate export
coefficients. It is through comparisons of these watersheds that the analyst
arrives at the appropriate coefficients. Since a critical aspect of a
watershed analysis/modeling exercise is the estimation of prediction error
(see Section 6), the analyst should realize that poor choice of export values
contributes to an increase in error. This contribution may be explicit or
implicit in the analysis, depending upon whether or not the analyst is aware
of all of the uncertainty introduced by his/her choice of phosphorus export
coefficients. Clearly, experience in the application of this modeling
approach is a valuable attribute. Information on nutrient export coefficients
is available in Reckhow et al. (1980) which contains both a presentation of
candidate export coefficients and a description of the watershed character-
istics for the candidate coefficients.
Direct assessment of a lake's nutrient budget or of the nutrient flux
emanating from specific sources requires careful planning. Application of the
sampling design relationships presented in Section 2, or of the concepts
important in sampling design, can lead to efficient sampling programs based
upon explicit trade-offs among different sampling schemes. In addition, an
estimate of the uncertainty associated with carefully gathered data on
nutrient flux is valuable information for use in the models and classification
schemes presented in Sections 4 and 6.
Lake phosphorus budget sampling design is discussed in considerable
detail by Reckhow (1978e). The remainder of this section contains a summary
of some of the issues presented in that paper. Major sources of phosphorus
considered were tributaries, sewage treatment plants, urban runoff,
precipitation, septic tanks - groundwater, and lake sediments. For each
source, the sampling design was based on an estimation technique, or model,
that converted the gathered data to an annual phosphorus flux estimate.
Concurrent with the design of a nutrient flux sampling program should be
the consideration of nutrient flux estimation techniques. Flux may be
estimated directly (as it is, generally, when the literature is consulted for
phosphorus loading estimates), or it may be determined from separate assess-
ments of nutrient concentration and volumetric water flow rate. The
estimation of flux from concentration and flow data, in turn, may be
accomplished in several ways (Reckhow, 1979e). Care must be observed in the
flux estimation procedure because certain procedures favor certain sampling
designs and because poor choice of estimation procedures can lead to bias and
greater uncertainty in the nutrient loading estimate.
Phosphorus flux in lake tributaries has been studied extensively, and
thus there is a substantial quantity of literature that may be used for the
estimation of the expected magnitude and variability of that flux. The EPA
National Eutrophication Survey is a good source of data, and many of the
EPA-NES streams have been classified by land use (Omernik, 1977). In general,
total phosphorus concentration (in streams) decreases with flow in streams
impacted by a sizeable point source, and increases with flow in streams
undisturbed by major point sources. On that basis, phosphorus flux is
probably best estimated by multiplying average flow times the flow-weighted
concentration or by a regression equation of flux on flow. Since those
calculations of flux require information on flow, it is recommended that
continuous flow measurements be made, or that a regression equation (of flow
on precipitation and watershed characteristics) be used to provide flow data.
Regression equations like that described are available from the U.S.
Geological Survey. Sampling for concentration should be allocated among
tributaries using stratified random sampling, and it should probably occur on
2-4 week intervals (with a random start, and allocated according to seasonal
flow variations). More frequent sampling results in auto-correlation among
samples, and less frequent sampling may result in considerable error.
Finally, some consideration should be given to sampling major storm events, as
a large percentage of the phosphorus loading may occur during those times.
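A sketch of the flow-weighted estimator recommended above follows; the
grab-sample concentrations, paired flows, and mean annual flow are all
hypothetical:

    # Annual flux ~= mean annual flow x flow-weighted mean concentration.
    samples = [  # (flow at sampling time, m3/s; total P concentration, mg/l)
        (0.8, 0.020), (2.5, 0.035), (4.1, 0.048), (1.6, 0.028),
        (0.9, 0.022), (3.3, 0.041), (1.2, 0.025), (0.7, 0.019),
    ]

    flow_weighted_conc = (sum(q * c for q, c in samples) /
                          sum(q for q, _ in samples))       # mg/l
    mean_annual_flow = 1.9   # m3/s, from a continuous gauge record (assumed)

    seconds_per_year = 365.25 * 24 * 3600
    # mg/l x m3/s = g/s; convert to kg/yr
    flux_kg_per_yr = flow_weighted_conc * mean_annual_flow * seconds_per_year / 1000.0
    print(f"flow-weighted concentration = {flow_weighted_conc:.4f} mg/l")
    print(f"estimated annual P flux     = {flux_kg_per_yr:.0f} kg/yr")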
Much data also exist on wastewater treatment plants, and again the EPA-
NES is a good source. Treatment plant data exhibit a distinct diurnal cycle,
so composite sampling is preferable. Phosphorus flux estimates may be made
from the flow-weighted concentration times flow (continuous flow data should be
available). Existing EPA-NES data indicate that the average phosphorus
concentration varies considerably from plant to plant, while the coefficient
of variation of phosphorus concentration generally lies between .3 and .5.
Sampling among plants should be based on stratified random design, while
sampling over time should be based on random sampling to reach a desired or
minimum precision.
Urban runoff sampling clearly must be geared to storm events. Insuf-
ficient data exist to guide sampling designs in most situations. Therefore,
only some general recommendations can be made. Automatic sampling may be most
effective, since human response to a storm may miss a portion of the "first
flush." Composite sampling for concentration may be used to estimate flux, as
average flow times average concentration. Grab sampling can be used to fit an
exponentially-decaying concentration model (Marsalek, 1975), that may be used
to estimate flux with continuous flow data.
Existing data on phosphorus in bulk precipitation (precipitation plus dry
fallout) indicate considerable variability from year-to-year, site-to-site,
and storm-to-storm. Bulk precipitation phosphorus results from industrial air
pollution, bare agricultural fields, dirt roads, etc. In many lakes,
precipitation is a relatively minor source of phosphorus. Thus, literature
values for precipitation phosphorus (Reckhow et al., 1980) should probably be
compared to the expected flux of phosphorus (to the lake of study) from other
sources before a sampling program is undertaken for this source.
No satisfactory techniques have yet been developed to measure phosphorus
flux to a lake from septic tanks and groundwater. The most common technique
used is a soil retention coefficient, specific to a soil type. However, a
constant soil retention does not consider the time-dependency of retention,
the total volume of soil through which phosphorus in solution must pass, and
the loading of phosphorus to the soil. Probably a better technique at this
time is a system of "seepage meters" in the shallow lake sediments and wells
immediately onshore (Lee, 1977). The seepage meters are used to measure
groundwater flow (assumed to decrease exponentially with distance from shore),
and the wells are used to measure phosphorus concentration. Unfortunately,
insufficient data exist to design this program, but the concepts of stratified
random sampling (magnitude, variability, and cost) suggest that sampling units
should be most dense in areas with the greatest density of septic tanks and in
areas with soils of lowest retention coefficients.
Finally, the lake sediments are another source of phosphorus that is not
well-defined. As a rule, the sediments are considered to be a significant
source only under anaerobic conditions. However, studies indicate (Snow and
DiGiano, 1976) that aerobic sediments often release phosphorus also. Esti-
mation techniques, such as a constant daily release of phosphorus, or release
proportional to the concentration gradient between the water column and the
interstitial water, have been proposed (Reckhow, 1978e). Experimental
procedures have been developed for both the laboratory and the field (Snow and
DiGiano, 1976). It is suggested that "typical" release rates, presented in
Reckhow (1978e) and Snow and DiGiano (1976), be compared to expected
phosphorus flux from other sources, before a sampling program is undertaken
for the lake sediments.
As an example of lake phosphorus budget sampling design, the following
analysis was conducted to guide the sampling of phosphorus flux to Lake
Winnipesaukee in New Hampshire (Reckhow and Rice, 1975; Reckhow, 1978e). This
analysis emphasized the concepts of stratified random sampling (base a sample
design on flux magnitude, variability, and sample cost); it did not consist of
the explicit trade-offs and computations that might be possible with the
material presented above. Nonetheless, it does show, in a general sense, how
sampling design may develop.
Exhibit 13 presents the mean, standard deviation, and precision of
existing estimates for phosphorus flux from the tributaries and the wastewater
treatment plants. Informed judgment yielded the magnitude and range estimates
for the other three phosphorus sources; data were deemed insufficient to
specify these terms more precisely. This table, then, provided the basis for
general sampling design recommendations, summarized in the following
statements.
Exhibit 13. Initial uncertainty estimates for Winnipesaukee phosphorus loading.

                                           Prior Estimates
                                                     Standard
                                      Coefficient    Error of        Estimated
    Term              Magnitude       of Variation   the Mean (%)    Range

    1. Tributary
       Flux           16,000 lb P/yr      .65            ±20
    2. Septic Tanks                                                  4,000-30,000
                                                                     lb P/yr
    3. Sewage Treat-
       ment Plants    22,000 lb P/yr      .30            ±10
    4. Precipitation                                                 4,000-7,000
                                                                     lb P/yr
    5. Sediment
       Release                                                       700-7,000
                                                                     lb P/yr
1. Existing estimates of the phosphorus flux from tributaries and
sewage treatment plants may be sufficient (i.e., no additional
sampling necessary), if they were obtained with an unbiased sampling
design, and if significant changes (land use, etc.) have not occur-
red.
2. Considerable sampling effort should be devoted to estimating the
mean and variance in phosphorus flux from septic tanks.
3. The other sources of phosphorus (sediments and precipitation) should
be investigated through the literature, but they may not require
sampling.
4. If tributary sampling is undertaken:
a) spatial coverage should be based on stratified random sampling
design (which may result in no sampling in the smallest streams
that are not culturally impacted).
b) temporal coverage should consist of a sampling interval of 2-3
weeks, with sampling being more frequent during high runoff
months and less frequent during months of low runoff.
In conclusion, a good sampling design requires the following information:
1. Prior knowledge of the factors that affect the characteristic(s) to
be sampled (e.g., sources of phosphorus for a phosphorus budget).
2. Some knowledge of the magnitude and variability of the character-
istic(s) to be sampled.
3. Pre-specified needs for the collected data. For example, phosphorus
flux data may be used for a year-of-sampling estimate or for future
predictions. Different designs and estimation techniques may be
appropriate for each of these applications.
4. A knowledge of costs associated with sampling.
5. A model, or models, for estimation (when appropriate) that is
compatible with the chosen sampling design.
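Items 2, 4, and 5 above combine naturally in a stratified allocation rule of
the Neyman type, in which sampling effort increases with stratum size and
variability and decreases with the square root of sampling cost. A minimal
sketch, with entirely hypothetical strata, follows:

    # Stratified allocation by magnitude, variability, and cost (Neyman-type).
    from math import sqrt

    strata = {  # stratum: (size weight, std. deviation, cost per sample)
        "tributary A (large)":  (0.50, 40.0, 1.0),
        "tributary B (small)":  (0.30, 20.0, 1.0),
        "tributary C (remote)": (0.20, 30.0, 4.0),
    }
    total_samples = 60

    weights = {k: w * s / sqrt(c) for k, (w, s, c) in strata.items()}
    total_w = sum(weights.values())
    for name, w in weights.items():
        # rounding may leave the total one or two samples off; adjust manually
        print(f"{name:22s} n = {round(total_samples * w / total_w)}")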
6. Lake Trophic Quality Modeling
The prediction of the impact of watershed characteristics and activities
on water quality is a necessary task in successful lake water quality
management planning. Prediction implies the use of a conceptual, and most
likely, mathematical, model to express variable relationships and make
projections. To this end, many mathematical models have been developed and
proposed for lake trophic quality management. Initially, most of these models
were presented in a deterministic mode. However, as modelers acquired more
information on the functioning of a lake and watershed system, and as
engineers and planners inquired about the reliability of the models,
considerations of uncertainty began to appear. Modelers who examined
uncertainty in their models, and planners who demanded an estimate of the
uncertainty in the techniques that they used, realized that they must have a
measure of the reliability of their methods. Without this, there was no way
to assess the value of the information provided by a model. Under those
conditions, inefficient or incorrect decisions were more apt to be made
because the model results were given too much or too little weight.
Despite the fact that there are many water quality models in existence
and more being developed, this does not necessarily represent a significant
duplication of effort. Models are needed for a range of problems, and thus
they are developed to address a variety of issues at different levels of
mathematical complexity and for different degrees of spatial and temporal
resolution. Thus, for a model user, the choice of model to be applied will
depend upon:
1. the issue of concern,
2. the level of spatial and temporal aggregation appropriate to the
issue,
3. the familiarity of the user with a particular model, or the mathe-
matical sophistication of the user,
4. the cost and time required for acquisition of data necessary to run
the model, and
5. the cost of model acquisition and model runs.
In the field of lake trophic quality modeling, ecosystem models (Thomann
et al. , 1975; Scavia and Robertson, 1979) have been developed to address the
problem of eutrophication in a multi-dimensional manner, often with a fairly
high degree of spatial and temporal resolution. In order to make these models
more useful in the planning process, modelers have begun to quantify the error
terms for ecosystem models (Scavia, 1980). As this occurs, lake ecosystem
models will become more useful for the evaluation of lake management
strategies.
At the other end of the lake model complexity spectrum, black box
nutrient models have been proposed for the assessment of certain lake quality
issues where considerable spatial and temporal aggregation is permissible.
These models are attractive to many planners and engineers because they are
often more compatible with the position of the planner/engineer on the model
selection criteria mentioned above (particularly with regard to mathematical
background and financial support). Since it has been shown that uncertainty
analysis is relatively easily applied to the black box model, modeling with
error analysis is now being undertaken by a group of model users who might
otherwise work strictly with deterministic methods.
This is not to say that all lake model users addressing management
concerns should be applying black box models. On the contrary, the first and
second model selection criteria identified above clearly state that the chosen
model should be appropriate to the issue of concern. Certainly there are many
issues of importance in lake quality that are not addressed well with the
black box model. Yet, at the same time, there are issues, and potential model
users, who need simple, aggregated models, because of model selection criteria
3, 4, and 5. Some of these users may demand an estimate of the
model uncertainty. It is more likely, however, that many of these users may
not have thought a great deal about uncertainty. A procedure that allows
these individuals to calculate a numerical value for an estimate of prediction
uncertainty can be a powerful tool for convincing engineers, planners, and
decision makers of the value of uncertainty. Therefore the emphasis in this
section is on a discussion of black-box lake models and associated error
analyses.
Empirically-based input-output lake models for phosphorus were first
proposed in the early 1960s (see Reckhow, 1979a). However, management and
planning applications of these methods were most stimulated by Vollenweider's
thorough analysis (1968) in which he suggested nutrient loading criteria for
lakes as a function of mean depth. In the past twelve years, several
variations (Vollenweider, 1975, 1976; Dillon and Rigler, 1975; Chapra, 1975;
Larsen and Mercier, 1976; Jones and Bachmann, 1976; Reckhow, 1977, 1979b;
Walker, 1977; Rast and Lee, 1978) of this basic theme have been proposed.
These variations have the common features that: (1) they were developed from
a cross-sectional analysis of lake data on annual phosphorus loading,
phosphorus concentration, and selected hydrologic and geomorphologic
variables; (2) empirical "curve-fitting" (objective or subjective) techniques
were used on the cross-sectional data base to relate phosphorus concentration
(sometimes equated with trophic state) to the other variables; and (3) for a
single lake, the methods developed all describe a constant proportional
relationship (expressed in terms of the hydrologic/geomorphologic variable(s))
between annual phosphorus loading and "average" lake phosphorus concentration.
These methods are sometimes expressed graphically (e.g., phosphorus loading
criteria) and sometimes expressed in equation form. Essentially the same
information is conveyed in either case, so the choice among presentation modes
is largely dictated by the needs of a particular application.
Probably the major difference among the input-output models (and
graphical procedures) is the variation among the cross-sectional data bases
used to estimate the model parameter(s) (or to locate the trophic state
transition lines). As Reckhow (1979a) notes, some of the models were
empirically fitted on a homogeneous data base and are uncorroborated for use
on lakes with characteristics different from those of the model development
data set. This could result in prediction bias in the uncorroborated cases.
On the other hand, models developed from a homogeneous data base often have
smaller standard errors than do heterogeneously-based models. As a general
rule, a preferred model is one developed using a homogeneous data base from
the subpopulation of lakes containing the application lake(s). In that
situation, some exogenous variables, important in a heterogeneous data base,
are effectively "controlled for" by reducing lake type variability within the
model development data set.
In mathematical terms, the input-output phosphorus lake model may be
expressed in three basic forms:

    P = (L T / z) (1 - R)                                               (35)

    P = L / (qs + vs)                                                   (36)

    P = L / [z (a + 1/T)]                                               (37)

where, on an annual basis,

    P = lake phosphorus concentration (mg/l)
    L = annual areal phosphorus loading (g/m2-yr)
    z = lake mean depth (m)
    T = hydraulic detention time (yr)
    R = lake phosphorus retention coefficient (dimensionless)
    qs = areal water loading (m/yr)
    vs = apparent settling velocity (m/yr)
    a = sedimentation coefficient (yr^-1)
    L/qs = average influent phosphorus concentration (mg/l).

The model parameters are R, vs, and a, respectively. Traditionally, vs and a
have been estimated by constants, while R has been fitted as a function of qs
or T. Comparisons (Reckhow, 1979a, and Reckhow and Chapra, 1980) among the
fitted models have been made to indicate lake types for which the models are
in relative agreement or disagreement.
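The three forms are algebraically interchangeable when the parameters are
consistently defined (R = vs/(vs + qs) and a = vs/z), as the sketch below
verifies for a hypothetical lake:

    # The three model forms (Equations 35-37) with consistent parameters.
    L = 1.0        # areal P loading (g/m2-yr), hypothetical
    z = 10.0       # mean depth (m)
    T = 2.0        # hydraulic detention time (yr)
    qs = z / T     # areal water loading (m/yr)

    vs = 5.0                 # apparent settling velocity (m/yr), hypothetical
    R = vs / (vs + qs)       # implied retention coefficient
    a = vs / z               # implied sedimentation coefficient (1/yr)

    p35 = (L * T / z) * (1.0 - R)        # Equation 35
    p36 = L / (qs + vs)                  # Equation 36
    p37 = L / (z * (a + 1.0 / T))        # Equation 37
    print(f"{p35:.3f} {p36:.3f} {p37:.3f}")   # 0.100 0.100 0.100 mg/l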
An example of the graphical form of the input-output models is
Vollenweider's phosphorus loading criterion relating L and qs. A version of
the loading criterion is presented in Exhibit 14, with the solid lines
distinguishing oligotrophic, mesotrophic, and eutrophic states. The dashed
and dotted lines reflect the model estimation error associated with the
prediction of trophic state (set equivalent to the phosphorus trophic state
Exhibit 14. Vollenweider's phosphorus loading criterion, plotting L (g/m2-yr)
against qs (m/yr) on logarithmic scales, with model estimation error; the
eutrophic region lies above the criterion lines and the oligotrophic region
below (figure).
criterion in Exhibit 10b) from L and qs. This is only a portion of the
prediction uncertainty for most applications, and Reckhow (1979d) proposes a
graphical method for estimating the magnitude of the additional uncertainty. An
alternative methodology for complete model prediction uncertainty estimation
is presented below. First, a short discussion of uncertainty is in order,
however.
There is always uncertainty in the prediction of a model. Quantification
of this uncertainty can be a useful exercise, because the level of uncertainty
is inversely related to the value of the information contained in the
prediction. Uncertainty in modeling arises from three primary sources: the
input data for the model, the model parameters, and the model itself. One
approach that may be used to estimate prediction uncertainty is first order
error analysis (Cornell, 1972). Under this method, the error in a character-
istic (variable or parameter) is defined by its first nonzero moment (the
variance). Errors are propagated through the model using the first order
terms in the Taylor series, and the variances are then combined to yield the
total prediction uncertainty.
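A sketch of first order error analysis applied to the model form of Equation
36 follows; all numerical values are hypothetical, and the error terms are
assumed independent (covariance terms omitted):

    # First order error propagation for P = L/(qs + vs):
    #   var(P) ~= (dP/dL)^2 var(L) + (dP/dqs)^2 var(qs) + (dP/dvs)^2 var(vs)
    from math import sqrt

    L, sL   = 1.0, 0.25    # areal P loading (g/m2-yr) and its standard error
    qs, sqs = 5.0, 0.5     # areal water loading (m/yr) and its standard error
    vs, svs = 5.0, 1.0     # settling velocity (m/yr) and its standard error

    P = L / (qs + vs)
    dP_dL  = 1.0 / (qs + vs)             # first order Taylor terms
    dP_dqs = -L / (qs + vs) ** 2
    dP_dvs = -L / (qs + vs) ** 2

    var_P = (dP_dL * sL) ** 2 + (dP_dqs * sqs) ** 2 + (dP_dvs * svs) ** 2
    print(f"P = {P:.3f} mg/l, s_P = {sqrt(var_P):.3f} mg/l")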
An alternative approach to model prediction error analysis is Monte Carlo
simulation. Under this technique, probability density functions are assigned
to each characteristic (variable or parameter), reflecting the uncertainty in
that characteristic. Then, under the Monte Carlo procedure, values are
randomly selected from the distribution for each term. These values are
inserted into the model, and a prediction is calculated. After this is
repeated a large number of times, a distribution of predicted values results,
which reflects the combined uncertainties.
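The Monte Carlo counterpart to the sketch above replaces the Taylor-series
algebra with repeated sampling; normal distributions are assumed here purely
for illustration:

    # Monte Carlo propagation of uncertainty through P = L/(qs + vs).
    import random

    random.seed(1)
    predictions = []
    for _ in range(10000):
        L  = random.gauss(1.0, 0.25)   # areal P loading (g/m2-yr)
        qs = random.gauss(5.0, 0.5)    # areal water loading (m/yr)
        vs = random.gauss(5.0, 1.0)    # settling velocity (m/yr)
        predictions.append(L / (qs + vs))

    predictions.sort()
    mid = predictions[len(predictions) // 2]
    lo, hi = predictions[500], predictions[9499]   # ~90% interval
    print(f"median P = {mid:.3f} mg/l, 90% interval = ({lo:.3f}, {hi:.3f})")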
The quantification of uncertainty associated with the application of lake
models is a relatively recent development. Apparently the first work on this
topic was undertaken by Reckhow (1977) and by Walker (1977). In the past few
years, Reckhow (1979abcd), Chapra and Reckhow (1979), and Reckhow and Chapra
(1979) have expanded upon the use of first order error analysis with input-
output lake models. Much of this is summarized in Reckhow and Chapra (1980).
In addition, O'Hayre and Dowd (1978), Duckstein and Bogardi (1978), Reckhow et
al. (1980a), and Montgomery et al. (1980) have employed Monte Carlo simulation
to quantify lake modeling errors. Of these four, the first two have proposed
a Bayesian approach. Recently, Scavia (1980) described work on quantifying
lake ecosystem model prediction error using Monte Carlo simulation and the
Extended Kalman Filter (a dynamic counterpart for first order analysis). This
is among the first attempts at error analysis for a relatively complex lake
model.
Five years ago, Dillon and Rigler (1975) proposed a step-by-step
procedure for the estimation of lake phosphorus concentration using a simple
input-output model. When employed in prediction or lake quality management
planning, the methodology included steps for the selection of annual
phosphorus export coefficients (see Uttormark et al. , 1974) associated with
each land use. This procedure has proven to be quite popular as a relatively
comprehensive guide to the use of nutrient export coefficients and an input-
output lake model.
One important feature missing from the Dillon-Rigler methodology is a
step for the estimation of prediction uncertainty. Therefore, a procedure has
recently been proposed (Reckhow and Simpson, 1980) that includes a step
describing the estimation and combination of errors for the calculation of a
nonparametric prediction interval. This procedure employs a phosphorus lake
model of the form presented in Equation 36. Using nonlinear least squares,
the model parameter, vs, was estimated (Reckhow, 1979d) as:

    vs = 11.6 + 0.2 qs                                                  (38)

resulting in the empirical phosphorus lake model:

    P = L / (11.6 + 1.2 qs)                                             (39)
The Reckhow-Simpson procedure is described in step-by-step detail in the
original reference and elsewhere (Reckhow et al., 1980b; Reckhow and Chapra,
1980), so an overview of the phosphorus loading estimation methods is
presented below, followed by a detailed explanation of the error calculation
steps. For phosphorus loading determination, it is recommended that high,
most likely, and low export coefficients be selected for the phosphorus source
categories. This allows the calculation of high, most likely, and low total
loading estimates. The high and low loading estimates represent the
additional phosphorus loading error that must be added to the model error for
the calculation of total prediction uncertainty. It is important that the
high and low loadings primarily represent uncertainty due to (1) projection
uncertainty associated with anticipated land use and population changes during
the planning period, and (2) extrapolation uncertainty associated with the use
of phosphorus export data measured at another point in space and/or time.
This requirement exists because to a great extent, the error in the phosphorus
loading estimates is already contained in the model error. Additional loading
error for an application lake must be included only when the loading is
estimated (using the procedure herein) in a different (and less precise)
manner than it was estimated for the model development data set. The
references mentioned above offer additional guidance in the choice of
phosphorus export coefficients, and Exhibit 15 presents some typical values.
The selection of appropriate phosphorus export coefficients is a dif-
ficult task. It is largely contingent upon the analyst matching the
application lake watershed with candidate export coefficient watersheds
according to characteristics that determine phosphorus export from the land.
A close match should insure that the selected export coefficients are
reasonably representative of conditions in the application lake watershed.
As noted in Section 5, a poor choice of export values contributes, explicitly
or implicitly, to an increase in prediction error, so experience in the
application of this modeling approach is a valuable attribute. Following
selection of phosphorus export coefficients and calculation of the total
phosphorus loadings, the three total phosphorus loading estimates are then
separately inserted into the model
Exhibit 15. Phosphorus export coefficients (units are kg/10^6 m2-yr,
except septic tank as indicated; values are adopted
from Uttormark et al., 1974, and Reckhow et al., 1980).

                                                              Input to
                                                              Septic Tank
           Agriculture   Forest   Precipitation   Urban       (kg/capita-yr)

    High      300          45          60          500           1.8
    Mid     40-170       15-30       20-50       80-300          0.4-0.9*
    Low        10           2          15           50           0.3

    * The value selected will depend, in part, upon whether or not phosphate
      detergents are permitted.
(Equation 39), and "high," "most likely," and "low" (P(high), P(ml), and
P(low), respectively) lake phosphorus concentrations are calculated.
In order to estimate the uncertainty associated with a prediction
calculated using the phosphorus model, estimates are needed for the error, or
uncertainty, in all terms in the model, and in the model itself. However, it
has been shown by Reckhow (1979b) that for most applications of this model,
the error in the parameter v is small. Further, error in q is primarily a
function of flow measurement error and hydrologic variability, which also
affect L. Since L and q are in the numerator and denominator, respectively,
in the model, the errors affecting both tend to cancel when they are combined
to yield the resultant error in P. In addition, hydrologic variability is
unimportant in lakes with low flushing rates. Therefore, it is assumed here
that the prediction error is a function only of model error and of aspects of
phosphorus loading uncertainty that are identified in Reckhow and Simpson
(1980). If the application lake flushes rapidly and is subject to great
variations in year-to-year precipitation, then the modeler is urged to include
hydrologic variation in the error analysis using the error propagation
equation (Reckhow et al., 1980, outlined the appropriate procedure).
The model error is represented by sm log in the equations below and is
expressed in logarithmic units of phosphorus concentration error. The loading
error, sL, on the other hand, is expressed in untransformed units of
phosphorus loading error. Therefore, to combine these two values for an
estimate of total prediction uncertainty, some calculations are necessary.
The procedure presented below is based on first order error analysis
(Benjamin and Cornell, 1970). In this particular application, three
assumptions are of some importance:
1. Model error, expressed in log-transformed concentration units, is
appropriately combined with variable error terms after the
transformation is removed.
2. The "range" ("high" minus "low"), for phosphorus loading error, is
approximately two times the standard deviation. This is based
loosely on the characteristics of the Chebyshev inequality
identified below, where about 90% of the distribution is contained
within ±2 standard deviations of the mean.
3. The individual error components are adequately described by their
variances (standard deviations).
In order to relax a previously imposed (Reckhow, 1979b) yet tenuous normality
assumption, the confidence intervals constructed below are based on a
modification of the Chebyshev inequality (Benjamin and Cornell, 1970).
Therefore, it is no longer required that the total error term be normally
distributed. Instead its distribution must only be unimodal and have "high
order contact" with the abscissa in the distribution tails. These are
achievable assumptions under almost all conditions, and it is recommended that
this type of nonparametric approach be adopted until the distributions have
been adequately studied and characterized.
Step A: Calculation of log P(ml).

Take the logarithm of the most likely phosphorus concentration, P(ml).
Step B: Estimation of sm+ ("positive" model error)

The model error (sm log) was determined to be 0.128. Add sm log to log
P(ml) and take the antilog of this value. Now calculate the difference
between this antilog value and P(ml). Label this difference sm+; it
represents the "positive" model error.

    sm+ = antilog [log P(ml) + sm log] - P(ml)                          (40)
Step C: Estimation of sm- ("negative" model error)

Subtract sm log from log P(ml) and take the antilog of this value. Now
calculate the difference between this antilog and P(ml), and label this
difference sm-.

    sm- = P(ml) - antilog [log P(ml) - sm log]                          (41)
Step D: Estimation of sL+ ("positive" loading error)

Now, one must convert the loading error estimate into units compatible
with the model error. Use the P(high) concentration estimated earlier and
calculate the difference between P(high) and P(ml); then divide this dif-
ference by 2. Label this value sL+; it represents the "positive" loading
error contribution.

    sL+ = [P(high) - P(ml)] / 2                                         (42)
Step E: Estimation of s_L- ("negative" loading error)
Repeat Step D, substituting the low concentration value P(low) for
P(high). Label the resultant value s_L-; it represents the "negative" loading
error contribution.

s_L- = [P(ml) - P(low)] / 2                                             (43)
Step F: Estimation of s_T+ (total "positive" uncertainty)
Total positive prediction uncertainty is calculated using the equation:

s_T+ = √[(s_m+)² + (s_L+)²]                                             (44)
Step G: Estimation of s_T- (total "negative" uncertainty)
Total negative prediction uncertainty is calculated using the equation:

s_T- = √[(s_m-)² + (s_L-)²]                                             (45)
Step H: Calculation of confidence limits.
The prediction uncertainty may be expressed in terms of "confidence
limits" which represent the prediction plus or minus the prediction
uncertainty. Confidence limits have a definite meaning in classical
statistical inference; they define a region in which the true value will lie a
pre-specified percentage of the time.
Using the modification of the Chebyshev inequality (Benjamin and Cornell,
1970), the confidence limits may be written as:

Prob[(P(ml) - h·s_T-) < P < (P(ml) + h·s_T+)] ≥ 1 - 1/(2.25h²)          (46)
Equation 46 states that the probability that the true phosphorus concentration
lies within certain bounds, defined by a multiple, h, of the prediction error,
is greater than or equal to 1 - 1/(2.25h²). (This relationship loses its
significance as h drops much below one.) Substituting values for h into
Equation 46 reveals that a value of one for h corresponds to a probability of
about 55% (.556 to be exact), and a value of two for h corresponds to a
probability of about 90% (.889 to be exact). Thus the 55% confidence limits
are:

Prob[(P(ml) - s_T-) < P < (P(ml) + s_T+)] ≥ .55                         (47)
Once specific values for the prediction error have been inserted into the
confidence limits expression, its interpretation changes somewhat. It is:
"about 55% of the time (that confidence limits are estimated), one can expect
that the actual phosphorus concentration will lie within bounds defined by the
prediction plus or minus the prediction uncertainty." This same
interpretation format applies when the confidence limits are widened to the 90% level
(h=2), and specific data are inserted:
Prob[(P(ml) - 2s_T-) < P < (P(ml) + 2s_T+)] ≥ .90                       (48)
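To make the mechanics of Steps A through H concrete, the short program below
(written in Python) carries a hypothetical prediction through the complete
procedure. It is a sketch only: the input concentrations are illustrative
rather than drawn from any lake, and base-10 logarithms are assumed for the
logarithm and antilog operations.

import math

def prediction_limits(p_ml, p_high, p_low, s_mlog=0.128, h=2.0):
    # Step A: logarithm of the most likely concentration (base 10 assumed).
    log_p = math.log10(p_ml)
    # Steps B and C: model error converted out of log units (Eqs. 40, 41).
    s_m_pos = 10 ** (log_p + s_mlog) - p_ml
    s_m_neg = p_ml - 10 ** (log_p - s_mlog)
    # Steps D and E: loading error contributions (Eqs. 42, 43).
    s_l_pos = (p_high - p_ml) / 2.0
    s_l_neg = (p_ml - p_low) / 2.0
    # Steps F and G: total uncertainty as a root sum of squares (Eqs. 44, 45).
    s_t_pos = math.hypot(s_m_pos, s_l_pos)
    s_t_neg = math.hypot(s_m_neg, s_l_neg)
    # Step H: bounds holding with probability >= 1 - 1/(2.25 h**2) (Eq. 46).
    confidence = 1.0 - 1.0 / (2.25 * h * h)
    return p_ml - h * s_t_neg, p_ml + h * s_t_pos, confidence

# Illustrative values (mg/l); h = 2 reproduces the 90% limits of Equation 48.
low, high, conf = prediction_limits(p_ml=0.030, p_high=0.042, p_low=0.020)
print("P in (%.4f, %.4f) mg/l with probability >= %.3f" % (low, high, conf))

With h = 1, the same function returns the 55% limits of Equation 47.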
7. Concluding Comments
Mathematical models and statistical methods can be quite helpful for the
analysis of quantitative problems. When used incorrectly, however, these
techniques can yield misleading results that ironically have high credibility
due to their mathematical or statistical basis. Therefore, it is important
that the analyst understand the inherent assumptions, the limitations, and the
proper use of the methods presented herein. To underscore some of the issues
concerning the use of the models and statistics, some concluding thoughts are
offered below:
1. There are certain procedures to be followed in scientific studies,
and these procedures are collectively called the scientific method.
Analysts engaging in scientific endeavors should be cognizant of
proper definition of vague terms, specification of assumptions
inherent in their work, considerations of uncertainty and risk,
causality, and testing or corroboration of models.
2. The acquisition of data is frequently a problem of statistical
sampling design. Often the design choice reflects a trade-off
between the cost of analysis and the resultant uncertainty
associated with the acquired data. There are both concepts and
mathematical relationships that can be helpful in designing these
programs.
3. Data analysis should be undertaken with consideration of the "vague
concept" of interest. Graphical analysis of data is often helpful.
4. Since phosphorus loading/lake response modeling is probably a
principal concern to users of this document, several comments are
presented on this topic:
a. Reckhow (1979d) and Reckhow and Simpson (1980) identify the
major application limitations for the modeling/uncertainty
analysis procedure presented in the last section. In
fundamental terms, the limitations are generally associated
with the fact that the model development data set for any
particular model represents a subpopulation of lakes.
Application lakes that differ substantially from the model
development subpopulation may not be modeled well (i.e.,
results may be biased). Any limnologic characteristic that is
a causal determinant of lake phosphorus concentration is a
candidate as a limiting, or constraint, variable. These
include constraints on the model variables (e.g., all model
development data set lakes have P < .135 mg/1), constraints on
hydrology (e.g., there are no closed lakes in the model
development data set), or constraints on climate (e.g., the
model development data set contains only north temperate
lakes).
b. The methodology described in the previous section can be used
to quantify the relationship between watershed land use and
lake phosphorus concentration. Yet phosphorus by itself is not
an objectionable water quality characteristic. The real
quality variable of concern (i.e., the characteristic(s) that
lend(s) value or human benefit to the water body, abbreviated
"qvc") may be algal biomass, water clarity, dissolved oxygen
levels, or fish populations. Therefore the modeling methodology and
the error analysis do not include all of the
calculations necessary to link control variables (land use)
with the qvc. This means that the relevant prediction error
(on the qvc) is underestimated by the phosphorus model
prediction error, and planning and management risks are
inadequately specified. More useful methodologies are needed
that quantitatively link control variables with the qvc for a
particular application.¹
c. The error analysis procedure suggested by Reckhow and Simpson
should provide a reasonable estimate of prediction uncertainty.
However, there are still problems in interpretation and
application. For instance, the model error component was
estimated from a least squares analysis on a multi-lake
(cross-sectional) data set. This error is then applied to a
single lake in a longitudinal sense. Thus, much of the model
error term actually results from multi-lake variability,
whereas when the model is applied to a single lake, the model
error term should consist primarily of lack-of-fit bias and
single lake variability. On the basis of present knowledge, it
is not clear how a multi-lake-derived error relates to a single
lake analysis.
¹Certain complex models (see Scavia and Robertson, 1979) are comprehensive in
system coverage from control variables to qvc. However, these models
possess other shortcomings (large error terms and inadequate testing or
corroboration) that affect their utility in lake quality management
planning.
d. A second issue associated with the error analysis concerns the
subjective determinations of phosphorus loading and hence,
loading estimation error. Statisticians and modelers generally
prefer objective measures of uncertainty, such as calculated
variability in a set of data. However, both limited available
data and the obviously unmeasurable nature of future impacts
favor (or necessitate) subjective estimates. Given this
subjectivity, and the inexperience of most planners and
analysts with phosphorus loading estimation, there may be
uncertainty in the uncertainty estimates. This is exacerbated
by the potential for loading error "double counting" (see
Reckhow, 1979d), although the Reckhow/Simpson procedure is
designed to reduce error double counting. It is likely that as
analysts gain experience in loading and error estimation, this
problem will be of less importance.
e. At this time a comment on model selection is in order, given
the number of models developed in recent years. It is probably
presumptuous of a modeler to label his/her model as "best"
without stating some relevant qualifications or criteria. A
"best" model is generally best according to some error
criterion (like least squares) and for some subpopulation of
the cases modeled. The planner/analyst should select a model
that has been documented as best for conditions identical or
similar to those of concern. Reckhow and Chapra (1980) discuss
several characteristics and criteria that should be included in
the model developer's documentation of his/her water quality
management planning model. The prospective model user would be
wise to request and examine this documentation before selecting
the application-specific best model.
5. Ultimately the analyses conducted under the guidance of this
document will be used to aid lake quality management planning.
Therefore, given this planning objective, two final thoughts are
offered for the analyst to consider:
a. Water quality management planning and modeling incur a cost
that is presumably justified in terms of the value of the
information provided. The actual achievement of a water
quality level often requires management and pollutant abatement
costs but also carries with it various benefits. The analyst
must be cognizant of the fundamental economic nature of
environmental management, planning, and decision making. The
acquisition of additional data or the conduct of additional
modeling and planning studies should be justified in terms of
information return for improved decision making.
b. The planner or analyst conducting a lake data analysis or
modeling study has as his/her primary goal the effective
communication of the work carried out. This does not simply mean
documentation of the calculations and presentation of the
statistics or the prediction and prediction uncertainty.
Rather, effective communication requires consideration of the
knowledge and concerns of the likely audience. The analyst
must then describe his/her study so that the audience can
comprehend the results, can understand the study's limitations,
and can act (if necessary) in an informed manner. As a rule,
this means that the analyst should completely describe
procedural limitations and assumptions made in conducting the
study. Beyond that, the analyst should explain how the
limitations and assumptions affect the interpretation of the
results for planning. A comprehensive discussion of the
application of the statistical analysis or the modeling
methodology that meets the needs of the intended audience
facilitates good water quality management planning.
References
Ackoff, R. L. 1962. Scientific Method: Optimizing Applied Research
Decisions. J. Wiley & Sons, Inc., New York, 464 pp.
Benjamin, J. R., and C. A. Cornell. 1970. Probability, Statistics, and
Decision for Civil Engineers. McGraw-Hill, New York, 684 pp.
Blalock, H. M., Jr. 1972. Social Statistics. McGraw-Hill, New York, 583 pp.
Boneau, C. A. 1962. A Comparison of the Power of U and t Tests.
Psychological Review, 69(3):246-256.
Box, G. E. P., W. G. Hunter, and J. S. Hunter. 1978. Statistics for
Experimenters: An Introduction to Design, Data Analysis, and Model
Building. J. Wiley & Sons, Inc., New York, 653 pp.
Carlson, Robert E. 1977. A Trophic State Index for Lakes. Limnol. Oceanogr.
22(2):361-369.
Chapra, S. C. 1975. Comment on an Empirical Method of Estimating the
Retention of Phosphorus in Lakes by W. B. Kirchner and P. J. Dillon.
Water Resour. Res., 11(6):1033-1034.
Chapra, S. C., and K. H. Reckhow. 1979. Expressing the Phosphorus Loading
Concept in Probabilistic Terms. J. Fish. Res. Board Can., 36(2):225-229.
Chatterjee, S., and B. Price. 1977. Regression Analysis by Example. J. Wiley
& Sons, Inc., New York, 228 pp.
Cochran, W. G. 1963. Sampling Techniques. J. Wiley & Sons, Inc., New York.
Cornell, C. A. 1972. First-Order Analysis of Model and Parameter
Uncertainty. In: Proceedings of the International Symposium on
Uncertainties in Hydrologic and Water Resources Systems. University of
Arizona, Tucson, Arizona. Vol. III, 1245-1274.
Dillon, P. J., and F. H. Rigler. 1975. A Test of a Simple Nutrient Budget
Model Predicting the Phosphorus Concentration in Lake Water. J. Fish.
Res. Board Can., 31(11):1771-1778.
Dobson, H. F. H., M. Gilbertson, and P. G. Sly. 1974. A Summary and
Comparison of Nutrients and Related Water Quality in Lakes Erie, Ontario,
Huron, and Superior. J. Fish. Res. Bd. Can. 31:731-738.
Duckstein, L., and I. Bogardi. 1978. Uncertainties in Lake Management. In:
Proceedings of the International Symposium on Risk and Reliability in
Water Resources. University of Waterloo, Waterloo, Ontario. pp. 638-
661.
Edmondson, W. T. 1980. Secchi Disk and Chlorophyll. Limnol. Oceanogr.
25(2): 378-379.
Freese, F. 1962. Elementary Forest Sampling. U.S. Dept. of Agriculture,
Forest Service, Agriculture Handbook No. 232, 91 pp.
Hansen, M. H., W. N. Hurwitz, and W. G. Madow. 1953. Sample Survey Methods
and Theory. J. Wiley & Sons, Inc., New York, 638 pp.
Hollander, M., and D. A. Wolfe. 1973. Nonparametric Statistical Methods.
J. Wiley & Sons, Inc., New York, 503 pp.
Jessen, R. J. 1978. Statistical Survey Techniques. J. Wiley & Sons, Inc.,
New York, 520 pp.
Jones, J. R., and R. W. Bachmann. 1976. Prediction of Phosphorus and
Chlorophyll Levels in Lakes. J. Water Pollut. Control Fed., 48(9):2176-
2182.
Kleinbaum, D. G., and L. L. Kupper. 1978. Applied Regression Analysis and
Other Multivariable Methods. Duxbury Press, North Scituate, Mass., 556
pp.
Larsen, D. P. 1980. Personal communication.
Larsen, D. P., and H. T. Mercier. 1976. Phosphorus Retention Capacity of
Lakes. J. Fish. Res. Board Can., 33(8):1742-1750.
Lee, David R. 1977. A Device for Measuring Seepage Flux in Lakes and
Estuaries. Limnol. Oceanogr. 22(1):140-147.
Lorenzen, M. W. 1980. Use of Chlorophyll-Secchi Disc Relationships. Limnol.
Oceanogr. 25(2):371-372.
Marsalek, J. 1975. Sampling Techniques in Urban Runoff Quality Studies. In:
Water Quality Parameters, ASTM STP 573, American Society for Testing and
Materials, pp. 526-542.
McGill, R., J. W. Tukey, and W. A. Larsen. 1978. Variations of Box Plots.
Am. Stat. 32:12-16.
Megard, R. O., J. C. Settles, H. A. Boyer, and W. S. Combs, Jr. 1980. Light,
Secchi Disks, and Trophic States. Limnol. Oceanogr. 25(2):373-377.
Montgomery, R. H., V. D. Lee, and K. H. Reckhow. 1980. A Comparison of
Uncertainty Analysis Techniques: First Order Analysis vs. Monte Carlo
Simulation. Paper presented at the International Association for Great
Lakes Research Conference, Kingston, Ontario.
Mosteller, F., and R. E. K. Rourke. 1973. Sturdy Statistics: Nonparametrics
and Order Statistics. Addison-Wesley, Reading, Mass., 395 pp.
Mosteller, F., and J. W. Tukey. 1977. Data Analysis and Regression: A
Second Course in Statistics. Addison-Wesley, Reading, Mass., 588 pp.
National Academy of Sciences and National Academy of Engineering. 1972. A
Report of the Committee on Water Quality Criteria. Washington, D.C.
O'Hayre, A. P., and J. F. Dowd. 1978. Planning Methodology for Analysis and
Management of Lake Eutrophication. Water Resources Bulletin 14(1):72-82.
Omernik, J. M. 1977. Nonpoint Source - Stream Nutrient Level Relationships:
A Nationwide Study. U.S. Environmental Protection Agency, EPA-600/3-77-
105. Corvallis, Oregon.
Popper, K. R. 1968. The Logic of Scientific Discovery. Harper Torchbooks,
New York, 480 pp.
Porcella, D. B., S. A. Peterson, and D. P. Larsen. 1980. An Index to
Evaluate Lake Restoration. Am. Soc. Civil Eng. Jour. (In Press).
Rast, W., and G. F. Lee. 1978. Summary Analysis of the North American (U.S.
Portion) OECD Eutrophication Project: Nutrient Loading - Lake Response
Relationships and Trophic State Indices. U.S. Environmental Protection
Agency, EPA-600/3-79-008. Corvallis, Oregon.
Reckhow, K. H. 1977. Phosphorus Models for Lake Management. Ph.D.
dissertation. Harvard Univ., Cambridge, Mass. 304 pp.
Reckhow, K. H. 1979a. Empirical Lake Models for Phosphorus: Development,
Applications, Limitations, and Uncertainty. In: Perspectives on Lake
Ecosystem Modeling, pp. 193-221. Edited by D. Scavia and A. Robertson.
Ann Arbor Science Publishers, Ann Arbor, Mich.
Reckhow, K. H. 1979b. Quantitative Techniques for the Assessment of Lake
Quality. U.S. Environmental Protection Agency, EPA-440/5-79-015.
Washington, D.C. 146 pp.
Reckhow, K. H. 1979c. The Use of a Simple Model and Uncertainty Analysis in
Lake Management. Water Resour. Bull., 15(3):601-611.
Reckhow, K. H. 1979d. Uncertainty Analysis Applied to Vollenweider's
Phosphorus Loading Criterion. J. Water Pollut. Control Fed., 51(8):2123-
2128.
Reckhow, K. H. 1979e. Sampling Designs for Lake Phosphorus Budgets. In:
Proceedings of the Symposium on the Establishment of Water Quality
Monitoring Programs, pp. 285-306. American Water Resources Association,
Minneapolis, Minn.
Reckhow, K. H. 1980. Techniques for Exploring and Presenting Data Applied to
Lake Phosphorus Concentration. Can. J. Fish. Aq. Sci. 37(2):290-294.
Reckhow, K. H., and Harbert Rice. 1975. 208 Modeling Approach. Report
Prepared for the New Hampshire Lakes Region Planning Commission.
Reckhow, K. H., and S. C. Chapra. 1979. A Note on Error Analysis for a
Phosphorus Retention Model. Water Resour. Res. 15(6):1643-1646.
Reckhow, K. H., and S. C. Chapra. 1980. Engineering Approaches for Lake
Management: Data Analysis and Modeling. Ann Arbor Science, Ann Arbor,
Mich. (In Press).
Reckhow, K. H., and J. T. Simpson. 1980. A Procedure Using Modeling and
Error Analysis for the Prediction of Lake Phosphorus Concentration from
Land Use Information. Can. J. Fish. Aq. Sci. 37(9):1439-1448.
Reckhow, K. H., V. D. Lee, and S. C. Chapra. 1980a. An Examination of Lake
Model Prediction Uncertainty Using First Order Analysis and Monte Carlo
Simulation. Paper presented at the American Society of Limnology and
Oceanography Annual Meeting. Los Angeles, Calif.
Reckhow, K. H., M. N. Beaulac, and J. T. Simpson. 1980b. Modeling Phosphorus
Loading and Lake Response Under Uncertainty: A Manual and Compilation of
Export Coefficients. U.S. Environmental Protection Agency, EPA-440/5-
80-011. Washington, D.C. 214 pp.
Sakamoto, M. 1966. Primary Production by Phytoplankton Community in Some
Japanese Lakes and Its Dependence on Lake Depth. Arch. Hydrobiol.
62:1-28.
Scavia, D. 1980. Uncertainty Analysis for a Lake Eutrophication Model.
Ph.D. dissertation. University of Michigan, Ann Arbor, Mich.
Scavia, D., and A. Robertson, Eds. 1979. Perspectives on Lake Ecosystem
Modeling. Ann Arbor Science Publishers, Inc., Ann Arbor, Mich.
Shannon, Earl E., and Patrick L. Brezonik. 1972. Eutrophication Analysis: A
Multivariate Approach. Journ. San. Eng'g. Div. ASCE 98(1):37-57.
Simpson, J. T., and K. H. Reckhow. 1980. An Empirical Study of Factors
Affecting Blue-Green versus Nonblue-Green Algal Dominance in Lakes.
Office of Water Research and Technology, Dept. of Interior. Available
from NTIS (PB 80169311).
Snedecor, G. W., and W. G. Cochran. 1967. Statistical Methods. Iowa State
University Press, Ames, Iowa.
Snow, Phillip D., and Francis A. DiGiano. 1976. Mathematical Modeling of
Phosphorus Exchange between Sediments and Overlying Water in Shallow
Eutrophic Lakes. Report No. Env. E. 54-76-3. Dept. of Civil
Engineering, University of Massachusetts. Amherst, Mass.
Thomann, R. V., D. M. DiToro, R. P. Winfield, and D. J. O'Connor. 1975.
Mathematical Modeling of Phytoplankton in Lake Ontario, Part 1. Model
Development and Verification. U.S. Environmental Protection Agency,
EPA-660/3-75-005. Corvallis, Oregon.
Tukey, J. W. 1977. Exploratory Data Analysis. Addison-Wesley, Reading,
Mass. 688 pp.
U.S. Environmental Protection Agency. 1974. The Relationships of Phosphorus
and Nitrogen to the Trophic State of Northeast and North-Central Lakes
and Reservoirs. National Eutrophication Survey Working Paper No. 23,
USEPA, Corvallis, Oregon.
Uttormark, P. D., J. D. Chapin, and K. M. Green. 1974. Estimating Nutrient
Loading of Lakes from Nonpoint Sources. U.S. Environmental Protection
Agency, EPA-660/3-74-020. Corvallis, Oregon.
Vollenweider, R. A. 1968. The Scientific Basis of Lake and Stream
Eutrophication, with Particular Reference to Phosphorus and Nitrogen as
Eutrophication Factors, OECD (Organ. Econ. Coop. Dev.) Paris Tech. Rep.
DAS/DSI/68.
Vollenweider, R. A. 1975. Input-Output Models With Special Reference to the
Phosphorus Loading Concept in Limnology. Schweiz. Z. Hydrol., 37:53-84.
Vollenweider, R. A. 1976. Advances in Defining Critical Loading Levels for
Phosphorus in Lake Eutrophication. Mem. Ist. Ital. Idrobiol., 33:53-83.
Walker, W. W. , Jr. 1977. Some Analytical Methods Applied to Lake Water
Quality Problems. Ph.D. dissertation. Harvard University, Cambridge,
Mass. 528 pp.
Walker, W. W., Jr. 1979. Use of Hypolimnetic Oxygen Depletion Rate as a
Trophic State Index for Lakes. Water Resour. Res. 15(6):1463-1470.
Williams, B. 1978. A Sampler on Sampling. J. Wiley & Sons, Inc., New York,
254 pp.
Wonnacott, T. H., and R. J. Wonnacott. 1972. Introductory Statistics. J.
Wiley & Sons, Inc., New York, 510 pp.