-------
Monitoring and Evaluating Nonpoint Source Watershed Projects Chapter 4
Results
Data collected from the pre-treatment period (1993 to 1997) were compared to those from the
treatment and control basins to determine the effectiveness of streambank fencing. Changes in
the yields of nutrients and suspended sediment during low flow and storm flow events were
quantified using ANCOVA. ANCOVA was also used to quantify changes between pre- and post-
treatment concentrations of nutrients, water quality, and fecal streptococcus in collected water
samples, as well as nested wells inside and outside of the treatment area. Canonical
correspondence analysis (CCA) was used to determine the effects of streambank fencing on
instream biological conditions as characterized by benthic macroinvertebrates. A brief overview
of major findings from analysis of water chemistry data is presented here, followed by a more
detailed summary of results from macroinvertebrate monitoring.
The study authors concluded from the water chemistry results that riparian fencing had fairly consistent
effects on suspended sediment but less clear effects on nutrients. Post-treatment
improvements were evident at site T-1 for both nutrients and sediment; however, site T-2
showed reductions only in suspended sediment. The average reduction in suspended sediment
yield for the treated sites was about 40 percent. N species at T-1 showed reductions of 18 percent
(dissolved NO3) to 36 percent (dissolved ammonia); yields of TP were reduced by 14 percent.
Conversely, site T-2 showed increases in N species of 10 percent (dissolved ammonia) to 43
percent (total ammonia plus organic N), and a 51-percent increase in yield of TP. The different
results for nutrients at T-2 and T-1 were attributed to ground water contributions and the failure
to implement nutrient management along with the fencing. Shallow ground water flow
contributed to stream flow at T-2, but the stream was losing water to the shallow ground water
system at T-1. It is believed that an upland agricultural field caused increased dissolved P levels in
shallow ground water at T-2, resulting in a transport of P from ground water to the stream that
increased stream P levels. In addition, cattle contributed nutrients directly to the stream via
excretion at the embedded stream crossing at T-2.
Analysis of the benthic macroinvertebrate samples showed some apparent improvements relative
to the control sites in riparian and instream habitat (sites T2-3 and T-1 versus C-1). Some
differences in bottom substrate, bank stability, available cover, and scouring and deposition were
observed in the downstream and upstream locations within the treatment basin that could
potentially be considered slight improvements. Water quality data collected during the benthic
macroinvertebrate sampling suggested the overall improvement to instream habitat was due to
the decreased load of suspended sediment (Figure CS3-2). The fenced riparian buffer, despite
being narrower than what was considered optimal, allowed vegetation to become fully
established and bank stability to improve. This was particularly evident at site T2-3, where the buffer
became so overgrown with vegetation that it blocked the stream from view.
The composite benthic index, which combines all metrics, is called the "Macroinvertebrate
Aggregated Index" (MAI). For both spring and fall samples, the index showed some improvement
for the treatment sites relative to control sites, though trends were mixed overall. For the
treatment basin, sites T1-3 and T2-3 showed no change with spring samples, while the outlet site,
T-1, showed a slight 1-unit increase. From the pre-treatment to the post-treatment period, fall
index scores changed in the control basin by 1 unit, increasing at C-1 and decreasing at C1-2. Fall
scores for the treatment basin also changed over this timeframe, increasing by 2 units for T-1 and
T1-3, but decreasing by 1 unit at T2-3.
Disaggregating the index into individual metrics allows evaluation of different components of the
benthic macroinvertebrate assemblage. In this dataset, there are different responses by different
metrics. No difference was seen for five of the 10 genus-level metrics, which included percent
dominant taxa (generic level) (PDTG), EPT taxa, percent EPT taxa, percent shredders, and ratio of
scrapers to filterers. Thus, treatment elicited no effect for 50 percent of the metrics. Two of the
metrics (EPT/Chironomidae ratio, Hilsenhoff Biotic Index [genus level]; 20 percent) suggested
some, or slight, effect of the streambank fencing on treatment sites relative to control; and
distinct effects were seen for the remaining three metrics: percent Chironomidae (Figure CS3-3),
taxa richness, and percent Oligochaeta.
Further evaluation of taxa lists for dominance, occurrence, and uniqueness of and by individual
taxa can help illuminate differences, in particular, for those taxa that are known to be more
pollution-tolerant or sensitive. Spring samples were numerically dominated by worms (Naididae,
Tubificidae), scud (Amphipoda: Gammaridae), several different midges (Diptera: Chironomidae:
i.e., Cricotopus, Orthocladius, Dicrotendipes, Micropsectra), and blackflies (Diptera: Simuliidae:
Simulium). The fall samples illuminated a shift in the actual taxonomic composition of the site to a
greater diversity (i.e., a larger number of taxa) and dominance by largely different taxa, including
riffle beetles (Coleoptera: Elmidae: Dubiraphia, Stenelmis), net-spinning caddisflies (Trichoptera:
Hydropsychidae: Hydropsyche, Cheumatopsyche), midges (Chironomus, Dicrotendipes,
Polypedilum, Rheotanytarsus), and blackflies. The only taxa that retained any kind of dominance
for this site across seasons were Dicrotendipes and blackflies. The overall differences are driven by
seasonality of the system, simply showing a greater diversity during the fall season and not by any
changes in stressor load. Although the two outlet sites (C-1, T-1) basically had the same taxa
dominating sample data, each had a greater diversity than the upstream sites. Elevated
taxonomic/biological diversity can be an indicator of greater diversity and complexity of habitat
characteristics; lacking other types of stressor loads, this diversity is likely what is reflected at the
outlet sites. The upstream sites in the treatment basin (T2-3, T1-3) consistently showed more
diversity than C1-2, but the overall PDTG means for T1-3 and T2-3 increased slightly (<1 percent)
and by 13 percent, respectively, from the pre- to post-treatment period. Because differences in
assemblage makeup are largely explainable by factors other than what might be introduced by
streambank fencing, such as seasonality, and expected physical habitat characteristics, the
authors concluded that the treatment did not seem to improve benthic-macroinvertebrate
community structure based on PDTG.
Across the full dataset, the dominant family-level taxa in spring samples were Chironomidae,
Gammaridae, Naididae, and Tubificidae, all recognized as being semi-tolerant to organic
enrichment. The dominant families in fall samples were Gammaridae, Tubificidae, Elmidae,
Physidae, Baetidae, Chironomidae, and Simuliidae, all considered as being moderately to very
tolerant of organic enrichment. This indicates that the more sensitive taxa were not able to
become dominant members of the benthic-macroinvertebrate assemblage after the fences were
installed in the treatment basin. Sensitive taxa may have been absent, or present as only a few
individuals, during the post-treatment period because 1) there was not enough time for the
system to equilibrate to the new conditions, or 2) these are spring-fed, first- to second-order
limestone streams. Limestone streams typically support assemblages including mayflies
(Ephemeroptera), midges (Chironomidae), scud (Amphipoda: Gammaridae), and pillbugs/sowbugs
(Isopoda: Asellidae), all of which were present in treatment and control basins. Some of the more
stressor-sensitive taxa that were present in small numbers included Promoresia (Elmidae),
Oxyethira (Hydroptilidae), Antocha (Tipulidae), and two species of Chironomidae (Pagastia and
Prodiamesa).
[Figure CS3-3. Box plots of percent Chironomidae by site (C-1, T-1, C1-2, T1-3, T2-3), with panels for the May and Sept. sampling events]
Figure CS3-3. Distribution of the percentage of Chironomidae to total number
of individuals for May and September sampling events at benthic-
macroinvertebrate sites for the pre- and post-treatment periods in the Big
Spring Run Basin, Lancaster County, PA. Light shaded boxes are pretreatment,
dark shaded are post-treatment, C represents Control, and T represents
Treatment.
Overall, the authors conclude that streambank fencing had a positive influence on the taxonomic
diversity of benthic-macroinvertebrates, both at genus and family levels. This positive influence is
interpreted as primarily resulting from stabilization of the riparian zone, allowing growth of
streamside vegetation to progress, and ultimately allowing better habitat to develop and support
more taxa.
Previous studies suggest that for optimal reduction of nutrient loads into nearby aquatic systems,
the buffer size should be greater than the 1.5- to 3.6-m buffer used in this study. Because there
was such a wide range of what would be considered an adequate buffer size, it was uncertain
which nutrient types, if any, could be controlled or reduced with this approach. Study results
show that while the fenced streambank buffer was relatively small, it still was substantially
effective in improving surface and near-stream shallow ground water quality, and led to some
improvement in instream biology. Small-scale stream buffers and exclosure fencing both have
limited effectiveness in controlling high nutrient input ultimately transported through subsurface
flows; the most pronounced effects of exclosures are in reducing suspended sediment inputs and
consequently leading to improved habitat. There may be some effectiveness in controlling
excessive nutrient flows, but the benefits are likely minor in comparison to the habitat effects.
Literature
Galeone, D.G., R.A. Brightbill, D.J. Low, and D.L. O'Brien. 2006. Effects of Streambank Fencing of
Pastureland on Benthic Macroinvertebrates and the Quality of Surface Water and Shallow
Ground Water in the Big Spring Run Basin of Mill Creek Watershed, Lancaster County,
Pennsylvania, 1993-2001. Scientific Investigations Report 2006-5141. U.S. Geological
Survey, Reston, VA.
Galeone, D.G. and E.H. Koerkle. 1996. Study Design and Preliminary Data Analysis for a
Streambank Fencing Project in the Mill Creek Basin, Pennsylvania. Fact Sheet FS-193-96.
U.S. Geological Survey, Lemoyne, PA.
Table 4-3. Waterbody stratification hierarchy

Population: Streams
  Operational Sampling Unit (SU): Channel segment (i.e., a length of river channel into which no tributaries flow)
  Strata (or higher stages) comprising SUs: Ecoregion; Watershed; Stream/river channel (ordinal/areal); Segment; Characteristic water quality (natural conditions); Hydraulic conductivity (homogeneous, heterogeneous, isotropic, anisotropic)
  Habitat within SUs: Macrohabitat (pool/riffle; shorezone vegetation; submerged aquatic macrophytes); Microhabitat

Population: Lakes
  Operational Sampling Unit (SU): Self-contained basin
  Strata (or higher stages) comprising SUs: Ecoregion; Size/surface area of lake (km2); Lake hydrology (retention time, thermal stratification); Characteristic water quality (natural conditions)
  Habitat within SUs: Depth zone (eulittoral to profundal); Substrate/microhabitat

Population: Reservoirs
  Operational Sampling Unit (SU): Self-contained basin (hydrologically isolated from other basins)
  Strata (or higher stages) comprising SUs: Ecoregion; Size/surface area of reservoir (km2) (watershed area/basin surface area); Hydrology (water level fluctuation/drawdown; retention time; stratification); Characteristic water quality (natural conditions)
  Habitat within SUs: Longitudinal zone (riverine; transitional; lacustrine; tail waters [can be more riverine but always associated with dams]); Depth; Substrate/microhabitat

Population: Estuaries
  Operational Sampling Unit (SU): Self-contained basin
  Strata (or higher stages) comprising SUs: Biogeographic province; Watershed; Watershed area (km2)
  Habitat within SUs: Zones (tidal basin, depth, salinity); Substrate/habitat

Population: Wetlands
  Operational Sampling Unit (SU): Transect upland or deep water boundaries
  Strata (or higher stages) comprising SUs: Wetland system type (marine, estuarine, riverine, lacustrine, palustrine); Watershed recharge, discharge or both; Class (based on vegetative type; substrate and flooding regime; hydroperiod); Flooding regime; water chemistry; soil type
  Habitat within SUs: Subsystem (subtidal, intertidal, tidal, lower perennial, upper perennial, intermittent, littoral, limnetic); Substrate/microhabitat

a Frissell et al. 1986; b Gerritsen et al. 1996; c Wetzel 1983; d Thornton et al. 1990; e Day et al. 1989; f Cowardin et al. 1979
4.4 Biological Assessment Protocols
Biological indicators are widely recognized as being critical for evaluating ecological conditions,
identifying and prioritizing problems, designing controls or other solutions, and evaluating the effectiveness of
management efforts. However, prior to being able to make management decisions using these indicators,
two things need to occur. First, the indicator must be calibrated; and, second, sampling and analysis must
be instituted in a routine and consistent manner to directly address management objectives. To obtain
valid assessment results, and to optimize defensibility of decisions based on them, it is imperative that
monitoring be founded on data of known quality (Flotemersch et al. 2006b, Stribling 2011). In this
section, our discussion focuses on wadeable streams, and we assume that the user has access to indicators
that have been calibrated and are applicable to their region and water body or bodies of concern. Many states
have developed MMIs using one or more biological groups, most typically benthic
macroinvertebrates and, less frequently, fish and periphyton (algae and diatoms). Carter and Resh
(2013) surveyed state agencies about different characteristics of their biological monitoring programs,
and, although there are differences in some of the specific techniques, there has also been considerable
convergence among methods during the past 10 to 15 years. Monitoring programs that have gone through
the index calibration process have worked or are working through technical issues associated with
customizing sampling techniques to water body type, prevailing climatic conditions, and programmatic
capacity; using field data to characterize environmental/ecological variability; defining mathematical
terms of the indicator (metrics and index make-up); and defining thresholds for judging degradation.
Part of customizing field sampling and laboratory analysis methods for a program involves understanding
the range of variability of field conditions and the data/assessments that arise from them. One approach
for developing such an understanding is to recognize that biological monitoring and assessment protocols
are made up of a series of methods (Flotemersch et al. 2006b, Stribling 2011) generally corresponding to
different steps of the overall process: field sampling, sample preparation, taxonomic identification,
enumeration, data entry, data reduction, and site assessment/interpretation. In this section, we present
descriptions of the background, purpose, application, and output of the methods along with relevant
procedures for documenting data quality associated with each.
4.4.1 Field Sampling
In the context of the field sampling approach being used for biological assessment, taking or observing
organisms from a defined sample location is intended to provide a representation of the biological
assemblage supported by that water body, whether benthic macroinvertebrates, fish, or periphyton. These
three assemblages are emphasized because they are most commonly used in routine monitoring and
assessment programs in the US, and methods for them are relatively well-documented (Barbour et al.
1999, Moulton et al. 2002, Stribling 2011, Carter and Resh 2013).
4.4.1.1 Benthic macroinvertebrates
Benthic macroinvertebrate samples are taken from multiple habitat types, and composited in a single
sample container (Figure 4-3). If sampled from transects as per the USEPA national surveys (USEPA
2009), they are collected along 11 transects evenly distributed throughout the reach length, using a
D-frame net with 500-µm mesh openings (Klemm et al. 1998, Flotemersch et al. 2006b). An alternative to
transects is to estimate the proportion of different habitat types in a defined reach (e.g., 100 m), and
distribute a fixed level of sampling effort proportional to their frequency of occurrence throughout the
reach (Barbour et al. 1999, 2006). Whether using transects or proportional distribution, organic and
inorganic sample material (leaf litter, small woody twigs, silt, and sand; also includes all invertebrate
specimens) are composited in one or more containers, preserved with 95% denatured ethanol, and
delivered to laboratories for processing (Figure 4-4). A composite sample over multiple habitats in a reach
is a common protocol feature of many monitoring programs throughout the US (Carter and Resh 2013),
although some programs choose to keep samples and data from different habitat types segregated.
Figure 4-3. Removing a benthic macroinvertebrate sample
from a sieve bucket and placing the sample material in a 1-liter
container with approximately 95% ethanol preservative
Figure 4-4. Labelling benthic macroinvertebrate sample containers and recording field data
4.4.1.2 Fish
Fish sampling is designed to provide a sample that is representative of the fish community inhabiting the
reach, and which is assumed to reasonably represent species richness, guilds, relative abundance, size, and
anomalies. The goal is to collect fish community data that will allow the calculation of an IBI and
observed/expected (O/E) models. Electrofishing is the preferred method of sampling, involving the
operator and (ideally) two netters, and occurs in a downstream direction at all habitats along alternating
banks, over a length of 20 times the mean channel width at designated transects (USEPA 2009).
The target is a minimum of 500 fish (USEPA 2009); in the event
this is not attained, sampling will continue until 500 individuals are captured or the downstream extent of
the site is reached.
4.4.1.3 Periphyton
Periphyton collections are made from shallow areas near each of the sampling locations on the 11 cross-
section transects established within the sampling reach and are collected at the same time as the benthic
macroinvertebrate samples (USEPA 2009). There is one composite sample of periphyton for each site,
from which separate types of laboratory samples can be prepared, if necessary. The different sample types
could include a) an ID/enumeration sample to determine taxonomic composition and relative abundances,
b) a chlorophyll sample, c) a biomass sample (for ash-free dry mass [AFDM]), or d) an acid/alkaline
phosphatase activity (APA) sample. There are potentially other analysis types that could be performed,
thus requiring additional sample segregates.
4.4.1.4 Quality control measures
Other than a qualitative judgment that field personnel are adequately trained, have sufficient experience,
and have been successfully audited as having completely and accurately applied the correct SOP, the
quality of field sampling cannot be determined without sample processing. The consistency of field
sampling is a measure of data quality quantified by precision calculations using indicator values
(individual metrics, IBI scores, or predictive models) collected from adjacent stream reaches (i.e., two
stream channel lengths where the second [B] begins at the endpoint of the first [A]). As a rule-of-thumb,
we recommend a site duplication rate of 10 percent, where duplicate locations are randomly selected from
the full sample lot, and fieldwork occurs as routine. Terms calculated from the duplicate sample results
include median relative percent difference (mRPD), 90 percent confidence intervals/minimum detectable
difference (CI90/DD90), and coefficient of variation (CV) (Flotemersch et al. 2006b, Stribling et al.
2008a, Stribling 2011). Depending on programmatic application, natural variability of the landscape the
watershed is draining, density and distribution of potential stressor sources, number of field crews, and, of
course, budgetary resources, it can be useful to stratify the distribution of duplicate reaches. This allows
programmatic measurement quality objectives (MQO) to be established as objective benchmarks for
acceptable data quality. Typical MQO for field sampling precision (Stribling et al. 2008a, Stribling
2011) might be:
• mRPD < 15,
• CI90 < 15 index points on a 100-point scale, and
• CV < 10% for a sampling event
Depending on programmatic needs, values exceeding these MQO could highlight samples for more
detailed scrutiny to determine causes for the exceedances, and the need for corrective actions.
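The precision terms above can be sketched in a few lines of Python. This is a minimal illustration, not the cited authors' implementation: the text does not give the formulas, so the sketch assumes the common conventions that RPD is the absolute difference of a duplicate pair divided by the pair mean, that a root mean square error (RMSE) is pooled from the within-pair variances, that CI90 is 1.645 times the RMSE, and that CV is the RMSE as a percentage of the grand mean. The function names and the example index scores are hypothetical.

```python
import math
import statistics

# Example MQO limits taken from the bullets above.
MQO = {"mRPD": 15, "CI90": 15, "CV": 10}


def relative_percent_difference(a, b):
    """RPD of duplicate values A and B, as a percentage of their mean."""
    mean = (a + b) / 2
    return abs(a - b) / mean * 100 if mean else 0.0


def precision_summary(pairs):
    """Summarize precision for (A, B) index scores from adjacent duplicate reaches."""
    rpds = [relative_percent_difference(a, b) for a, b in pairs]
    # The variance of two values is (a - b)^2 / 2; RMSE pools it over all pairs.
    rmse = math.sqrt(sum((a - b) ** 2 / 2 for a, b in pairs) / len(pairs))
    grand_mean = statistics.mean(v for pair in pairs for v in pair)
    return {
        "mRPD": statistics.median(rpds),
        "CI90": 1.645 * rmse,  # assumption: z-value for a two-sided 90% interval
        "CV": rmse / grand_mean * 100,
    }


def mqo_exceedances(summary, mqo=MQO):
    """Return the names of the terms that fail to stay below their MQO."""
    return [name for name, limit in mqo.items() if summary[name] >= limit]


if __name__ == "__main__":
    duplicate_pairs = [(60, 66), (40, 40), (80, 72)]  # hypothetical index scores
    summary = precision_summary(duplicate_pairs)
    print(summary, mqo_exceedances(summary))
```

Any term returned by `mqo_exceedances` would flag the sampling event for the closer scrutiny described above.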
4.4.2 Sample processing/laboratory analysis
For biological monitoring and assessment programs, sample processing employs procedures for
organizing sample contents so that analysis is possible. For benthic macroinvertebrate and periphyton
samples, those procedures are laboratory-based; however, for fish, they are performed primarily in the
field (USEPA 2004, 2009).
4.4.2.1 Benthic macroinvertebrates
The three aspects of sample processing for benthic macroinvertebrate samples are a) sorting, which serves
to separate the organisms from other sample material, specifically, organic detritus, inorganic silt, and
other materials (Figure 4-5), b) subsampling, which isolates a representative sample fraction from the
whole, and c) taxonomic identification, which characterizes the (sub)sample by naming and counting
individuals in it.
Figure 4-5. Examining, washing, and removing large components of sample material prior
to putting in sample container
4.4.2.1.1 Sorting and subsampling
The sorting/subsampling procedure is based on randomly selecting portions of the sample material spread
over a gridded Caton screen (Caton 1991, Barbour et al. 1999, Flotemersch et al. 2006b, Stribling 2011),
and fully removing (picking) all organisms from the selected fractions. The screen is divided into 30 grid
squares, each individual grid square measuring 6 cm x 6 cm, or 36 cm2 (note that it is not 6 cm2 as
indicated in Figure 6-4b of Flotemersch et al. [2006b]). Prior to beginning the sorting/subsampling
process, it is important that the sample is mixed thoroughly and distributed evenly across the screen to
reduce the effect of organism clumping that may have occurred in the sample container. Depending on the
density of organisms in the sample, multiple levels of sorting may be necessary, the purpose of which is
to minimize the likelihood that the entire sample to be identified comes from a very small number of
grids. Initially, four grids are randomly selected from the 6x5 array, removed from the screen, placed in
a sorting tray, and coarsely examined. If the density of organisms is high enough that there are many more
than the target number in the four selected grids (i.e., exceeding the project's target count of 100, 200,
300, 500, or more organisms by twofold or more), that material is re-spread on a second
gridded screen and the process repeated (second level sort). This is repeated until it is apparent that the
density of specimens will require at least four grids to be sorted to attain the target number (±20%). Once
re-spreading is no longer needed, all organisms are removed from the four grids using forceps. If the final
rough count is within ±20% of the target subsample size, then subsampling is complete; if it is >20% less than the
target subsample size, then additional single grids of material are removed and picked in their
entirety. This is repeated, one grid at a time, until within 20 percent of the target number. Following
picking, the sort residue should be transferred to a separate container labeled with complete sample
information, and the words "SORT RESIDUE" clearly visible. Completely record the number of sort
levels and grids processed. The sorting and subsampling process should result in at least three containers:
a) clean (sub)sample, b) sort residue, and c) unsorted sample remains. Container 'a' is provided to the
taxonomist for identification and counting, 'b' is available for QC sort re-check, and 'c' is archived until
all QC checks are complete. In the event of certain QC failures, it may be necessary to process portions of
the unsorted remains.
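The sorting/subsampling logic described above can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not a laboratory procedure: organisms are assumed to be spread uniformly at random across the 30 grid squares, "greatly exceeding" the target is taken to mean a count of at least twice the target, and the function and variable names are hypothetical.

```python
import random

GRIDS = 30           # 6 x 5 Caton screen
INITIAL_GRIDS = 4    # grids picked before any single-grid additions
TOLERANCE = 0.20     # subsample is accepted within 20% of the target


def spread(n_organisms, rng):
    """Distribute organisms at random over the grid squares of one screen."""
    counts = [0] * GRIDS
    for _ in range(n_organisms):
        counts[rng.randrange(GRIDS)] += 1
    return counts


def subsample(n_organisms, target, rng):
    """Return (organisms picked, grids picked at the final level, sort levels)."""
    levels = 1
    while True:
        counts = spread(n_organisms, rng)
        order = rng.sample(range(GRIDS), GRIDS)  # random order of grid selection
        picked = sum(counts[g] for g in order[:INITIAL_GRIDS])
        if picked < 2 * target:
            break
        # Too dense: re-spread the four grids' material on another screen.
        n_organisms = picked
        levels += 1
    grids_used = INITIAL_GRIDS
    # Add single grids until the count is no more than 20% below the target.
    while picked < (1 - TOLERANCE) * target and grids_used < GRIDS:
        picked += counts[order[grids_used]]
        grids_used += 1
    return picked, grids_used, levels


if __name__ == "__main__":
    rng = random.Random(1)
    print(subsample(5000, 200, rng))  # dense sample: expect a second-level sort
    print(subsample(100, 200, rng))   # sparse sample: the whole screen is picked
```

The returned number of sort levels and grids corresponds to the information the text asks to be recorded for each sample.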
Fixed count subsamples - Fixed organism counts vary among monitoring programs (Carter and Resh
2013), with 100, 200, 300 and 500 counts being most often used (Barbour et al. 1999, Cao and Hawkins
2005, Flotemersch et al. 2006a). Flotemersch et al. (2006a) concluded that a 500-organism count was
most appropriate for large/nonwadeable river systems, based on examination of the relative increase in
richness metric values (< 2%) between sequential 100-organism counts. However, they also suggested
that a 300-organism count is sufficient for most study needs. Others have recommended higher fixed
counts, including a minimum of 600 for wadeable streams (Cao and Hawkins 2005). The subsample
count used for the USEPA national surveys is 500 organisms (USEPA 2004); many states use 200 or
300 counts.
4.4.2.1.2 Taxonomic identification
Genus level taxonomy is the principal hierarchical level used by most routine biological monitoring
programs for benthic macroinvertebrates (Carter and Resh 2013), although occasionally family level
taxonomy is used. For genus level to be attained, most direct observations can be accomplished with
dissecting stereomicroscopes with magnification ranges of 7-112x; however, midges (Chironomidae) and
worms (Oligochaeta) need to be slide-mounted and viewed through compound microscopes that have
magnification ranging 40-1500x, under oil. Slide-mounting specimens in these two groups is usually
(though not always) necessary to attain genus-level identification and, for midges, sometimes even
coarser (i.e., less specific) levels. Taxonomic classification is a major potential source of error in any
biological monitoring data set (Stribling et al. 2008b, Bortolus 2008), and the rates of error can be
managed by specifying both hierarchical targets and counting rules. Hierarchical targets define the level
of effort that should be applied to each specimen, but the target may not be attainable for some specimens because of
poor slide mounts, damage, or immaturity (early instars). Further, the requirement for some taxa
may be coarser, such as genus-group, tribe, subfamily, or even family. In any case, the principal
responsibility of the taxonomist is to record and report the taxa in the sample and the number of
individuals of each taxon. Consistency in the nomenclature used is more important than the actual keys
that are used, although, some programmatic SOPs may specify the technical literature. For example, the
identification manual "An Introduction to the Aquatic Insects of North America" (Merritt et al. 2008) is
useful for identifying the majority of aquatic insects in North America to genus level. However, because
many taxonomic groups are often (correctly) under perpetual revision and updates, the nomenclatural
foundation of many may have changed, thus requiring familiarity of the taxonomist with more current
primary taxonomic literature. Merritt et al. (2008) is not applicable to non-insect macroinvertebrate taxa
that are often captured in routine sampling, including Oligochaeta, Mollusca, Acari, Crustacea,
Platyhelminthes, and others; exhaustive lists of literature for all invertebrate groups are provided by
Klemm et al. (1990) and Thorp and Covich (2010). Identification staff may also need information on
accepted nomenclature, including validity, authorship, and spelling, all of which can be found in the
Integrated Taxonomic Information System (ITIS; http://www.itis.gov/). Although ITIS is a nomenclatural
clearinghouse, it should be recognized that it is not completely current for all taxa, so some names may
require independent confirmation.
It should be noted that some volunteer monitoring programs, such as the Izaak Walton League (IWL) and
the Maryland DNR Stream Waders Program (MDNR), use simpler taxonomic procedures. The IWL uses
field identification of a small number of organisms of a limited number of kinds, such as mayflies,
stoneflies, caddisflies, beetles, and mollusks, and notes only their presence or absence. The MDNR Stream
Waders program, including its metrics and index, is based on family-level data.
4.4.2.2 Fish (field taxonomic identification)
Identification and processing of fish occur at the completion of each transect (USEPA 2009), where the
data recorded include species names, number of individuals of each, length, and DELT anomalies
(Deformities, Eroded fins, Lesions, and Tumors). Taxonomic identification and processing should only be
completed on specimens >25 mm total length and by qualified staff. Common names of species should
follow those established under the American Fisheries Society's publication, "Common and Scientific
Names of Fishes from the United States, Canada and Mexico" (Nelson et al. 2004). Species not positively
identified in the field should be separately retained for laboratory identification (up to 20 individuals per
species). For programs not using the transect method of sample reach layout, electrofishing will cover all
habitat throughout the reach. Further, fish sample vouchers are developed for a minimum of 10% of the
sites sampled (USEPA 2012).
4.4.2.3 Periphyton
Two activities making up sample processing for periphyton are further segregated into those for a) soft-
bodied algal forms and b) diatoms. Although methods for both are presented by USEPA (2012), diatom
procedures are based principally on those of the US Geological Survey National Water Quality
Assessment Program (USGS/NAWQA) (Charles et al. 2002). Microscopic diatoms encountered are
identified (to lowest possible taxon level), enumerated and recorded. Estimates of the biovolume of
dominant species are made using existing parameters, or those found in the literature, and used to
determine the biovolume of the sample. Detailed information on the different procedures, especially on
the analytical approaches for soft algae using the Sedgewick-Rafter and extended Palmer-Maloney count
techniques, can be found in USEPA (2012) and Charles et al. (2002).
4.4.2.4 Quality control measures/data quality documentation
Quality control (QC) for sample processing for these three taxonomic groups is similar in some respects
and different in others. One similarity is that several aspects of quality evaluation are based on
repeating processes; specifically, duplicating field samples, or repeating sample processing activities
(sorting, identification, and counting). Differences arise because there are not always analogous
methods for dealing with the different organism types. Specifically, fish are identified in the field,
whereas benthic invertebrates and algae/diatoms are identified in the laboratory. Logistical constraints prevent
whole-sample re-identification of fish, whereas it is easily done for the other groups. In addition, subsampling is
not done with fish samples, whereas it is explicitly done for benthic invertebrates and functionally done for
algal/diatom samples.
Sorting QC (benthic macroinvertebrates [only]). - Sorting QC is accomplished through rechecking the
sample sort residue from 10% of the samples, randomly selected, and calculating the term 'percent sorting
efficiency' (PSE) (Stribling 2011, Flotemersch et al. 2006b). This value reports the number of specimens
missed during primary sorting as a proportion of the original number of specimens found. A typical MQO
for this is PSE>90%, with the goal of minimizing the number of samples that fail. Individual programs
must specify what is acceptable, but generally, the goal should be to have <10% of the samples fail. It is a
measure of bias associated with sample sorting.
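The sorting check can be illustrated with a short calculation. The sketch below assumes the commonly used form PSE = 100 × A / (A + B), where A is the number of specimens found by the primary sorter and B is the number recovered from the sort residue during the recheck; the function names and the example counts are hypothetical and are not taken from any program's data.

```python
def percent_sorting_efficiency(found_by_sorter, recovered_in_recheck):
    """PSE = 100 * A / (A + B); A = primary sort count, B = recheck recoveries."""
    total = found_by_sorter + recovered_in_recheck
    if total == 0:
        raise ValueError("sample contains no specimens")
    return 100.0 * found_by_sorter / total


def sorting_qc_summary(rechecks, mqo=90.0):
    """Summarize PSE for the rechecked samples against an MQO of PSE > mqo."""
    pse_values = [percent_sorting_efficiency(a, b) for a, b in rechecks]
    failures = sum(1 for pse in pse_values if not pse > mqo)
    return {
        "pse": pse_values,
        "percent_failing": 100.0 * failures / len(pse_values),
    }


# Hypothetical recheck results: (specimens found by sorter, specimens recovered in recheck)
rechecks = [(196, 4), (180, 30), (285, 15), (95, 5)]
summary = sorting_qc_summary(rechecks)
print(summary)
```

In this hypothetical lot one of four rechecked samples falls below the MQO, so 25 percent of the rechecked samples fail, which would exceed a program goal of fewer than 10 percent failures.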
Taxonomic QC. - As a measure of precision in taxonomic identifications, consistency is quantified by
independent re-identification of whole (sub)samples, where those samples are randomly selected (10%, as
a rule-of-thumb) from the full sample lot. Sample results from the QC taxonomist are directly compared
to those of the primary taxonomist, and differences quantified as 'percent taxonomic disagreement' (PTD)
for identifications, as 'percent difference in enumeration' (PDE) for counts, and as 'absolute difference in
percent taxonomic completeness' (abs[PTC]). Typical MQOs for these are PTD<15%, PDE<5%, and
abs[PTC]<5% (Stribling 2011).
If MQO thresholds are routinely or broadly exceeded, samples failing should be examined in more detail
to determine causes of the problem, and what corrective actions may be necessary.
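A minimal sketch of the taxonomic comparison follows. It assumes the usual definitions: PDE = 100 × |n1 - n2| / (n1 + n2), and PTD = 100 × (1 - a/N), where a is the number of matching identifications and N is the larger of the two total counts. Counting agreements as the per-taxon minimum of the two taxonomists' counts is a simplifying assumption, and the taxa lists are hypothetical.

```python
def percent_difference_in_enumeration(n1, n2):
    """PDE: difference between two total counts relative to their sum."""
    return 100.0 * abs(n1 - n2) / (n1 + n2)


def percent_taxonomic_disagreement(primary, qc):
    """PTD from two {taxon: count} results for the same (sub)sample."""
    taxa = set(primary) | set(qc)
    # Assumed agreement rule: individuals match up to the smaller count per taxon.
    agreements = sum(min(primary.get(t, 0), qc.get(t, 0)) for t in taxa)
    larger_count = max(sum(primary.values()), sum(qc.values()))
    return 100.0 * (1.0 - agreements / larger_count)


def abs_ptc_difference(ptc_primary, ptc_qc):
    """Absolute difference in percent taxonomic completeness."""
    return abs(ptc_primary - ptc_qc)


# Hypothetical results from the primary and QC taxonomists
primary = {"Baetis": 50, "Hydropsyche": 30, "Chironomus": 20}
qc = {"Baetis": 48, "Hydropsyche": 30, "Cricotopus": 20}

ptd = percent_taxonomic_disagreement(primary, qc)
pde = percent_difference_in_enumeration(sum(primary.values()), sum(qc.values()))
print(f"PTD = {ptd:.1f}%, PDE = {pde:.1f}%")
```

Here the disagreement over the midge genus pushes PTD above a 15 percent MQO even though the counts nearly agree, which is the kind of result that would prompt a closer look at the identifications.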
4.4.3 Data reduction/indicator calculation
Once necessary corrective actions for sample processing and taxonomic identifications have been
implemented and their effectiveness confirmed, and data quality is known and acceptable, sample data are
converted into the primary terms to be used for analysis. As stated above, monitoring practitioners usually
have access to published MMI for application to sample data, as well as sometimes predictive models and
established decision analysis systems. Indicators most often take the form of a multimetric Index of
Biological Integrity (IBI; Karr et al. 1986, Hughes et al. 1998, Barbour et al. 1999, Hill et al. 2000, 2003)
or a predictive observed/expected (O/E) model based on the River Invertebrate Prediction and
Classification System (RIVPACS; Clarke et al. 1996, 2003, Hawkins et al. 2000b, Hawkins 2006). The
Illinois Department of Natural Resources used the Macroinvertebrate Biotic Index (MBI) in their analysis
of restoration effectiveness in the Waukegan River (see Case Study 4).
4.4.3.1 Multimetric indexes
The purpose of any MMI is to summarize complex biological and environmental information into a form
and format that can be used for management decision-making (Karr and Dudley 1981, Karr 1991,
Angermeier and Karr 1994), and doing so in a manner that allows uncertainty associated with those
decisions to be known and communicated. Index calibration is the empirical process of determining
which measures are best suited for that purpose, specifically in terms of their capacity for detecting
biological changes in response to environmental variables of concern (pollutants, or stressors). For the
purpose of this guidance it is assumed that the calibration procedure (Hughes et al. 1998, Barbour et al.
1999, McCormick et al. 2001) has been completed, and that an MMI is available to monitoring
practitioners for application. The reader should be aware that no attempt is made to be comprehensive in
discussing either metric diversity or index formulation, or of reviewing supporting technical literature. As
such, only selected examples are used below to illustrate different aspects of MMI application.
CASE STUDY 4: BIOLOGICAL AND PHYSICAL MONITORING OF WAUKEGAN
RIVER RESTORATION EFFORTS, LAKE COUNTY, ILLINOIS
The Waukegan River watershed is located on the
western shore of Lake Michigan, about 56 km
(35 mi) north of Chicago in Lake County (Figure CS4-
1). It is approximately 20 km (12.4 mi) long and has
a drainage area of 2,994 ha (7,397 ac). The river
channel drops from an approximate 222 m (730 ft)
headwaters elevation to around 177 m (580 ft)
above sea level, before discharging directly into
Lake Michigan through Waukegan Harbor on its
western shore. The Waukegan River watershed
receives a mean annual precipitation of 834
millimeters (mm) (32.8 in) and has a mean annual
temperature of 8.8°C (47.8°F). Historical records
(circa 1840) indicate substantial marshes in the
area, and recent soils studies indicate that wetlands
covered approximately 15 percent of the
watershed.
Urbanized watershed
Severely degraded stream habitat;
channel instability/bank erosion;
high velocity runoff
Bank stabilization using LUNKERS
and riparian re-vegetation
Grade control using rock weirs and
artificial riffles
Benthic macroinvertebrate
assemblage monitoring
Effectiveness evaluation
The north and south branches
of the basin, including the
mainstem to Waukegan
Harbor, comprise
approximately 20 channel km
(12.5 mi), excluding Yeoman
Creek from the north. The
mean channel width ranges
from 4.4 to 6.7 m (14.6-21.9 ft),
and mean depth ranges from
0.07 to 0.28 m (0.23-0.92 ft)
(White et al. 2003).
There is more substantial
shading from riparian
vegetation in the North Branch
subwatershed than in the
south. The South Branch has a
greater discharge than the
North Branch, approximately
0.1 cubic meters per second
(cms) (3.4 cfs) versus 0.01 cms
(0.4 cfs). Dominant substrate
types range from sand to large
cobble and boulder, with some
bedrock. Within the project area, there was one control monitoring site (S2) and three sites
where stream restoration was carried out (S1, N1, N2) (Figure CS4-2).
Figure CS4-1. The Waukegan River watershed in northeastern Illinois
(White et al. 2010)
Figure CS4-2. Aerial map taken in 1998 of a portion of the Waukegan River watershed, showing
sampling locations and restoration project areas (White et al. 2010)
As of 2005, major land uses in the Waukegan River watershed included approximately
36.7 percent residential; 21.5 percent transportation; 12 percent commercial, retail, government
and institutional; and about 20 percent open space, forest, grasslands, and beaches. The
remaining land uses are associated with small amounts of disturbed lands (3.6 percent), industrial
(2.8 percent), wetlands/water (2.0 percent), and communication/utilities (1.7 percent) (Lake
County SWMC 2008). Approximately 80 percent of the densely urbanized City of Waukegan (2014
population ~89,000) lies within the watershed.
The pace of development and sprawl of Waukegan was substantial throughout the latter part of
the 19th century, and until the 1970s and 1980s, when current stormwater regulations began to
take effect. Not surprisingly, streams were heavily impacted by urbanization. Minimal
management of the stormwater quantity and quality led to flashy stormwater runoff conditions
and elevated pollutant loads. Additional water resource issues associated with urban
development included combined sewer overflows (CSO), stream channel instability (accelerated
vertical and lateral erosion processes), nutrient enrichment, and contamination by metals,
pesticides/herbicides, pharmaceuticals and personal care products (PPCPs), and endocrine
disrupters (White et al. 2003, 2010). Channel erosion processes were accelerated by flashy
stormflows, contributing to degraded physical habitat and decreased capacity of the stream to
support the survival and reproduction of stream biota.
Monitoring and Sampling Design
The overall restoration goal in the North Branch and South Branch was to rehabilitate physical
habitat and hydrologic conditions to support recovery of benthic macroinvertebrate and fish
assemblages (White et al. 2010). Project leaders chose to design and install biotechnical
stabilization, a combination of stream bank physical stabilization and riparian re-vegetation, to
address the severe channel instability and erosion problems in Washington Park and Powell Park
(Figure CS4-2). The objectives of these techniques are complementary. A stream channel
experiencing severe lateral and vertical erosion (mass-wasting and down-cutting, respectively), by
definition, is losing habitat suitable for stream biota. Damaged or missing riparian vegetation
results in diminished root mass to hold soil together, lowered inputs of leaf litter and woody
materials (food source and habitat structure), and less shading, which can lead to warmer water
and increased photosynthetic activity and algal growth. Combinations of LUNKERS1, a-jacks,
stone, coconut fiber rolls, dogwoods, willows, and grasses were installed at selected locations on
the North Branch/Powell Park (1992-93) and on the South Branch/Washington Park (1995).
There were four sample locations, two each on the South Branch and the North Branch (Figure
CS4-2). For the South Branch, station S2 was the upstream control reach, and S1 was the
downstream treatment reach. On the North Branch, the two sample locations (N1 and N2) were
located to coincide with treatment reaches. Wooden LUNKERS were used as the principal
rehabilitation feature at N1, while recycled plastic lumber and concrete a-jacks were used for
LUNKERS construction at N2. All four locations were sampled annually during spring, summer, and
fall over a 13-year period (1994-2006).
This case study focuses solely on responses of benthic macroinvertebrates, although it should be
recognized that fish were also evaluated for both branches. Benthic macroinvertebrates, physical
habitat, and chemical water quality were sampled, measured, and characterized at each stream
location. Three macroinvertebrate samples were taken at each location using a Hess sampler with
a 500 micron mesh net. Sample material was preserved in 95 percent ethanol; organisms were
sorted to segregate individuals from non-target material, and subsequently identified to genus
level.
1 'Little Underwater Neighborhood Keepers Encompassing Rheotaxic Salmonids' (Vetrano 1988)
Sampling data were used to calculate the Macroinvertebrate Biotic Index (MBI). The MBI is based
on the Hilsenhoff Biotic Index (HBI; Hilsenhoff 1977, 1982) but uses an 11-point scale, rather than
the HBI 5-point scale. Lower MBI scores indicate better or less-degraded water quality. Physical
habitat was characterized using the Potential Index of Biological Integrity (PIBI) which
incorporates percent substrate particle sizes, magnitude of sediment deposition, pool substrate
quality, and substrate stability, a series of hydraulic and morphometric measures, riparian
features, and various aspects of instream cover (White et al. 2003). The PIBI was calculated from
measurements and field observations made on each of 10 equal-length segments established by
the 11-transect method. The purpose of the PIBI is to help illuminate which habitat features, if
any, might be limiting survival, growth, and reproduction of stream biota.
Results
Overall, among all four locations, MBI scores ranged from around 5 to just below 10 (good to very
poor), with an average around 7.2, or "fair" (see Figure CS4-3). On the South Branch, station S2
exhibited the highest mean score for a single year, 7.5, indicating "fair" stream condition, slightly
better than "poor." MBI scores indicate worsening conditions over time at stations S1 and N1.
There was virtually no change over the 13 years for stations S2 and N2, with average MBI scores
in the "fair" and "poor" ranges.
All sites were dominated by stressor tolerant taxa, with sample data comprised of 82-89 percent
non-biting midges (Insecta: Diptera: Chironomidae), segmented worms (Annelida: Oligochaeta),
and aquatic sowbugs (Crustacea: Isopoda: Asellidae). The dominance of these animals in the
North and South branches clearly shows stressed or degraded conditions before, during, and after
any kind of habitat restoration or other remedial activities. Mean taxa richness (number of
distinct taxa) over the sampling period was 10 for site S1, and 8 for the other three locations.
Ninety-two percent of the samples fell in the "poor" or "very poor" categories. There were very
low numbers of stressor-sensitive EPT taxa (mayflies [Ephemeroptera], stoneflies [Plecoptera],
and caddisflies [Trichoptera]) throughout the monitoring period. This is generally indicative of
elevated pollutant levels and greater degradation. Some improvement in physical habitat quality
was observed at treatment stations S1 and N1, likely due to improvements in bank stability and
decreases in overall proportions of percent fines, silt, and mud. The other treatment station, N2,
which was bank-armored, remained relatively consistent over the full period of record, as did the
non-treatment control, S2.
Improvement in physical habitat quality and overall biological diversity was achieved as a result of
these restoration activities, but improvement in biodiversity, primarily relative to the fish
assemblage (not discussed in this case study) was only temporary (White et al. 2010). The authors
acknowledged that sustainable biological diversity in a damaged watershed will require more
complete understanding of landscape and watershed processes, their degree of degradation, and a
comprehensive approach to conservation that addresses the system in its entirety. In the case of
the Waukegan River watershed, this calls for a systematic approach to correcting other sources of
hydrologic and chemical water quality stressors associated with water and sewer management
operations, channel and flow alterations, and extensive aquifer drawdown. One result of this
project was initiation of a comprehensive watershed plan, selection of a coordinator, development
of stakeholder and technical planning committees, and creation of a long-term action plan.
[Figure CS4-3 panels: MBI scores by year (1994-2006) for South Branch Station S1, North Branch Station N1, South Branch Station S2, and North Branch Station N2.]
Figure CS4-3. MBI scores from monitoring stations in Waukegan River (White et al. 2010). Assessment classes
(narrative ratings) for stream condition based on MBI scores are: very poor, 9.0-11.0; poor, 7.6-8.9; fair, 6.0-7.5;
good, 5.0-5.9; very good, <5.0.
Literature
Hilsenhoff, W.L. 1977. Use of Arthropods to Evaluate Water Quality of Streams. Technical Bulletin
100. Wisconsin Department of Natural Resources. Madison, WI.
Hilsenhoff, W.L. 1982. Using a Biotic Index to Evaluate Water Quality in Streams. Technical
Bulletin 132. Wisconsin Department of Natural Resources. Madison, WI.
Lake County SWMC. 2008. Waukegan River Watershed, Lake County, Illinois. Lake County
Stormwater Management Commission, Lake County Department of Information
and Technology, GIS & Mapping Division. Map. Accessed February 9, 2016.
Vetrano, D.H. 1988. Unit Construction for Trout Habitat Improvement Structures for Wisconsin
Coulee Streams. Administrative Report No. 27. Wisconsin Department of Natural Resources,
Madison, WI.
White, W.P., J.D. Beardsley, J.A. Rodsater, and L.T. Duong. 2003. Biological and Physical
Monitoring of Waukegan River Restoration Efforts in Biotechnical Bank Protection and
Pool/Riffle Creation. National Watershed Monitoring Project. Annual Report. Prepared by:
Watershed Science Section, Illinois State Water Survey, Illinois Department of Natural
Resources. Prepared for: Illinois Environmental Protection Agency and U.S. Environmental
Protection Agency (Region 5).
White, W.P., J. Beardsley, and S. Tomkins. 2010. Waukegan River, Illinois national nonpoint source
monitoring program project. NWQEP NOTES. The NCSU Water Quality Group Newsletter.
No. 133. April 2010. ISSN 1062-9149. North Carolina State University Water Quality Group,
Campus Box 7637, North Carolina State University, Raleigh, NC 27695-7637. Accessed
February 9, 2016. http://www.ncsu.edu/waterquality/.
4.4.3.1.1 Metric and index calculations
Metrics are mathematical terms calculated directly from sample data, with resulting values scored relative
to quantitative criteria. Table 4-4 below presents an example of four different metric sets representing
site classes (or bioregions) within a particular U.S. state. Each set of either 5 or 6 metrics forms the basis
of an MMI previously calibrated to wadeable streams of the class. Metric values result from direct
calculations on raw sample data, taxonomic identifications and counts (list of taxa and number of
individuals of each, by sample).
Table 4-4. Metrics and associated scoring formulas for four site classes from an example
monitoring and assessment program

Metrics                                             Scoring formulas

Site class A
1. Total taxa                                       100*(metric value)/51.5
2. Percent EPT individuals, as sensitive            100*(metric value)/39
3. Percent Coleoptera individuals, as sensitive     100*(metric value)/10.5
4. Beck's biotic index                              100*(metric value)/31
5. Percent of taxa, as tolerant                     100*(43-[metric value])/40

Site class B
1. Total number of taxa                             100*(metric value)/51.5
2. Number of EPT taxa                               100*(metric value)/14
3. Percent individuals Cricotopus/Orthocladius + Chironomus, of total Chironomidae  100*(45-[metric value])/45
4. Percent EPT individuals, as sensitive            100*(metric value)/39
5. Number of taxa, as shredders                     100*(metric value)/7
6. Hilsenhoff Biotic Index                          100*(8.5-[metric value])/5

Site class C
1. Total taxa                                       100*(metric value)/51.5
2. Percent of taxa, as non-insects                  100*(46-[metric value])/40
3. Percent individuals Cricotopus/Orthocladius + Chironomus, of total Chironomidae  100*(24-[metric value])/24
4. Percent of individuals, as filterers             100*(metric value)/70
5. Number of taxa, as sprawlers                     100*(metric value)/14
6. Hilsenhoff Biotic Index                          100*(8.5-[metric value])/5

Site class D
1. Number of Oligochaeta taxa                       100*(6-[metric value])/6
2. Percent EPT individuals, as sensitive            100*(metric value)/15
3. Percent individuals, as Crustacea and Mollusca   100*(metric value)/30
4. Percent individuals, as Odonata                  100*(16.5-[metric value])/16.5
5. Number of taxa, as collectors                    100*(20-[metric value])/19.5
6. Percent individuals, as swimmers                 100*(12-[metric value])/12
Many metrics require assigning different characteristics or traits to taxa in the dataset prior to calculation.
These characteristics include functional feeding groups (FFGs), habit, and stressor tolerance values. Many
states use various literature resources to develop traits databases (e.g., Merritt et al. 2008, Barbour et al.
1999, Vieira et al. 2006, Carter and Resh 2013).
The formulas in the table resulted from the calibration process, and serve to convert each calculated metric
value to a normalized, unitless score on a 100-point scale. The multiple metric scores are then combined
for each sample by simple averaging. Additionally, formulas are developed, in part, so that individual
metrics are scored according to their direction of change in the presence of stressors.
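As an illustration of the scoring and averaging steps, the sketch below applies the Site class A formulas from the table to one hypothetical sample. The metric values, the dictionary keys, and the function names are invented for the example, and truncating scores to the 0-100 range is an assumption, since the text does not state how out-of-range scores are handled.

```python
from statistics import mean

# Site class A scoring formulas as listed in the table above.
SITE_CLASS_A = {
    "total_taxa": lambda v: 100 * v / 51.5,
    "pct_ept_individuals": lambda v: 100 * v / 39,
    "pct_coleoptera_individuals": lambda v: 100 * v / 10.5,
    "becks_biotic_index": lambda v: 100 * v / 31,
    # This metric increases with stress, so its formula is reversed.
    "pct_tolerant_taxa": lambda v: 100 * (43 - v) / 40,
}


def score_metrics(metric_values, formulas, truncate=True):
    """Convert raw metric values to unitless scores on a 100-point scale."""
    scores = {}
    for name, formula in formulas.items():
        score = formula(metric_values[name])
        if truncate:  # assumption: scores are limited to the 0-100 range
            score = max(0.0, min(100.0, score))
        scores[name] = score
    return scores


def mmi_score(metric_values, formulas, truncate=True):
    """Combine the metric scores for one sample by simple averaging."""
    return mean(score_metrics(metric_values, formulas, truncate).values())


# Hypothetical metric values for a single sample
sample = {
    "total_taxa": 30,
    "pct_ept_individuals": 19.5,
    "pct_coleoptera_individuals": 5.25,
    "becks_biotic_index": 15.5,
    "pct_tolerant_taxa": 23,
}
print(round(mmi_score(sample, SITE_CLASS_A), 1))
```

Keeping the formulas in a lookup keyed by metric name makes it straightforward to add the sets for the other site classes and to recalculate individual metrics during QC.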
4.4.3.1.2 Quality control measure
Metric calculations are typically performed in spreadsheets or relational databases with embedded
queries. To ensure that resulting calculations are correct and provide the intended metric values, a subset
of them should be recalculated by hand. A reliable approach is to calculate a) one metric across all
samples, followed by b) all metrics for one sample. When recalculated values differ from those values in
the output matrix, reasons for the disagreement are determined and corrections are made. Reports on
performance include the total number of reduced values as a percentage of the total, how many errors
were found in the queries, and the corrective actions specifically documented.
4.4.3.2 Predictive models (observed/expected [O/E])
Predictive models are based on the premise that the taxa occurring in a minimally disturbed system can be
predicted based on multiple measures of the environmental setting and that if the predicted taxa are not
observed in an evaluation site, then disturbance can be suspected. The ratio of the number of observed
taxa to that expected to occur in the absence of human-caused stress is an intuitive and ecologically
meaningful measure of biological integrity. Low observed-to-expected ratios (O/E << 1.0) imply that test
sites are adversely affected by some environmental stressor. The models are commonly called RIVPACS
models (River Invertebrate Prediction And Classification System [Wright 1995]) based on
observed:expected taxa (Clarke et al. 1996, 2003, Hawkins et al. 2000b). Because they are based on taxa
in reference sites, the predictive models are not well suited to assemblages with naturally low diversity (as
in oligotrophic fish communities). The loss of reference taxa is difficult to detect when only few taxa are
expected.
The number of taxa expected at a site is calculated as the sum of individual probabilities of capture for all
taxa found in reference sites in the region of interest. All probabilities greater than a designated threshold
are summed to calculate the expected number of taxa (E), and this number is compared to the reference
taxa observed (O) at a site. Because these models predict the actual taxonomic composition of a site, they
also provide information about the presence or absence of specific taxa. If the sensitivities of taxa to
different stressors are known, this information can lead to derived indices and diagnoses of the stressors
most likely affecting a site. In addition, taxa can be identified as increasers or decreasers with respect to
general environmental stress encountered in the model development data set. A variation of the O/E
models measures the Bray-Curtis compositional dissimilarity between an observed and expected
assemblage directly, which detects stress-induced shifts in taxonomic composition that leave assemblage
richness unchanged (Van Sickle 2008).
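The O/E calculation described above can be sketched in a few lines. The example assumes that site-specific probabilities of capture have already been produced by a calibrated model; the taxa, the probabilities, and the 0.5 threshold are hypothetical choices for illustration only.

```python
def observed_over_expected(capture_probs, observed_taxa, threshold=0.5):
    """Return (O, E, O/E) for one site.

    capture_probs: {taxon: predicted probability of capture at this site}
    observed_taxa: set of taxa actually collected at the site
    threshold: only taxa with probability above this value are counted
    """
    expected = {t: p for t, p in capture_probs.items() if p > threshold}
    e = sum(expected.values())
    if e == 0:
        raise ValueError("no taxa exceed the probability threshold")
    # O counts only the expected (reference) taxa that were observed.
    o = sum(1 for t in expected if t in observed_taxa)
    return o, e, o / e


# Hypothetical model output and field sample
capture_probs = {"Baetis": 0.9, "Hydropsyche": 0.8, "Optioservus": 0.7,
                 "Isoperla": 0.6, "Pteronarcys": 0.3}
observed = {"Baetis", "Optioservus", "Pteronarcys", "Chironomus"}

o, e, ratio = observed_over_expected(capture_probs, observed)
print(o, round(e, 2), round(ratio, 2))
```

Taxa collected at the site but not expected there (Chironomus, and Pteronarcys below the threshold) do not add to O, which is why the ratio reflects loss of reference taxa rather than total richness.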
The steps that go into building a predictive model include 1) classifying reference sites into biologically
similar groups, 2) creating discriminant functions models to estimate group membership of sites from
environmental data, 3) establishing taxon-specific probabilities of capture for individual sites,
4) identifying taxa expected and comparing to those observed, 5) estimating model error, and 6) applying
the model in test sites. These steps can be automated to allow exploration of model performance for
multiple subsets of environmental predictor variables.
4.4.3.3 Quantitative decision analysis systems (biological condition gradient [BCG])
The Biological Condition Gradient (BCG) was described by USEPA (2011) as "a conceptual model that
describes how biological attributes of aquatic ecosystems might change along a gradient of increasing
anthropogenic stress." The model can serve as a template for organizing field data (biological, chemical,
physical, landscape) at an ecoregional, basin, watershed, or stream segment level. The BCG was
developed by EPA and other agencies to support tiered aquatic life uses in state water quality standards
and criteria. It was developed through a series of workshops, and described fully by Davies and Jackson
(2006). The BCG includes a narrative description of ecological condition that can be translated across
regions, assemblages, and assessment programs. The descriptions recognize six levels of quality in
ecological condition ranging from "1" (most desirable) where natural structural, functional, and
taxonomic integrity is preserved to "6" (least desirable) in which there are extreme changes in structure
and ecosystem function and wholesale changes in taxonomic composition.
The quantitative decision analysis systems approach explicitly uses the BCG as a scale for biological
assessment. It differs from the multimetric and predictive model approaches in that it is not dependent on
definition of reference sites (although that can be useful), and development relies on consensus of experts
instead of an individual or a few analysts. It is similar to the multimetric approach in its reliance on
distinct site classes. It is similar to the predictive modeling approach in its examination of individual taxa
(though metrics are also incorporated in the models).
Calibrating a BCG to local conditions begins with the assembly and analysis of biological monitoring
data. Following data assembly, a calibration workshop is held in which experts familiar with local biotic
assemblages of the region review the data and the general descriptions of each of the BCG levels. The
expert panel then uses the data to define the ecological attributes of taxa, and to develop narrative
statements of BCG levels based on sample taxa lists. The expert panel is usually convened multiple times
to refine decisions, to react to interim results, and to assign BCG levels to new sites. The steps typically
taken during a calibration workshop include the following:
1) An overview presentation of the BCG and the process for calibration;
2) A "warm-up" data exercise to further familiarize participants with the process;
3) Assignment of taxa to BCG taxonomic attributes (based on known tolerance and rarity);
4) Description of biota in undisturbed conditions (best professional judgment [BPJ]; regardless of
whether such conditions still exist in observed reference sites);
5) Assignment of sites in the data set to BCG levels; and
6) Elicitation of rules used by participants in assigning sites to levels.
Documentation of expert opinion in assigning attributes to taxa and BCG levels to sites is a critical part of
the process. Facilitators elicit from participants sets of operational rules for assigning levels to sites. As
the panel assigns example sites to BCG levels, the members are polled on the critical information and
criteria they used to make their decisions. These form preliminary, narrative rules that explain how
panel members make decisions. Rule development requires discussion and documentation of BCG level
assignment decisions and the reasoning behind the decisions. During these discussions, records are kept
on each participant's decision ("vote") for the site; the critical or most important information for the
decision; and any confounding or conflicting information and how this is resolved.
A decision model is then developed that encompasses the taxa attributes and quantitatively replicates the
rules used by the expert panel in assigning BCG levels to sites. The decision model is tested with
independent data sets as a validation step. A quantitative biological assessment program can then be
developed using the rule-based model for consistent decision-making in water quality management.
The decision analysis models can be based on mathematical fuzzy-set theory (citation) to replicate the
expert panel decisions. Such models explicitly use linguistic rules or logic statements, e.g., "If taxa
richness is high, then condition is good" for quantitative, computerized decisions. The models can usually
be calibrated to closely match panel decisions in most cases, where "closely matched" means the model
either exactly matched the panel, or selected the panel's minority decision as its level of greatest
membership. The decision analysis models can also be cross calibrated to other assessment tools, such as
the MMI. Models can be developed as spreadsheet tools to facilitate programmatic application.
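To make the idea of a linguistic rule concrete, the sketch below evaluates a single fuzzy rule of the form "if taxa richness is high and the percentage of sensitive individuals is high, then condition is good." It is not a calibrated BCG model: the two-class outcome, the metric choices, and every breakpoint are hypothetical, and the minimum operator is used as one common way to combine conditions joined by "and."

```python
def ramp_up(x, low, high):
    """Fuzzy membership in 'high': 0 at or below low, 1 at or above high."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def condition_memberships(total_taxa, pct_sensitive):
    """Apply one illustrative rule and return membership in each class."""
    richness_is_high = ramp_up(total_taxa, 15, 30)       # hypothetical breakpoints
    sensitive_is_high = ramp_up(pct_sensitive, 10, 40)   # hypothetical breakpoints
    good = min(richness_is_high, sensitive_is_high)      # fuzzy AND
    return {"good": good, "not good": 1.0 - good}


def assigned_class(memberships):
    """Pick the class with the greatest membership."""
    return max(memberships, key=memberships.get)


memberships = condition_memberships(total_taxa=27, pct_sensitive=31)
print(memberships, assigned_class(memberships))
```

A full decision model would hold one such rule set per BCG level and report the level of greatest membership, along with the runner-up, for comparison with the panel's decisions.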
4.4.4 Index scoring and site assessment
The site-specific MMI score, as calculated above in section 4.4.3, is compared to degradation thresholds
(Table 4-5) to determine whether biological degradation exists relative to minimally degraded reference
conditions (Barbour et al. 1999, Stoddard et al. 2006). The range of potential scores in Table 4-5 is
0 (most degraded) to 100 (least degraded). The 90 percent confidence intervals (CI90) are calculated
using sample repeats (see section 4.4.2.4). Defining the numeric values of degradation thresholds is an
integral phase of index calibration, and is affected by regional and climatic conditions, along with the
overall level and consistency of landscape alteration and available data to characterize the broad range of
degradation.
Table 4-5. Degradation thresholds to which MMI scores
are compared for determination of status

Site class    Degradation threshold
A             52.3
B             65.7
C             66.0
D             55.9
The confidence interval (CI), also known as detectable difference (DD) (Stark 1993), is associated with
individual MMI scores and represents the magnitude of separation between two values before they can be
considered truly different (Stribling et al. 2008a). Reported values falling below the threshold are
considered degraded, those above are non-degraded, while site index values falling near a threshold may
require additional samples to determine final rating category (Stribling et al. 2008a, Zuellig et al. 2012).
Some programs, if not most, also subdivide value ranges above and below the degradation threshold to
allow communication of multiple levels of non-degradation and degradation, e.g., very good, good, fair,
poor, or very poor.
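The comparison against thresholds can be sketched as follows, using the threshold values from the table above. Treating a score as "near" the threshold when it lies within the CI90 of the threshold is an assumption made for this example, as are the CI90 value and the rating labels.

```python
# Degradation thresholds by site class, from the table above.
THRESHOLDS = {"A": 52.3, "B": 65.7, "C": 66.0, "D": 55.9}


def rate_site(mmi, site_class, ci90):
    """Rate one site; scores within ci90 of the threshold are flagged."""
    threshold = THRESHOLDS[site_class]
    if abs(mmi - threshold) < ci90:
        # Assumed interpretation of "near a threshold": not distinguishable from it.
        return "indeterminate; additional samples may be needed"
    return "degraded" if mmi < threshold else "non-degraded"


# Hypothetical scores with a hypothetical CI90 of 5.0 index points
print(rate_site(41.0, "A", ci90=5.0))
print(rate_site(54.0, "A", ci90=5.0))
print(rate_site(78.2, "B", ci90=5.0))
```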
4.4.5 Reporting assessment results at multiple spatial scales
Depending on the monitoring design, it is possible to use assessment results for several purposes, some of
which may have been previously unanticipated. For example, a probability-based design provides
assessments that can be aggregated for assessments broader than the individual location from which a
sample was taken (Larsen 1997, Urquhart et al. 1998). Simultaneously, each sample from that kind of
design provides information useful for interpreting conditions at individual sample locations.
Assessments from a targeted design provide information about the sites sampled, and although they
cannot be used for broader scale assessments, can assist with confirming the effects of known stressors
and stressor sources.
4.4.5.1 Watershed or area-wide
For programs using a stratified random monitoring design, a simple inference model similar to that
described by Olsen and Peck (2008) and Van Sickle and Paulsen (2008), can be used to estimate the
number of degraded stream miles (D) for a watershed or area-wide region with the formula:
D = (N/T) × L
where:
N is the number of sites rated by the MMI as degraded,
T is the total number of sites assessed for the sampling unit (subwatershed or watershed group),
and
L is the total number of stream miles in the sampling unit.
Total stream channel miles (L) should be estimated with GIS using the National Hydrography Dataset
(NHD), or other stream data layer appropriate to the watershed or region of interest. Note that replicate
samples taken for QC purposes are not included in these calculations. Results can also be presented as
percent degradation (%D) by using the calculation:
%D = (N/T) × 100
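The two calculations can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the example subwatershed (3 degraded sites out of 12 assessed, 48 stream miles) are hypothetical and are not drawn from the Lake Allatoona/Upper Etowah data.

```python
def degraded_stream_miles(n_degraded: int, n_assessed: int, total_miles: float) -> float:
    """Estimate degraded stream miles, D = (N / T) x L, for one sampling unit."""
    if n_assessed <= 0:
        raise ValueError("At least one assessed site is required")
    if not 0 <= n_degraded <= n_assessed:
        raise ValueError("Degraded sites must be between 0 and the number assessed")
    return n_degraded / n_assessed * total_miles


def percent_degraded(n_degraded: int, n_assessed: int) -> float:
    """Percent degradation, %D = (N / T) x 100, for one sampling unit."""
    if n_assessed <= 0:
        raise ValueError("At least one assessed site is required")
    return n_degraded / n_assessed * 100.0


# Hypothetical subwatershed; QC replicate samples are assumed to be excluded already.
d = degraded_stream_miles(n_degraded=3, n_assessed=12, total_miles=48.0)
pct = percent_degraded(n_degraded=3, n_assessed=12)
print(f"Degraded stream miles: {d:.1f}; percent degraded: {pct:.1f}")
```

Applying the same functions to each stratum (for example, each HUC subwatershed) would give the per-unit values that a map such as Figure 4-6 summarizes.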
For the Lake Allatoona/Upper Etowah River watershed, site selection and monitoring was stratified by
the 53 HUC subwatersheds, and cumulative assessments showed distinctive patterns of degradation
(Figure 4-6). More intensive development and imperviousness are closer to transportation corridors.
Trends in %D over time can be evaluated using a test such as the Kendall tau test (Helsel and Hirsch 2002).
It should be noted that for very small sample sizes (i.e., 3 or 4), all values would need to be consecutively
decreasing to reject a one-sided null hypothesis with a p-value equal to 0.167 and 0.042, respectively.
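The small-sample p-values quoted above can be checked by brute force. The sketch below is an illustration rather than the procedure given by Helsel and Hirsch (2002): it computes the Kendall S statistic and an exact one-sided p-value for a downward trend by enumerating every possible ordering, and it assumes there are no tied values. The function names and the %D series are hypothetical.

```python
from itertools import permutations
from math import factorial


def kendall_s(values):
    """Kendall S: number of increasing pairs minus decreasing pairs, in time order."""
    s = 0
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    return s


def exact_p_decreasing(values):
    """Exact one-sided p-value for a downward trend (assumes no tied values)."""
    n = len(values)
    observed = kendall_s(values)
    # Under the null hypothesis every ordering of the n values is equally likely.
    as_extreme = sum(1 for perm in permutations(range(n)) if kendall_s(perm) <= observed)
    return as_extreme / factorial(n)


# Hypothetical %D series that decrease at every step.
print(round(exact_p_decreasing([30.0, 22.0, 15.0]), 3))        # three annual values
print(round(exact_p_decreasing([30.0, 22.0, 15.0, 9.0]), 3))   # four annual values
```

With only one ordering out of n! being strictly decreasing, the smallest attainable p-value is 1/n!, which is why any reversal in a three- or four-value series rules out a significant result.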
[Map legend: Degraded Watersheds; % Degraded Stream Miles by HUC-12; Major Streams; HUC-10; HUC-12, not assessed; scale bar in miles]
Figure 4-6. Percent degradation of subwatersheds as measured by biological monitoring and
assessment, Lake Allatoona/Upper Etowah River watershed (Millard et al. 2011)
4.4.5.2 Stream- or site-specific
The MMI scores and status ratings are useful for summarizing and communicating site-specific
conditions; this is the application for which they are designed and for which they are ideally
suited. However, a well-organized and functional database allows the index to be disaggregated so that
individual metrics and even taxa can be evaluated by biologists to help determine which are most
influencing an assessment. Presenting site-specific point assessments from the previous example (Figure
4-7) shows the specific distribution of the most- and least-degraded streams, and more detailed
examination can begin to reveal the proximity of potential stressor sources (Figure 4-8). At this stage of
evaluating watershed-based stream assessments, if necessary, the assessor can turn to the USEPA stressor
identification process, also known as "The Causal Analysis/Diagnosis Decision Information System", or
CADDIS (http://www.epa.gov/caddis/), to assist in determining the most probable causes of biological
degradation. It is this process, including evaluation of the relative dominance of the various taxa and
their taxon-specific environmental requirements, stressor tolerances, feeding types, and habits, that can
lead to more defensible decisions on stressor control actions, such as BMPs or stream/watershed restoration
activities. MMI confidence intervals can be computed and used for point comparisons in the same manner
as other water quality variables (see section 7.3).
4.4.5.3 Relative to specific sources
Monitoring objectives requiring documentation of instream biological condition relative to a specific and
known source of stressors require that sample data be drawn from one or more locations exposed to those
stressors. In particular, if the source is an area of specific land use, a type of BMP, or a point source,
confidence in the result will likely be enhanced by a thorough and quantitative description of the source.
However, it should be recognized that lack of a clear site-specific response to either measured or assumed
exposure to stressors does not mean that the biota are non-responsive. Neither does it mean that the BMP
is ineffective. The BMP can likely be proven effective at reducing the single or multiple stressors for
which it is intended and designed.
[Map legend: Narrative Ratings (Very Good, Good, Fair, Poor, Very Poor); Major Streams; County Lines; HUC-10; HUC-12; scale bar in miles]
Figure 4-7. Distribution of stream biological assessments in the Lake Allatoona/Upper Etowah
River watershed, using a benthic MMI developed by the Georgia Environmental Protection
Division (Millard et al. 2011)
Figure 4-8. More detailed examination of the Yellow Creek subwatershed, Lake Allatoona/Upper
Etowah River watershed, Georgia, reveals a sample location, rated biologically as "poor," is on a
stream flowing through a poultry production operation (Millard et al. 2011)
4.5 References
Allen, D.M., S.K. Service, and M.V. Ogburn-Matthews. 1992. Factors influencing the collection
efficiency of estuarine fishes. Transactions of the American Fisheries Society 121(2): 234-244.
Angermeier, P., and J. Karr. 1994. Biological integrity versus biological diversity as policy directives:
protecting biotic resources. BioScience 44(10): 690-697.
Bahls, L. 1993. Periphyton Bioassessment Methods for Montana Streams. Department of Health and
Environmental Science, Water Quality Bureau, Helena, MT.
Ball, J. 1982. Stream Classification Guidelines for Wisconsin. Technical Bulletin. Wisconsin Department
of Natural Resources, Madison, WI.
Barbour, M.T., and J.B. Stribling. 1991. Use of Habitat Assessment in Evaluating the Biological Integrity
of Stream Communities. In Biological Criteria: Research and Regulation Proceedings of a
Symposium. EPA-440/5-91-005. U.S. Environmental Protection Agency, Office of Water,
Washington, DC.
Barbour, M.T., J. Gerritsen, B.D. Snyder, and J.B. Stribling. 1999. Rapid Bioassessment Protocols for
Streams and Wadeable Rivers: Periphyton, Benthic Macroinvertebrates and Fish. 2nd ed.
EPA/841-B-99-002. U.S. Environmental Protection Agency, Office of Water, Washington, DC.
Accessed February 9, 2016. http://water.epa.gov/scitech/monitoring/rsl/bioassessment/index.cfm.
Barbour, M.T., J.B. Stribling, and P.F.M. Verdonschot. 2006. The multihabitat approach of USEPA's
rapid bioassessment protocols: benthic macroinvertebrates. Limnetica 25(3-4): 229-240.
Bortolus, A. 2008. Error cascades in the biological sciences: the unwanted consequences of using bad
taxonomy in ecology. Ambio 37(2): 114-118.
Cao, Y., and C.P. Hawkins. 2005. Simulating biological impairment to evaluate the accuracy of
ecological indicators. Journal of Applied Ecology 42: 954-965.
Carter, J.L., and V.H. Resh. 2013. Analytical Approaches used in Stream Benthic Macroinvertebrate
Biomonitoring Programs of State Agencies in the United States. Open File Report 2013-1129.
U.S. Geological Survey, Reston, VA. Accessed April 22, 2016.
http://pubs.usgs.gov/of/2013/1129/pdf/ofr20131129.pdf.
Caton, L. R. 1991. Improved subsampling methods for the EPA rapid bioassessment protocols. Bulletin of
the North American Benthological Society 8: 317-319.
Charles, D.F., C. Knowles, and R.S. Davis, ed. 2002. Protocols for the Analysis of Algal Samples
Collected as Part of the U.S. Geological Survey National Water-Quality Assessment Program.
Report No. 02-06. Prepared for U.S. Geological Survey, by The Academy of Natural Sciences,
Patrick Center for Environmental Research-Phycology Section, Philadelphia, PA.
Clarke, R.T., M.T. Furse, J.F. Wright, and D. Moss. 1996. Derivation of a biological quality index for
river sites: comparison of the observed with the expected fauna. Journal of Applied Statistics 23:
311-332.
Clarke, R.T., J.F. Wright, and M.T. Furse. 2003. RIVPACS models for predicting the expected
macroinvertebrate fauna and assessing the ecological quality of rivers. Ecological Modeling 160:
219-233.
Conquest, L.L., S.C. Ralph, and R.J. Naiman. 1994. Implementation of Large-Scale Stream Monitoring
Efforts: Sampling Design and Data Analysis Issues. In Biological Monitoring of Aquatic Systems,
ed. S.L. Loeb and A. Spacie. Lewis Publishers, Boca Raton, FL.
Cowardin, L.M., V. Carter, F.C. Golet, and E.T. LaRoe. 1979. Classification of Wetlands and Deepwater
Habitats of the United States. FWS/OBS-79/31. U.S. Department of the Interior, U.S. Fish and
Wildlife Service, Biological Services Program, Washington, DC.
Cowie, G.M., J.L. Cooley, and A. Dutt. 1991. Use of Modified Benthic Bioassessment Protocols for
Evaluation of Water Quality Trends in Georgia. Technical Completion Report USDI/GS Project
14-08-0001-G1556 (03); ERC-06-91; USGS/G-1556-03. University of Georgia, Institute of
Community and Area Development, Athens, GA, and Georgia Institute of Technology,
Environmental Resources Center, Atlanta, GA.
Cummins, K.W. 1994. Bioassessment and Analysis of Functional Organization of Running Water
Ecosystems. In Biological Monitoring of Aquatic Ecosystems, ed. S.L. Loeb and A. Spacie. Lewis
Publishers, Ann Arbor, MI.
Cupp, C.E. 1989. Stream Corridor Classification for Forested Lands of Washington. Washington Forest
Protection Association, Olympia, WA.
Dauble, D.D., and R.H. Gray. 1980. Comparison of a small seine and a backpack electroshocker to
evaluate near shore fish populations in rivers. Progressive Fish-Culturist 42: 93-95.
Davies, S.P., and S.K. Jackson. 2006. The biological condition gradient: A descriptive model for
interpreting change in aquatic ecosystems. Ecological Applications 16(4): 1251-1266.
Day, J.W., Jr., C.A.S. Hall, W.M. Kemp, and A. Yannez-Arancibia. 1989. Estuarine Ecology. John Wiley
and Sons, Inc., New York, NY.
Dewey, M.R., L.E. Holland-Bartel, and S.T. Zigler. 1989. Comparison of fish catches with buoyant pop
nets and seines in vegetated and nonvegetated habitats. North American Journal of Fisheries
Management 9: 249-253.
Flotemersch, J.E., K.A. Blocksom, J.J. Hutchens, and B.C. Autrey. 2006a. Development of a standardized
large river bioassessment protocol (LR-BP) for macroinvertebrate assemblages. River Research
and Applications 22: 775-790.
Flotemersch, J.E., J.B. Stribling, and M.J. Paul. 2006b. Concepts and Approaches for the Bioassessment
of Non-Wadeable Streams and Rivers. EPA/600/R-06/127. U.S. Environmental Protection
Agency, Office of Research and Development, National Exposure Research Laboratory,
Cincinnati, OH. Accessed February 10, 2016.
Flotemersch, J.E., J.B. Stribling, R.M. Hughes, L. Reynolds, M.J. Paul, and C. Wolter. 2011. Reach
length for biological assessment of boatable rivers. River Research and Applications 27(4): 520-535.
Fore, L., and C.O. Yoder. 2003. The Design of a Biological Community Trend Monitoring Program for
Michigan Wadeable Streams (DRAFT). MI/DEQ/WD-03/086. Prepared for Michigan
Department of Environmental Quality, by Statistical Design and Midwest Biodiversity Institute.
Frissell, C.A., W.J. Liss, C.E. Warren, and M.D. Hurley. 1986. A hierarchical framework for stream
habitat classification: viewing streams in a watershed context. Environmental Management 10(2):
199-214.
Gerritsen, J., J.S. White, E.W. Leppo, J.B. Stribling, and M.T. Barbour. 1996. Examination of the Effects
of Habitat Quality, Land Use, and Acidification on the Macroinvertebrate Communities of
Coastal Plain Streams. CBWP-MANTA-TR-95-2. Maryland Department of Natural Resources,
Chesapeake Bay Research and Monitoring Division, Annapolis, MD.
Gerritsen, J., M.T. Barbour, and K. King. 2000. Apples, oranges, and ecoregions: on determining pattern
in aquatic assemblages. Journal of the North American Benthological Society 19(3): 487-496.
Gibson, G.R., M.T. Barbour, J.B. Stribling, J. Gerritsen, and J.R. Karr. 1996. Biological Criteria:
Technical Guidance for Streams and Small Rivers. EPA 822-B-96-001. U.S. Environmental
Protection Agency, Office of Water, Washington, DC.
Gilbert, R.O. 1987. Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold
Company, New York, NY.
Gulland, J.A. 1983. Fish Stock Assessment: A Manual of Basic Methods. FAO/Wiley Series, Vol. 1.
Wiley & Sons, New York, NY.
Hawkins, C.P. 2006. Quantifying biological integrity by taxonomic completeness: evaluation of a
potential indicator for use in regional- and global-scale assessments. Ecological Applications
16: 1277-1294.
Hawkins, C.P., J.L. Kershner, P.A. Bisson, M.D. Bryant, L.M. Decker, S.V. Gregory, D.A. McCullough,
C.K. Overton, G.H. Reeves, R.J. Steedman, and M.K. Young. 1993. A hierarchical approach to
classifying stream habitat features. Fisheries 18(6): 3-12.
Hawkins, C.P., R.H. Norris, J. Gerritsen, R.M. Hughes, S.K. Jackson, R.K. Johnson, and R.J. Stevenson.
2000a. Evaluation of the use of landscape classifications for the prediction of freshwater biota:
synthesis and recommendations. Journal of the North American Benthological Society 19(3):
541-556.
Hawkins, C.P., R.H. Norris, J.N. Hogue, and J.W. Feminella. 2000b. Development and evaluation of
predictive models for measuring the biological integrity of streams. Ecological Applications
10(5): 1456-1477.
Hayes, M.L. 1983. Active Fish Capture Methods. In Fisheries Techniques, ed. L.A. Nielsen and D.L.
Johnson, pp. 123-145. American Fisheries Society, Bethesda, MD.
Helsel, D.R., and R.M. Hirsch. 2002. Statistical Methods in Water Resources. Book 4, Chapter A3 in
Techniques of Water-Resources Investigations. U.S. Geological Survey, Reston, VA. Accessed
February 10, 2016. http://pubs.usgs.gov/twri/twri4a3/.
Hill, B.H., A.T. Herlihy, P.R. Kaufmann, R.J. Stevenson, F.H. McCormick, and C.B. Johnson. 2000. Use
of periphyton assemblage data as an index of biotic integrity. Journal of the North American
Benthological Society 19: 50-67.
Hill, B.H., A.T. Herlihy, P.R. Kaufmann, S.J. Decelles, and M.A. Vander Borgh. 2003. Assessment of
streams of the eastern United States using a periphyton index of biotic integrity. Ecological
Indicators 2: 325-338.
Hubert, W.A. 1983. Passive Capture Techniques. In Fisheries Techniques, ed. L.A. Nielsen and D.L.
Johnson, pp. 95-122. American Fisheries Society, Bethesda, MD.
Hughes, R.M. 1995. Defining Acceptable Biological Status by Comparing with Reference Conditions. In
Biological Assessment and Criteria: Tools for Water Resource Planning and Decision Making,
ed. W.S. Davis and T.P. Simons, pp. 31-47. Lewis Publishers. Boca Raton, FL.
Hughes, R.M., D.P. Larsen, and J.M. Omernik. 1986. Regional reference sites: a method for assessing
stream potentials. Environmental Management 10: 629-635.
Hughes, R.M., P.R. Kaufmann, A.T. Herlihy, T.M. Kincaid, L. Reynolds, and D.P. Larsen. 1998. A
process for developing and evaluating indices of fish assemblage integrity. Canadian Journal of
Fisheries and Aquatic Sciences 55: 1618-1631.
Hurlbert, S.H. 1984. Pseudoreplication and the design of ecological field experiments. Ecological
Monographs 54(2): 187-211.
Jerald, A., Jr. 1983. Age Determination. In Fisheries Techniques, ed. L.A. Nielsen and D.L. Johnson, pp.
301-324. American Fisheries Society, Bethesda, MD.
Karr, J.R. 1991. Biological integrity: a long neglected aspect of water resource management. Ecological
Applications 1: 66-84.
Karr, J.R., and D.R. Dudley. 1981. Ecological perspective on water quality goals. Environmental
Management 5: 55-68.
Karr, J.R., K.D. Fausch, P.L. Angermeier, P.R. Yant, and I.J. Schlosser. 1986. Assessing Biological
Integrity in Running Waters: A Method and its Rationale. Special publication 5. Illinois Natural
History Survey, Champaign, IL.
Klemm, D.J., P.A. Lewis, F. Fulk, and J.M. Lazorchak. 1990. Macroinvertebrate Field and Laboratory
Methods for Evaluating the Biological Integrity of Surface Waters. EPA-600-4-90-030.
U.S. Environmental Protection Agency, Environmental Monitoring and Support Laboratory,
Cincinnati, OH.
Klemm, D.J., Q.J. Stober, and J.M. Lazorchak. 1992. Fish Field and Laboratory Methods for Evaluating
the Biological Integrity of Surface Waters. EPA/600/R-92/111. U.S. Environmental Protection
Agency, Environmental Monitoring and Support Laboratory, Cincinnati, OH.
Klemm, D.J., J.M. Lazorchak, and P.A. Lewis. 1998. Benthic Macroinvertebrates. In Environmental
Monitoring and Assessment Program—Surface Waters: Field Operations and Methods for
Measuring the Ecological Condition of Wadeable Streams, ed. J.M. Lazorchak, D.J. Klemm, and
D.V. Peck, pp. 147-182. EPA/620/R-94/004F. U.S. Environmental Protection Agency,
Washington, DC.
Larsen, D.P. 1997. Sample survey design issues for bioassessment of inland aquatic ecosystems. Human
and Ecological Risk Assessment 3(6): 979-991.
McCormick, F.H., R.M. Hughes, P.R. Kaufmann, D.V. Peck, J.L. Stoddard, and A.T. Herlihy. 2001.
Development of an index of biotic integrity for the mid-Atlantic highlands region. Transactions
of the American Fisheries Society 130: 857-877.
Meador, M.R., T.F. Cuffney, and M.E. Gurtz. 1993. Methods for Sampling Fish Communities as Part of
the National Water Assessment Program. Open File Report 93-104. U.S. Geological Survey,
Raleigh, NC.
Merritt, R.W., K.W. Cummins, and M.B. Berg, ed. 2008. An Introduction to the Aquatic Insects of North
America. 4th ed. Kendall/Hunt Publishing Company, Dubuque, IA. ISBN 978-0-7575-5049-2.
Millard, C.J., N. Jokay, C. Wharton, and J.B. Stribling. 2011. Ecological Assessment of the Lake
Allatoona/Upper Etowah River Watershed - Year 5 of the Long-Term Monitoring
Program, 2009-2010. Prepared for Lake Allatoona/Upper Etowah River Watershed Partnership,
Canton, GA, by Tetra Tech, Inc., Owings Mills, MD, and Atlanta, GA.
Moulton, S.R. II, J.G. Kennen, R.M. Goldstein, and J.A. Hambrook. 2002. Revised Protocols for
Sampling Algal, Invertebrate, and Fish Communities as Part of the National Water-Quality
Assessment Program. Open File Report 02-150. U.S. Geological Survey, Reston, VA.
Nelson, J.S., E.J. Grossman, H. Espinosa-Perez, L.T. Findley, C.R. Gilbert, R.N. Lea, and J.D. Williams.
2004. Common and Scientific Names of Fishes from the United States, Canada, and Mexico.
Special Publication 29. American Fisheries Society, Bethesda, MD.
OEPA (Ohio Environmental Protection Agency). 1987. Biological Criteria for the Protection of Aquatic
Life: Volumes I-III. Updated 1988, 1989, and 2015. Ohio Environmental Protection Agency,
Division of Water Quality Monitoring and Assessment, Surface Water Section, Columbus, OH.
Accessed February 10, 2016.
http://www.epa.state.oh.us/dsw/bioassess/BioCriteriaProtAqLife.aspx.
Olsen, A.R., and D.V. Peck. 2008. Survey design and extent estimates for the Wadeable Streams
Assessment. Journal of the North American Benthological Society 27(4): 822-836.
Plafkin, J.L., M.T. Barbour, K.D. Porter, S.K. Gross, and R.M. Hughes. 1989. Rapid Bioassessment
Protocols for Use in Streams and Rivers: Benthic Macroinvertebrates and Fish. EPA/440/4-89-
001. U.S. Environmental Protection Agency, Office of Water, Washington, DC.
Platts, W.S., W.F. Megahan, and G.W. Minshall. 1983. Methods for Evaluating Stream, Riparian, and
Biotic Conditions. General Technical Report INT-138. U.S. Department of Agriculture, Forest
Service, Ogden, UT.
Raven, P.J., N.T.H. Holmes, F.H. Dawson, P.J.A. Fox, M. Everard, I.R. Fozzard, and K.J. Rowen. 1998.
River Habitat Quality: the physical character of rivers and streams in the UK and Isle of Man.
Environment Agency, Bristol, England. ISBN 1 873760 42 9.
Richards, C., and G.W. Minshall. 1992. Spatial and temporal trends in stream macroinvertebrate
communities: the influence of catchment disturbance. Hydrobiologia 241(3): 173-184.
Skalski, J.R., and D.H. McKenzie. 1982. A design for aquatic monitoring programs. Journal of
Environmental Management 14: 237-251.
Southwood, T.R.E. 1977. Habitat, the templet for ecological strategies? Journal of Animal Ecology 46:
337-365.
Stark, J.D. 1993. Performance of the macroinvertebrate community index: effects of sampling method,
sample replication, water depth, current velocity, and substratum on index values. New Zealand
Journal of Marine and Freshwater Research 27: 463-478.
Stevens, D.L., and A.R. Olsen. 2004. Spatially balanced sampling of natural resources. Journal of the
American Statistical Association 99(465): 262-278.
Stoddard, J. L., D.P. Larsen, C.P. Hawkins, R.K. Johnson, and R.H. Norris. 2006. Setting expectations for
the ecological condition of streams: the concept of reference condition. Ecological Applications
16(4): 1267-1276.
Stribling, J.B. 2011. Partitioning Error Sources for Quality Control and Comparability Analysis in
Biological Monitoring and Assessment. Chapter 4 in Modern Approaches to Quality Control, ed.
A.B. Eldin, pp. 59-84. INTECH Open-Access Publisher. ISBN 978-953-307-971-4. doi:
10.5772/22388. Accessed February 10, 2016.
http://www.intechopen.com/books/modern-approaches-to-quality-control/partitioning-error-sources-for-quality-control-and-comparability-analysis-in-biological-monitoring-a.
Stribling, J.B., B.K. Jessup, and D.L. Feldman. 2008a. Precision of benthic macroinvertebrate indicators
of stream condition in Montana. Journal of the North American Benthological Society 27(1): 58-
67.
Stribling, J.B., K.L. Pavlik, S.M. Holdsworth, and E.W. Leppo. 2008b. Data quality, performance, and
uncertainty in taxonomic identification for biological assessments. Journal of the North American
Benthological Society 27(4): 906-919.
Teply, M., and L. Bahls. 2005. Diatom Biocriteria for Montana Streams. Prepared for Montana
Department of Environmental Quality, by Larix Systems, Inc., Helena, MT. 2005.
Teply, M., and L. Bahls. 2007. Statistical Evaluation of Periphyton Samples from Montana Reference
Streams. Prepared for Montana Department of Environmental Quality, by Larix Systems, Inc.,
Helena, MT. 2007.
Thornton, K.W., B.L. Kimmel, and F.E. Payne, ed. 1990. Reservoir Limnology: Ecological Perspectives.
John Wiley and Sons, Inc., New York, NY.
Thorp, J.H., and A.P. Covich, ed. 2010. Ecology and Classification of North American Freshwater
Invertebrates. Academic Press (Elsevier).
Urquhart, N. S., S. G. Paulsen, and D. P. Larsen. 1998. Monitoring for policy-relevant regional trends
over time. Ecological Applications 8(2): 246-257.
USEPA (U.S. Environmental Protection Agency). 2002. Summary of Biological Assessment Programs
and Biocriteria Development for States, Tribes, Territories, and Interstate Commissions: Streams
and Wadeable Rivers. EPA-822-R-02-048. U.S. Environmental Protection Agency, Office of
Environmental Information and Office of Water, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2004. Wadeable Stream Assessment: Benthic
Laboratory Methods. EPA 841-B-04-007. U.S. Environmental Protection Agency, Office of Water
and Office of Research and Development, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2009. National Rivers and Streams Assessment:
Field Operations Manual. EPA-841-B-07-009. U.S. Environmental Protection Agency,
Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2011. A Primer on Using Biological Assessments to
Support Water Quality Management. EPA-810-R-11-01. U.S. Environmental Protection Agency,
Office of Science and Technology, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2012. National Rivers and Streams Assessment
2013-2014: Laboratory Operations Manual. EPA-841-B-12-010. U.S. Environmental Protection
Agency, Office of Water, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2013. Biological Assessment Program Review:
Assessing Level of Technical Rigor to Support Water Quality Management. EPA-820-R-13-001.
U.S. Environmental Protection Agency, Office of Science and Technology, Washington, DC.
USFWS (U.S. Fish and Wildlife Service). 1991. Principles and Techniques of Electrofishing. U.S. Fish
and Wildlife Service, Fisheries Academy, Office of Technical Fisheries Training,
Kearneysville, WV.
Van Sickle, J. 2008. An index of compositional dissimilarity between observed and expected
assemblages. Journal of the North American Benthological Society 27(2): 227-235.
Van Sickle, J., and S.G. Paulsen. 2008. Assessing the attributable risks, relative risks, and regional extents
of aquatic stressors. Journal of the North American Benthological Society 27(4): 920-930.
Vieira, N.K.M., N.L. Poff, D.M. Carlisle, S.R. Moulton II, M.K. Koski, and B.C. Kondratieff. 2006. A
Database of Lotic Invertebrate Traits for North America. Data Series 187. U.S. Geological
Survey, Reston, VA. Accessed February 10, 2016. http://pubs.water.usgs.gov/ds187.
Weatherley, A.H. 1972. Growth and Ecology of Fish Populations. Academic Press, New York, NY.
Wetzel, R.G. 1983. Limnology. Saunders College Publishing, New York, NY.
Wright, J.F. 1995. Development and use of a system for predicting the macroinvertebrate fauna in
flowing waters. Australian Journal of Ecology 20: 181-197.
Zuellig, R.E., D.M. Carlisle, M.R. Meador, and M. Potapova. 2012. Variance partitioning of stream
diatom, fish, and invertebrate indicators of biological condition. Freshwater Science 31(1): 182-
190. doi: 10.1899/11-040.1.
Monitoring and Evaluating Nonpoint Source Watershed Projects
Chapter 5
5 Photo-Point Monitoring
By S.A. Dressing and D.W. Meals
5.1 Introduction
Good photographs can yield much information, and photography can play important roles in watershed
projects because the technology is available to everyone, training is simple, and the cost is relatively low
(USEPA 2008, ERS 2010). In the assessment phase, photos can help identify problem areas both within
the water resource (e.g., algal blooms, streambank erosion) and within the drainage area (e.g., cattle in
streams, discharge pipes). These same photos can also be very helpful in generating interest in the project
because they can convey easily understood information to a wide audience. In addition, photos can be
used to document implementation of practices including contour strip-cropping, stream buffers, rain
gardens, and other practices where physical changes are observable. Finally, photos can be used in project
evaluation. For example, photos taken before and after implementation of some types of remedial efforts
(e.g., trash removal and prevention) provide an indicator of progress that can be communicated easily to
most people.
To be useful, however, photographs must be taken in accordance with a protocol that ensures the
photographic database accurately represents watershed conditions and is suitable for meeting stated
objectives. This section provides an overview of ground-based photographic, or photo-point, monitoring,
including specific elements of an acceptable protocol and example applications.
5.2 Procedure
Photo-point monitoring requires careful planning to
ensure that meaningful information is provided to
assess condition or trends (Bauer and Burton 1993).
Monitoring design begins with a set of clear objectives,
and different objectives will generally require different
photo points (Hamilton, n.d.).
There are two basic methods of photo-point monitoring
- comparison photography and repeat photography -
but these methods can be used in combination
(i.e., comparison photography repeated over time).
Method selection should generally precede other design
decisions but choices made in one step of monitoring
plan design can affect the options in other steps, so
flexibility is necessary. Selection of monitoring areas,
identification of the specific features to photograph,
camera placement, and the timing and frequency of
photography are all typically determined after
monitoring objectives and basic method are addressed
(after Hamilton n.d.).
Photo-Point Monitoring Steps
• Set objectives
• Select method
• Select monitoring areas
• Establish, mark, and assign identification numbers to photo and camera points
• Identify a witness site
• Record site information and create a site locator field book
• Determine timing and frequency of photographs
• Define data analysis plans
• Establish data management system
• Take and document photos
All photo and camera locations must be marked, monitoring site characteristics must be recorded, and a
field book or similar documentation should be created to assist those taking photographs at the sites over
time. This is critical if different people will be taking photographs throughout the course of a project.
Plans for analysis of the photos and use of any photo-derived information must be determined and
documented before photo-point monitoring begins. The data analysis plan will also help determine how
best to organize and file photos and metadata.
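One way to keep photos and metadata organized is to store a small record with each photograph. The sketch below is only an assumption about what such a record might hold: the field names, identifier scheme, and file naming pattern are hypothetical and should be adapted to the project's own data analysis plan.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PhotoRecord:
    """Metadata for one photograph (all field names are illustrative)."""
    site_id: str          # monitoring area identifier
    camera_point: str     # marked location where the camera is placed
    photo_point: str      # marked feature being photographed
    taken_on: date
    photographer: str
    objective: str        # monitoring objective the photo supports
    notes: str = ""

    def filename(self) -> str:
        """Build a sortable file name from the identifiers and the date."""
        return f"{self.site_id}_{self.camera_point}_{self.photo_point}_{self.taken_on.isoformat()}.jpg"


def group_by_photo_point(records):
    """Group records by (site, camera point, photo point), oldest first,
    so that repeat photographs of the same view can be compared over time."""
    series = {}
    for rec in sorted(records, key=lambda r: r.taken_on):
        key = (rec.site_id, rec.camera_point, rec.photo_point)
        series.setdefault(key, []).append(rec)
    return series


# Hypothetical records taken by two different people at the same marked points.
records = [
    PhotoRecord("S01", "C1", "P1", date(2015, 6, 1), "Observer A", "riparian tree growth"),
    PhotoRecord("S01", "C1", "P1", date(2014, 6, 1), "Observer B", "riparian tree growth"),
]
for key, photos in group_by_photo_point(records).items():
    print(key, [p.filename() for p in photos])
```

Keeping the camera point and photo point identifiers in both the record and the file name makes it easier for different photographers to file their photos consistently.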
These various design steps are described in greater detail below.
5.2.1 Setting Objectives
There will most likely be different photo-point monitoring objectives for project assessment, planning,
implementation, and evaluation. Objectives for all project phases should be defined as early on in the
project as possible, however, to maximize the efficiency of the photo-point monitoring effort. It may be
possible, for example, to use photos from problem assessment or planning as pre-implementation photos
for tracking implementation.
Realistic objectives begin with an understanding of what is likely to be seen and measured with
photographs. Cameras exist that can take photos in both the visible and the non-visible spectrum
(e.g., infrared or ultraviolet). For example, aerial photography has been used successfully to identify
sediment sources at the watershed scale through correlation of photo density readings from the
transparencies of color-infrared photographs with suspended sediment measurements (Rosgen 1973,
1976). In addition, Hively et al. (2009a, 2009b) combined cost-share program enrollment data with
satellite imagery and on-farm sampling to evaluate cover crop N uptake on 136 fields within the
Choptank River watershed on Maryland's eastern shore. Thermal infrared (TIR) images acquired from
airborne platforms have been used in stream temperature monitoring and analysis programs, detecting and
quantifying warm and cool water sources, calibrating stream temperature models, and identifying thermal
processes (Faux et al. 2001). TIR imagery has also been used in the mapping of groundwater inflows and
the analysis of floodplain hydrology. While such applications are indeed useful, this guidance and the
example objectives that follow focus solely on ground-based photography in the visible spectrum.
An array of observable features listed in various guidance documents includes pasture condition, livestock
distribution in a meadow, ground cover, tree canopy and health, vegetation density, woody vegetation,
native vegetation area, wetland area, native plant richness, large trees, stream profile, streambank
stability, streambank cover, fallen woody material and in-stream habitat, farm water flow, gully erosion,
hill slope erosion, wind erosion, weed cover and species (Bauer and Burton 1993, ERS 2010, Hall 2001,
Shaff et al. 2007). In addition, Hall (2001) provides numerous examples of successes and failures to
measure changes in observable features with photo-point monitoring.
Examples of potential objectives for photo-point monitoring at various project stages include the
following.
Assessment
• Document trash levels on beaches or in urban settings
• Document stream features
• Document algal blooms in waterbodies
• Identify sources of sediment plumes
• Document livestock activity near waterbodies
• Identify gullies and areas of streambank instability
• Identify areas in greatest need of urban runoff control measures
Planning
• Help locate areas where streambank protection and stream restoration are needed
• Document livestock operation needs to assist in budget development
• Provide evidence of watershed problems and potential solutions for public outreach
• Provide photos to assist the design of urban runoff control measures
Implementation
• Document tree growth in riparian zone over time
• Document implementation of rain gardens
• Document stream restoration activities
• Document and track changes in percent residue at representative agricultural sites across a
watershed
Evaluation
• Document changes in streambank cover or stream profile as a result of stream restoration
• Demonstrate the effects of different grazing management systems on pasture condition
• Illustrate how a stream handles high-flow events before and after restoration
• Document changes in beach trash over time
The type and rigor of photo-point monitoring needed to meet these objectives vary. Alternative methods
are described below.
5.2.2 Selecting Methods
As defined by Hall (2001), ground-based photo monitoring involves "using photographs taken at a
specific site to monitor conditions or change," something that is accomplished by one of two methods:
comparison or repeat photography. Comparison photography typically involves the creation of a photo
guide from a set of standard photos taken to represent the expected range of an attribute (or condition) of
interest (e.g., utilization of grazing plants). Field measurements are taken to establish values for the
attribute of interest at levels represented by each of the photos in the guide. Figure 5-1 illustrates the
concept whereby the value (percentage of area covered with dots) is determined from field measurement
of the attribute of interest (dots/unit area in this conceptual example). The comparison photos in the guide
are then used in the field to perform on-site assessment.
Figure 5-1. Comparison photos
In repeat photography, photos are taken of the subject over time at the same location to document change
or monitor activity. Repeat photography has been used to document landscape change, including the
advance and retreat of glaciers (Key et al. 2002). This method has also been used extensively to document
progress in dam removal (USDA-FS 2007), riparian area protection (Bauer and Burton 1993), and stream
restoration projects (Bledsoe and Meyer 2005).
A third type of photography is opportunistic photography. As described by Shaff et al. (2007),
opportunistic photos are not taken from a permanently marked location, and they are not part of a repeat
photography effort. There is also no photo guide as is used in comparison photography. Examples of
subjects that can be addressed with opportunistic photography include a site during construction or an
area after a significant natural or human-induced event.
Comparison photography is generally well suited to meeting assessment objectives in cases where
photography is an appropriate monitoring approach. Opportunistic photography also usually plays a role
in problem assessment. Both methods can be used for qualitative purposes, and comparison photography
can be used in quantitative analyses to a limited degree (see "Qualitative" and "Quantitative" below).
Opportunistic photography is not designed for quantitative analyses, however. Other information sources
(e.g., livestock inventories, street maps, and permitted discharge reports) and monitoring data (e.g., water
chemistry, aquatic biology, and habitat) will be needed in combination with photos to meet assessment
objectives.
A combination of comparison and opportunistic photography can be helpful in achieving planning
objectives, coupled with information from other sources. Opportunistic photos, in particular, can be quite
helpful in communicating to the general public and stakeholders the need for restoration or BMPs to
achieve watershed objectives. Visual inventories can be helpful in estimating implementation costs but
should be used in combination with more traditional approaches to assessing need.
Repeat photography is generally most useful for tracking restoration and implementation of BMPs.
Comparison photos can be used to assess such important indicators as the extent to which conservation
tillage has resulted in increased percent residue. Opportunistic photos can help show how restored stream
reaches or urban runoff practices handle high-flow events.
While photo-point monitoring can be very helpful, it should be kept in mind that tracking implementation
of rain gardens, for example, does not require photos. Observers could simply record in a database that a
rain garden has been implemented at a specific address or global positioning system (GPS) location, but a
photograph might add valuable information about the rain garden (e.g., size, location, plant selection and
density) that could be explored at a later date if water quality data raise questions about rain garden
performance.
Watershed projects cannot rely on photographs as the sole source of information for problem assessment
or planning. Project implementation is nearly always tracked by means other than photo-point monitoring,
but the addition of photographs can be the best way to document the installation of structural practices
(e.g., lagoons, constructed wetlands) or the growth of vegetation associated with stream restoration or
grazing management. It is important to keep in mind that photo-point monitoring is best treated as a
cost-effective tool for providing information in conjunction with other monitoring and
information gathering efforts. Although photo-point monitoring has sometimes been relied on as the
primary monitoring method because of budgetary constraints, that practice is not recommended.
All three photo-point monitoring methods - comparison, repeat, and opportunistic - can support
qualitative analyses, and comparison and repeat photography can also be used in quantitative analyses.
5.2.2.1 Qualitative Monitoring
Photographic monitoring methods usually generate qualitative information (e.g., Shaff et al. 2007).
Creating a pictorial record of changing conditions, showing major changes in shrub and tree populations,
visually representing physical measurements taken at a location, or recording particular events such as
floods are typical of the types of photo-point monitoring objectives stated for these projects (ERS 2010).
Those who have used photographic monitoring for watershed projects have generally used this method to
document implementation of practices, typically the growth of vegetation associated with
stream/streambank restoration or grazing management. These qualitative findings have been used most
frequently to corroborate findings from more quantitative monitoring methods.
Photos are recommended for long-term monitoring of grassland, shrubland, and savanna ecosystems but
simply as a qualitative indicator of large changes in vegetation structure and for visually documenting
changes measured with other methods (Herrick et al. 2005, 2005a). Photos should not be considered as a
substitute for quantitative data; it is very difficult to obtain reliable quantitative data from photos unless
conditions are controlled. Bledsoe and Meyer (2005) used photographs to compare changes from year to
year, document noteworthy morphologic adjustments, document features of interest at various locations
and times during the year, and analyze vegetation establishment as part of monitoring channel stability.
5.2.2.2 Quantitative Monitoring
Quantitative monitoring involves either measurement or counting. When measurement is desired it is
important to use meter boards (field rulers mounted vertically) or other size control boards to provide a
reference for measurement (Hall 2001, 2002). Small frames (1 m²) have been used for closeup or plot
studies, while meter boards and Robel poles are often used for more distant studies. These standard
references are captured within the photographs to provide a means of measuring features of interest.
Counts of items of interest (e.g., trees of varying heights) can be obtained through visual observation of
images. Another alternative for obtaining counts or percentages for quantitative analysis is to count digital
image pixels that fall within a specified color range (see Digital Image Analysis below).
Meter boards can also provide a consistent point for camera orientation and a point on which to focus the
camera (Hamilton n.d.). Figure 5-2 illustrates the use of a meter board and photo identification card (see
section 5.2.13). The following are methods described by Hall (2001) that incorporate varying degrees of
quantitative analysis. It should be noted that while these methods all support some level of quantification,
documentation of precision and accuracy is generally lacking.
Figure 5-2. Illustration of a photo identification card and a meter board. Fields shown on the card
include photo point number, camera point ID, and photographer.
5.2.2.2.1 Photo Grid Analysis
Photo grid analysis involves placing a standardized grid over a photo and counting the number of
intersects between the grid lines and features of interest. When photo grid analysis is planned, it is very
important that the distance between the camera and meter board be constant (Hall 2001, 2002). It is
recommended that the camera height be held constant, but this is required only if the grid is
used to track position (in addition to size) of features over time. The size control board should cover at
least 25 percent of the photo height, with the optimum range being 35 to 50 percent. The board, however,
cannot obstruct the features of interest that will be measured. A level meter control board is preferred
because it will match up more easily with a superimposed grid. Vegetation around the front of the meter
board should be removed to expose the bottom measurement line to provide maximum precision in grid
adjustment.
Hall (2001, 2002) notes that both grid precision and observer variability are major factors in determining
the ability to measure change. The percentage of photo height taken by the meter board is a very
important factor in the precision with which grids are fit. For example, testing showed that a
meter board that covers 35 percent of the photo height was 1.3 times more precise than a board that
covered 25 percent of the photo height. Testing on observer variability also indicated that, on average, a
change >12 percent in intersects for all shrubs (a measurement for grid analysis) was needed to
demonstrate change at the 5 percent significance level. It should be noted that changes in technology
(cameras and software) since Hall's testing may provide better results. Additional details and examples
of photo grid analysis are provided by Hall (2001, 2002).
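The counting step can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not Hall's procedure: it assumes an observer has already outlined the feature of interest as a boolean mask (True where the feature appears in the photo), it treats each crossing point of the grid lines as one potential intersect, and all function names and the pixel spacing are hypothetical. The 12 percent and 25 percent values come from the text above.

```python
import numpy as np

def count_grid_intersects(feature_mask, spacing_px):
    """Count grid crossing points that fall on the feature of interest.

    feature_mask: 2-D boolean array, True where the feature was outlined.
    spacing_px: grid spacing in pixels (an assumed, user-chosen value).
    """
    mask = np.asarray(feature_mask, dtype=bool)
    rows = np.arange(0, mask.shape[0], spacing_px)
    cols = np.arange(0, mask.shape[1], spacing_px)
    return int(mask[np.ix_(rows, cols)].sum())

def board_fraction(board_height_px, photo_height_px):
    """Fraction of the photo height occupied by the size control board."""
    return board_height_px / photo_height_px

def percent_change(before, after):
    """Percent change in intersect counts between two photo dates."""
    return 100.0 * (after - before) / before

# Hypothetical masks for the same photo point in two different years.
year1 = np.zeros((100, 100), dtype=bool)
year1[:50, :] = True
year2 = np.zeros((100, 100), dtype=bool)
year2[:60, :] = True

n1 = count_grid_intersects(year1, spacing_px=10)
n2 = count_grid_intersects(year2, spacing_px=10)
change = percent_change(n1, n2)

# Compare against the observer-variability figure reported in the text.
exceeds_observer_variability = abs(change) > 12.0
# Check the minimum board coverage recommended in the text.
board_ok = board_fraction(350, 1000) >= 0.25
```

Keeping the grid spacing and the camera-to-board distance the same for every photo date is what makes the two counts comparable.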
5.2.2.2.2 Transect Photo Sampling
Photo points can also be established along a transect to obtain more quantitative information (Hamilton
n.d.). Hall (2001) describes in detail five kinds of photo transects: (1) 1-ft² frequency photographed with
or without a stereo attachment on the camera, (2) nested frequency using four plot sizes in a 0.5- by 0.5-m
frame, (3) 1-m² plot frame photographed at an angle, (4) vertical photographs of tree canopy cover, and
(5) measurement of herbaceous stubble height using the Robel pole system.
Transect installation is straightforward, requiring skillsets and procedures similar to those for the
establishment of photo-point and camera sites (see sections 5.2.4 and 5.2.5). Equipment needs are similar
as well. Size control boards are required, and they can serve multiple purposes, including estimation of
height of grass and shrubs, orientation (for consistency) and focus (for greatest depth of field) of the
camera, and grid analysis (Hall 2001). Key features of the five kinds of photo transects are provided
below, but the reader should review the detailed discussion of each by Hall before selecting any of
these methods.
5.2.2.2.2.1 One-Square-Foot Sampling
This method uses a 1-ft² plot placed every 5 ft along a 100-ft transect. The 20 plots are monitored to
document changes in species, species density, and frequency as a means to estimate change in vegetation
and soil surface conditions. Statistical analysis of data generated by this method is not possible.
5.2.2.2.2.2 Nested Frequency
This method uses a sample frame with four nested plot sizes to document change in species frequency
along five 100-ft transects of 20 plots each. Statistical analysis suggests significant change in frequency
(the number of times a species occurs in a given number of plots) at the 80-percent level of probability.
5.2.2.2.2.3 Nine-Square-Foot Transects
This plot system uses five 9-ft² plots along a 100-ft transect to document changes in species frequency.
Photographs are taken of the plot frame at an oblique angle rather than from directly above. Interpretation
of change is based not on statistical analysis but on professional judgment and interpretation of the
photos.
5.2.2.2.2.4 Tree Canopy
It is recommended that any transect placed in a forest setting should have tree cover sampled because of
its effects on the density and composition of ground vegetation. Tree canopies are photographed from
ground level by using a camera leveling board or other means to ensure that the camera is pointing
directly above. The method requires photographs of tree cover at the 0-, 25-, 50-, 75-, and 100-ft locations
on transects used for any of the three methods described above. Because photo grid analysis is used to
estimate tree cover, the same focal length must be used for all photos and the long axis of the camera
should be perpendicular to the transect.
5.2.2.2.2.5 Robel Pole
A Robel pole is a 4-ft pole with 1-in bands painted in alternating colors (USDA-CES et al. 1999).
Vegetation height is measured by photographing the pole from a specific distance and height above the
ground. This is accomplished by attaching a 4-m-long line between the 1-m mark on the Robel pole and
the top of a 1-m-tall line pole. The Robel pole is placed at the sample location and the line is stretched
out. The camera is set on top of the line pole and a photo is taken. By consistently using the 4-m line and
1-m camera height (4-to-1 ratio), the same angle is obtained for all photos.
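The fixed geometry can be checked with a few lines of Python. This is a sketch that assumes level ground, so that the 4-m line is horizontal and the camera sits 1 m above the base of the Robel pole; the function name is hypothetical.

```python
import math

def camera_angle_deg(line_length_m=4.0, camera_height_m=1.0):
    """Angle below horizontal from the camera to the base of the Robel pole.

    Assumes level ground, so the line between the poles is horizontal.
    """
    return math.degrees(math.atan2(camera_height_m, line_length_m))

angle = camera_angle_deg()                  # 4-m line, 1-m camera height
same_ratio_angle = camera_angle_deg(8.0, 2.0)  # same 4-to-1 ratio
```

Under these assumptions the viewing angle is about 14 degrees, and any setup that keeps the same ratio of line length to camera height gives the same angle.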
5.2.2.2.3 Digital Image Analysis
Many of the methods described by Hall (2001) were centered on film-based photography, and they often
require a substantial amount of measurement and analysis by hand. Newer methods such as digital image
analysis (DIA) use computers to analyze digital images, offering the potential advantages of improved
objectivity, accuracy, and precision. In one form of DIA, color images are converted to grayscale
(monochrome) images using an algorithm that converts each pixel to white or black based on the color
content of the original pixel. The algorithm in this case is designed to select those colors that represent the
feature to be counted. For example, Rasmussen et al. (2007) used DIA to determine the proportion of
pixels in digital images that were green to estimate crop soil cover in weed harrowing research.
There are significant hurdles to overcome in applying DIA to photo-point monitoring for watershed
projects. Factors such as lighting, camera angle, size of the area photographed, and the growth stage of
plants should be evaluated to quantify their effects on the accuracy or precision of the method
(Rasmussen et al. 2007). It is also important to have a true value to compare against the DIA-based results
to assess the accuracy of the method (Richardson et al. 2001).
A significant contribution to DIA made by Rasmussen et al. (2007) was automated determination of the
gray-level threshold which defines the difference between vegetation (the subject of interest in their
study) and non-vegetation. This is especially important when lighting conditions vary in the field. With
this capability, the researchers were able to develop an automated DIA procedure for converting each
digital image into a single leaf cover (proportion of pixels that are green) value for analysis. Their
research used the MATLAB Image Processing Toolbox (MathWorks 2012) but other options include
Mathematica (Wolfram 2012) and a wide range of image processing products developed for a large
number of applications.
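The general idea of reducing a digital image to a single cover value can be sketched in Python with NumPy. This is not the procedure of Rasmussen et al. (2007): the excess-green index (2g - r - b on normalized color channels) and Otsu's method for choosing the threshold are stand-in techniques chosen for illustration, and all function names are hypothetical.

```python
import numpy as np

def excess_green(rgb):
    """Greenness index 2g - r - b computed on normalized color channels."""
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=2)
    total[total == 0] = 1.0  # avoid division by zero for black pixels
    r, g, b = (rgb[..., i] / total for i in range(3))
    return 2 * g - r - b

def otsu_threshold(values, bins=256):
    """Pick the threshold that maximizes between-class variance (Otsu)."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)          # pixel count at or below each bin
    w1 = w0[-1] - w0              # pixel count above each bin
    s0 = np.cumsum(hist * centers)
    mu0 = s0 / np.maximum(w0, 1)
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2
    return centers[np.argmax(between)]

def green_cover_fraction(rgb):
    """Proportion of pixels classified as green vegetation."""
    index = excess_green(rgb)
    threshold = otsu_threshold(index.ravel())
    return float((index > threshold).mean())

# Hypothetical 10 x 10 image: 3 columns of green plants, 7 of bare soil.
image = np.empty((10, 10, 3), dtype=np.uint8)
image[:, :3] = (30, 200, 30)     # green vegetation
image[:, 3:] = (120, 100, 80)    # brownish soil
cover = green_cover_fraction(image)
```

Because the threshold is recomputed for each image, a sketch like this adapts to some variation in lighting, but its accuracy would still need to be checked against field measurements as discussed above.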
5.2.3 Selecting Areas to Monitor
The areas selected for photo-point monitoring must be appropriate for the stated objectives and consistent
with the data analysis plans (section 5.2.11). Depending on the monitoring objectives, suitable sampling
locations may be chosen to represent average or extreme conditions.
For problem assessment where opportunistic photography is used, site selection may be similar to that
employed in a synoptic survey for water quality monitoring. Photos may be taken by individuals walking
the stream to identify areas of streambank erosion or point source discharges. Photography of sources
could involve a windshield-survey approach where photos are taken on a pre-determined route. Each
opportunistic photo would need to be properly labeled as described in section 5.2.13.
When tracking project implementation (e.g., BMPs, restoration) or evaluating project success, it is most
important to select an area that is most likely to undergo the physical transformations that can and must be
tracked in order to support these objectives. Hall (2001) notes that this task may be straightforward (e.g.,
measuring the impact of stream restoration on the segment restored) or somewhat more complicated (e.g.,
documenting the impacts of livestock grazing on riparian vegetation). The latter case is more complicated
because it requires some knowledge of livestock distribution, areas sensitive to grazing, and grazing
patterns. Because it is likely that only a portion of the area of interest can be monitored, it is important to
determine up front whether or not the findings can be extrapolated to areas not monitored. This is
particularly challenging for photo-point monitoring because statistical analysis of photo-based data is not
common. Attribution of sample findings to the broader area of interest would require that the sample is
representative, that the photos yield a measurable variable, that the distribution of that variable is
known, and that an estimate of the standard deviation is available.
Some may wish to use photo-point monitoring to track BMP-related information in support of a
traditional biological or chemical monitoring program. For example, if total suspended sediment
concentration or loads are monitored in a predominantly agricultural watershed, it may be useful to track
percent residue as an indicator of the extent to which reduced tillage practices have been implemented
across the watershed. This could be accomplished in a number of ways including photo-point monitoring
of a set of randomly selected field sites. Both comparison (to determine percent residue) and repeat (to
track changes in percent residue over time) photography would be used in this application (see section
5.2.2). Again, attribution of sample findings to the broader area would require that the samples are
representative, the distribution of percent residue is known, and an estimate of the standard deviation is
available.
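As a simple illustration of what such attribution involves, the Python sketch below estimates mean percent residue for a watershed from a set of randomly selected field sites. It assumes the sites are a representative random sample and that the sample mean is approximately normally distributed; the function name and the residue values are hypothetical, and a t-based interval would be more appropriate for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_with_interval(values, confidence=0.95):
    """Sample mean with an approximate (normal-theory) confidence interval."""
    n = len(values)
    m = mean(values)
    standard_error = stdev(values) / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return m, m - z * standard_error, m + z * standard_error

# Hypothetical percent residue estimated from comparison photos at 5 fields.
residue = [30, 40, 50, 60, 70]
estimate, lower, upper = mean_with_interval(residue)
```

The width of the interval shrinks as more field sites are photographed, which is one way to judge whether the number of sites is adequate.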
5.2.4 Identifying Photo Points
Photo points are defined somewhat differently in various guidance manuals, which can lead to confusion
when flipping back and forth between manuals. This document adopts the terminology used by Hall
(2001), in which the photo point is essentially what you point the camera at when you take the
photograph, and the camera point is a permanently marked location for the camera (Figure 5-3). Photo
points have also been defined as permanent or semi-permanent sites from which a series
of photographs is taken over time (ERS 2010). Despite the different definitions and intermingling of various
concepts within these definitions, photo-point monitoring manuals ultimately address the area to be
photographed, the location from which the photos are taken, and the camera direction and settings to
identify what will be captured in the photos. In simple terms, the photo point is what you point the camera
at when you take the photograph.
The area captured in each photo will depend on the monitoring objectives and is controlled by camera
settings and the distance between the camera location and the subject. Hall (2001) describes three general
types of photos, each of which has an associated scale:
• Landscape - distant scenes with areas generally greater than 10 ha
• General - specific topics monitored on areas 0.25 to 10 ha
• Closeup - specific topics on areas under 0.25 ha
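The scale categories above can be expressed as a small Python helper, for example when tagging photo records in a database. The function name is hypothetical, and the handling of areas exactly at 0.25 ha or 10 ha (both treated as general) is an assumption, because the source does not say which category owns the boundary values.

```python
def photo_type(area_ha):
    """Classify a photo by the area it covers, following Hall's three scales."""
    if area_ha < 0:
        raise ValueError("area must be non-negative")
    if area_ha > 10:
        return "landscape"
    if area_ha >= 0.25:
        return "general"
    return "closeup"

labels = [photo_type(a) for a in (0.1, 2.0, 25.0)]
```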
Figure 5-3. Photo illustrating photo points (A and B) and camera points (1 and 2). Photos of A and
B are taken from cameras located at 1 and 2.
Landscape photography generally requires a long-term commitment during which repeat photos are taken
as infrequently as every 20 years or so (Hall 2001). This timeframe is greater than typically encountered
in watershed projects. General and closeup monitoring will be more appropriate for most watershed-scale
projects. Hamilton (n.d.) states that general photography can be used to document an entire scene,
whereas topic (closeup) photography narrows the target down to specific elements or subjects in the
landscape.
Scale is also incorporated within the definitions of photo types found in other guidance documents. For
example, one scheme refers to spot, trayback (small truck with short, flat tray in back rather than a typical
pickup box), and landscape photographs which generally correspond to Hall's closeup, general, and
landscape photos (ERS 2010). Shaff et al. (2007) describe feature, landscape, and opportunistic photos.
Landscape photos cover a broader area than feature photos, while opportunistic photos (see section 5.2.2)
vary in scale but are generally at the feature or finer scale. The authors also provide guidelines on the type
of photography and features to photograph for various restoration activities associated with habitat
improvement projects, road projects, water management projects, wetlands, and fish passage
improvement.
Be sure to consider the following when establishing photo points:
• The general or specific features that must be photographed to meet the monitoring objectives.
• How representative the photo points are of conditions in the study area.
• Whether the number and type of photo points are sufficient for tracking change.
• Whether changes will be visible at the desired scale.
• Whether the site is accessible and lighting and sight lines are adequate during the entire monitoring
period.
5.2.5 Establishing Camera Points
As noted in section 5.2.4, camera points are permanently marked locations for the camera. Hamilton
(n.d.) suggests selecting camera points from which multiple photo points can be photographed. The same
photo point can also be photographed from multiple camera points, for example, if there is a need to
examine the subject matter at different scales or from different angles. If the sizes of objects will be
compared in photos taken from multiple camera points, the distance from each camera point to the photo
point must be the same. In addition, to avoid shadowing of the photo point, camera points should be
located north of photo points when they are close together.
Hall (2001) performed field testing of camera point setups (e.g., distance from photo point and the
vertical and horizontal positioning of the camera) to determine the effects of various camera positions and
settings on the ability to perform reliable repeat photography. Results of this testing clearly showed the
following:
• Distance from the camera to the meter board (or subject) affects both the size and location of
objects photographed.
• The vertical and horizontal position of the camera affects the location but not the size of objects
photographed.
• Focal length is not a critical issue because images can be enlarged or reduced to a constant area of
coverage. Resolution can be lost, however, if images are enlarged or cropped too much, so it is best
that the same or similar focal length be used for all photos.
Depending on the study objectives, therefore, camera point setup should provide a constant distance from
the camera to the photo point (for size and location considerations), and consistent height and left-right
orientation of the camera (for location). It should be noted that in Hall's testing, camera position was
shifted both upward and sideways by 40 cm (16 in) from an initial position centered at 1.4 m (55 in)
above the ground. Smaller shifts would result in lesser changes in object location.
Figure 5-3 illustrates the location of photo and camera points. Both camera points 1 and 2 would need
consistent camera positions if object locations were to be tracked over time. Meter boards can be used to
guide camera position when taking photos, with the camera always sighted on the top, bottom, or other
specific marking on the meter board.
A recommended standard equipment list for establishing photo-point monitoring areas can be found in
section 5.3.
5.2.6 Marking and Identifying Photo and Camera Points
Every photo and camera point should be geolocated, photographed, and permanently marked so that those
returning to take photos can find the sites with little waste of time. Capturing prominent features such as a
ridge line in the photos can help others identify the location and the photo points (Bauer and Burton
1993). Labor is usually the greatest cost associated with monitoring efforts (see chapter 9), and doing
whatever it takes to minimize the time needed to find photo-monitoring sites is cost effective. If
volunteers perform the monitoring, marking of photo and camera points is essential to efficiently finding
the locations so they can spend more time taking and documenting photos and less time searching for
sites.
The best material to mark sites depends on the circumstances, but metal fenceposts work well in many
cases (Hamilton n.d.). If metal fenceposts are unsuitable due to appearance or other considerations, steel
survey stakes driven into the ground may be appropriate provided that metal detecting equipment is
available (Hall 2001). If steel stakes are used, they can be covered with plastic pipe for safety, and all
stakes can be painted in bright colors to improve visibility (Larsen 2006). Each photo and camera point
should be given a unique identification number.
It is very important that the distance between camera points and photo points is measured and
documented (Hamilton n.d.). Site location can be facilitated by use of a GPS, but marking of photo and
camera points will still be necessary in many cases, given that the best resolution for GPS units is
currently about 3-5 meters. Identifiers for opportunistic photos and temporary photo and camera points
used for problem assessment and planning should at least include the purpose, address or GPS
coordinates, camera direction, date photos were taken, narrative description of what was observed, and
photographer name to provide sufficient information to interpret the information obtained and revisit the
site if necessary.
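One way to make sure these minimum identifiers are always captured is to store each photo with a structured record. The Python sketch below is only an illustration: the class name, field names, and validation rules are hypothetical, and the example values are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

@dataclass
class PhotoRecord:
    """Minimum identifiers for an opportunistic or temporary photo."""
    purpose: str
    camera_direction_deg: float          # compass bearing of the camera
    date_taken: date
    description: str                     # narrative of what was observed
    photographer: str
    address: Optional[str] = None
    gps: Optional[Tuple[float, float]] = None  # (latitude, longitude)

    def __post_init__(self):
        # The text calls for an address or GPS coordinates; require one.
        if self.address is None and self.gps is None:
            raise ValueError("provide an address or GPS coordinates")
        if not 0 <= self.camera_direction_deg < 360:
            raise ValueError("camera direction must be in [0, 360)")

# Hypothetical record for a streambank erosion photo.
record = PhotoRecord(
    purpose="problem assessment",
    camera_direction_deg=135.0,
    date_taken=date(2015, 6, 1),
    description="Bank slumping on outside of meander bend",
    photographer="J. Smith",
    gps=(44.05, -123.09),
)
```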
5.2.7 Identifying a Witness Site
A witness site is an object that can be easily identified when returning to the monitoring area (Hamilton
n.d., Hall 2001). It may be a large rock, a structure, or other feature that is easily identifiable from the
road or path to the photo and camera points. It is important to measure and document the distance and
direction from the witness site to the camera points, photo points, or both. If possible, it is also helpful to
attach a permanent identification tag to the witness site with the distance and direction to the photo and/or
camera points inscribed on the tag (Hamilton n.d.). Newer photo-monitoring guidance recommends the
use of GPS devices to facilitate finding the photo and camera points (ERS 2010, Shaff et al. 2007). In all
cases, however, it is helpful to have photographs of the site and a description of landmarks to help locate
and identify important spots within the monitoring area.
5.2.8 Recording Important Site Information
Information about any monitoring site, whether it be chemical, biological, physical, or photographic
(permanent or temporary), should be recorded to help future staff understand the reasons for selecting the
site and to help in the interpretation of data collected from the site. Maps, aerial photographs, and
standardized forms can be used to record date, observer name(s), location, site description, objectives,
identification numbers, and locations of the witness site, photo points, and camera points, including
distances and directions between points. It is important to indicate whether directions are magnetic or true
degrees (Hamilton n.d.), a topic addressed in detail by the U.S. Search and Rescue Task Force
(USSARTF n.d.). Standardized forms for all aspects of photo-point monitoring can be found in existing
documents (Hall 2001, 2002, Shaff et al. 2007).
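The difference between magnetic and true directions can be illustrated with a short Python sketch. It uses the common convention that easterly declination is positive and westerly declination is negative; the function names are hypothetical, and the local declination value must come from a current map or declination model.

```python
def magnetic_to_true(magnetic_deg, declination_deg):
    """Convert a magnetic bearing to a true bearing (east declination > 0)."""
    return (magnetic_deg + declination_deg) % 360.0

def true_to_magnetic(true_deg, declination_deg):
    """Convert a true bearing to a magnetic bearing (east declination > 0)."""
    return (true_deg - declination_deg) % 360.0

# Hypothetical site with 15 degrees of easterly declination.
true_bearing = magnetic_to_true(350.0, 15.0)
compass_bearing = true_to_magnetic(true_bearing, 15.0)
```

Recording the declination used alongside each bearing lets later visitors reproduce the direction even if local declination has changed.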
5.2.9 Determining Timing and Frequency of Photographs
Monitoring frequency should be based primarily on the monitoring objectives, planned data analyses,
features to be photographed, and expectations regarding detectable change in those features. Photo-point
monitoring for problem assessment and planning can be a one-time activity or may involve multiple
photographs taken at various times during the year to characterize seasonal, flow-related, or other
significant variability. Efforts to track project implementation or evaluate project success will usually
involve multiple years, with the frequency and timing of photos based on an understanding of seasonal
and other variability.
Land managers are encouraged to photograph native vegetation at least once per year at the end of the
growing season, or twice per year to show seasonal differences (ERS 2010). For restoration projects, the
frequency options are generally seasonal, annual, or biennial (Shaff et al. 2007). In addition, photos taken
during the high-flow and low-flow seasons should be compared to give some indication of the causes
affecting streambank condition. Regardless of the frequency selected, annual changes should be assessed
using photos taken at the same time of year.
Although photo-point monitoring for watershed projects is usually qualitative rather than quantitative, the
concept of MDC (see section 3.4.2) can still be applied when determining the frequency and duration of
photography. In essence, MDC is based on sample variance and the number of independent samples taken
over time. Kinney and Clary (1998) used repeat photography to track cattle density (animals/ha) on
various vegetation-soil categories in a riparian meadow and used analysis of variance to test for
differences in cattle distribution across vegetation-soil categories. Such time-series data could be analyzed
to estimate variance (i.e., variability) in the number of cattle in each photograph. These data could then be
used in an MDC analysis to estimate how often photographs would need to be taken to detect a significant
change in cattle density at a given level of confidence. It is important to note that the authors found
autocorrelation in their data due to frequency of photography, something that would have to be addressed
in the MDC analysis (see section 3.4.2).
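The reasoning above can be sketched in code. The example below is a minimal illustration only, not the formulation given in section 3.4.2, which takes precedence. It assumes a step change in the mean, a normal approximation in place of the t distribution, and a simple lag-1 (AR(1)) correction that shrinks the number of photographs to an approximate number of independent samples. All function names and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def lag1_autocorrelation(series):
    """Lag-1 sample autocorrelation of a time-ordered series of counts."""
    m = mean(series)
    num = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    den = sum((x - m) ** 2 for x in series)
    return num / den if den else 0.0


def effective_sample_size(n, rho):
    """Approximate number of independent samples, assuming an AR(1) process."""
    rho = max(0.0, min(rho, 0.99))  # treat negative estimates as zero
    return n * (1 - rho) / (1 + rho)


def minimum_detectable_change(series, n_post, alpha=0.05):
    """One-sided minimum detectable change in the mean (normal approximation).

    series: counts from the existing (pre-change) photographs
    n_post: number of photographs planned for the later period
    """
    rho = lag1_autocorrelation(series)
    n_pre_eff = effective_sample_size(len(series), rho)
    n_post_eff = effective_sample_size(n_post, rho)
    z = NormalDist().inv_cdf(1 - alpha)
    return z * sqrt(variance(series) * (1 / n_pre_eff + 1 / n_post_eff))


# Hypothetical cattle counts from eight consecutive photographs
counts = [4, 6, 5, 7, 6, 8, 7, 9]
print(minimum_detectable_change(counts, n_post=8))
print(minimum_detectable_change(counts, n_post=32))  # more photos, smaller MDC
```

Comparing the two printed values shows how additional photographs reduce the detectable change, and how strongly autocorrelated photographs contribute less than their raw number suggests.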
In an assessment of photo grid analysis precision, variability among different observers was found to be
about 12 percent, indicating that a change in mean intersects of at least that much would be needed to
conclude that the change was real at the 5 percent significance level (Hall 2001). Monitoring, therefore,
would need to continue until a change of 12 percent or more could be expected.
Absent a rigorous database to support MDC analysis, it is recommended that a qualitative assessment of the
time needed to see measurable change be performed. Guidelines that can be used to estimate the number
of years photo-monitoring should continue to document measurable change include plant growth rates for
restoration activities, typical timeframes for construction of urban runoff controls, and historical patterns
for adoption of agricultural BMPs.
5.2.10 Creating a Field Book
Hamilton (n.d.) recommends creation of a field book to help others find the monitoring location, witness
site, and photo and camera points. Field books should also include copies of the original photo-point
photographs, and other important site information recorded as described under section 5.2.8. Advances in
GPS, portable computer, and cell phone technology, however, may reduce the need for a physical field
book, but a printed version should be created as a backup.
5.2.11 Defining Data Analysis Plans
It is essential to establish plans for analysis before taking the photos. As described in sections 5.2.1 and
5.2.2, photo-point monitoring objectives can range from highly qualitative to quantitative, and data
analysis plans need to be worked out in advance to ensure that information collected through photo-point
monitoring will be sufficient to achieve these objectives.
Although statistical analysis of photo-based data for watershed projects is uncommon, examples exist that
could be applied to watershed projects. For example, quantitative analysis of differences in grazing
patterns in various areas of a riparian meadow was performed by Kinney and Clary (1998) using analysis
of variance. Photos were analyzed to count the number of cattle within each of five vegetation-soil
categories that were delineated within the study area and superimposed on individual photographs.
Through this method, researchers created a database with counts that were converted to a density measure
that was associated with both year and class variables (e.g., vegetation-soil category, pasture number).
In another example where statistical analysis was applied to photo-derived data, digital image analysis
(DIA) was compared against subjective analysis (SA) and line-intersect analysis (LIA) in determining the
percentage of turf cover on study plots (Richardson et al. 2001). For DIA, the percentage of green pixels
in images of turfgrass taken from a digital camera mounted on a monopod was calculated to determine the
turf coverage percentage in each of the images. The DIA approach was shown to be very accurate through
calibration with turf plugs of known cover, and DIA also performed far better than either SA or LIA in
determining the percent cover of study plots. The variance for DIA was only 0.65, while the variances for
LIA and SA were 13.18 and 99.12, respectively.
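The pixel-counting idea behind DIA can be illustrated with a short sketch. This is not the procedure of Richardson et al. (2001). The hue window and saturation threshold below are hypothetical placeholders that would need calibration against plots of known cover, and the image is assumed to have already been read into a list of (R, G, B) tuples with values from 0 to 255.

```python
import colorsys


def percent_green_cover(pixels, hue_range=(60.0, 180.0), min_saturation=0.2):
    """Return the percentage of pixels classified as green.

    pixels: iterable of (r, g, b) tuples, each channel 0-255
    hue_range: hypothetical window of hue angles (degrees) counted as green
    min_saturation: hypothetical cutoff that excludes grey or washed-out pixels
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("no pixels supplied")
    green = 0
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if hue_range[0] <= h * 360 <= hue_range[1] and s >= min_saturation:
            green += 1
    return 100.0 * green / len(pixels)


# Three pure-green pixels and one brown (soil-colored) pixel
sample = [(0, 255, 0)] * 3 + [(120, 80, 40)]
print(percent_green_cover(sample))  # 75.0
```

Applying the same thresholds to every image is what makes the result repeatable, which is consistent with the low variance reported for DIA.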
As described in section 5.2.2, both the photo grid analysis and nested frequency methods support
statistical analysis (Hall 2001). For example, a demonstration of regression analysis applied to grid
intersects from annual photography over a 20-year period suggested that the approach can be useful.
If these or other monitoring approaches that support statistical analysis are planned, it is essential that the
statistics to be performed are identified, the data needs to support the statistical analyses are documented,
and plans are developed at the beginning of a project to obtain the needed information from photo-point
monitoring. Because statistical analysis of photo-derived data is uncommon for watershed projects, it is
essential that a statistician is involved in the design of the monitoring effort.
5.2.12 Establishing a Data Management System
Data management systems are described in detail in section 3.9. The basic requirements and safeguards
associated with a data management system for water quality data also apply to photo-point monitoring
data sets. These include an organized and readily accessible filing system, quality assurance and quality
control procedures, working interfaces between data files and data analysis software, and backup systems.
It is recommended that backup archives are kept at a location separate from the original data (Hamilton
n.d.).
As with water quality monitoring data records, information on monitoring objectives, designs, and
locations must also be recorded and associated with the photos taken at each site. All information
recorded on forms should be included in the database and linked to photos as appropriate.
If necessary, hard copies of photos can be stored in manila folders in filing cabinets or above-floor boxes
and should be labeled clearly with locational information, date, time, and camera and photo point
identifiers (Bechtel 2005, Larsen 2006, Shaff et al. 2007). Digital images and files will need to be stored
in a computer database housed on a computer or computer network, and it is recommended that file
names provide the same information contained in the labels on the paper photos (Bechtel 2005, Shaff et
al. 2007). Software such as GPS Photo Link can be used to process the GPS information onto the images
(Larsen 2006). Digital information should be backed up on CDs or other "permanent" storage devices,
and networks should be backed up nightly (Bechtel 2005). Photo-point monitoring will usually be
performed far less frequently than storm-event monitoring, for example, but the file sizes associated with
photographs may create data storage challenges that should be considered early on in the project.
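One way to carry the label information into digital file names is sketched below. The naming pattern, field order, and function name are hypothetical; any scheme works as long as it is applied consistently and records the site, photo point, camera point, date, and time.

```python
from datetime import datetime


def photo_file_name(site, photo_point, camera_point, taken, ext="jpg"):
    """Build a file name holding the same details as a paper photo label.

    The pattern is a hypothetical example:
    <site>_PP<photo point>_CP<camera point>_<YYYYMMDD>_<HHMM>.<ext>
    """
    safe_site = "-".join(site.split())  # replace spaces so the name is portable
    return f"{safe_site}_PP{photo_point:02d}_CP{camera_point}_{taken:%Y%m%d_%H%M}.{ext}"


print(photo_file_name("Mill Creek", 3, "A", datetime(2016, 7, 14, 9, 30)))
# Mill-Creek_PP03_CPA_20160714_0930.jpg
```

Because the date is written year first, files for a given photo point sort chronologically in an ordinary directory listing.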
Whether photos are used for qualitative or quantitative analyses, it is important that standard procedures
are established and followed. For example, photos used in a river continuity assessment in New
Hampshire were taken in accordance with a standard operating procedure that was incorporated within a
quality assurance project plan (Bechtel 2005). The QAPP identified equipment needs and the roles and
duties of team members, provided general instructions, and gave details on all important aspects of
selecting sites and taking the photos. In addition, volunteers were trained in photo documentation, and
standardized forms were provided to ensure consistency.
5.2.13 Taking and Documenting Photographs
Whether photo points are temporary or permanent, opportunistic or part of a trend assessment, certain
guidelines should be followed to ensure that the photos support the monitoring objectives. It should be
clear from the following recommendations, some of which are slightly at odds with each other, that
photography is part art, part science (Bechtel 2005, ERS 2010, Shaff et al. 2007):
• Closeup photos should be taken from the north facing south to minimize shadows.
• Both medium and longer distance photos should be taken with the sun behind the photographer.
• Recommendations on the best times for taking photos vary, with some choosing early in the
morning, late in the afternoon, or on slightly overcast days to reduce shadows and glare, and others
wanting clear days between 9 a.m. and 3 p.m. Photos taken before 9 a.m. and after 3 p.m. can
result in increased shadowing and a different color cast that could conceal some features.
• Some recommend camera settings that give the greatest depth of field, while others simply
recommend using the camera's auto settings.
• Report the true compass bearing (corrected for declination) if possible.
Additional guidelines apply when the monitoring plan involves repeat photography. For example,
consistency is essential for trend assessment, and the following information taken from a variety of
sources should be recorded with each photograph to ensure such consistency (Bechtel 2005, ERS 2010,
Hall 2001, Hamilton n.d., Larsen 2006, Shaff et al. 2007):
• When shooting repeat photography it is helpful to compare the view through the camera with a
copy of the original photo to create comparable photos. Camera settings should be the same as
those documented when the original photo was taken.
• Document the type of camera and lens used, digital resolution, tripod and camera height, lens focal
length or degree of zoom, light conditions, compass direction of the photo, and the distance from
the camera to the one-meter board or center of the photo area.
• Document whether the camera is held horizontally or vertically.
• Record the date, location, compass bearing, and management history since the last photo was taken
(e.g., description of observable progress in achieving restoration or BMP goals).
• Describe the scene or subject and record that information.
• Hold the camera at eye level, positioning it so the one-meter board is centered in the middle of the
photo. Try to include some skyline in the photo to help establish the scale of the area. Photo
identification cards should be placed within the camera's field of view for each photograph to
embed relevant information into the picture. Figure 5-2 illustrates one approach to positioning of
the 1-m board and photo-identification card. The recommended content for each card is illustrated
in Figure 5-4. Some of this information (e.g., date and time) can be embedded using digital camera
options, and these options are likely to improve over time.
• Blue paper should be used for photo identification cards. Alternative approaches may include
laminated cards or small chalk boards.
• Framing of the photo should ensure that the photo identification card does not obscure features of
interest.
• The angle from which the photo is taken should be consistent. When taking photos at a height of
about 3 m from a trayback, tripod, or step ladder, a downward angle of 15 degrees is recommended
to illustrate ground condition and features (e.g., the amount of feed available in a pasture).
Date: / /
Time:
Site Name:
Photo Point Number:
Camera Point ID:
Photographer:
Figure 5-4. Photo identification card
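Several of the guidelines above call for recording a compass bearing, ideally a true bearing corrected for declination. The correction is simple arithmetic, sketched below under the common convention that easterly declination is positive and westerly declination is negative. The function name is hypothetical, and the local declination must be obtained from a current map or declination model.

```python
def true_bearing(magnetic_bearing, declination):
    """Convert a magnetic compass bearing to a true bearing, in degrees.

    declination: local magnetic declination in degrees,
                 east positive and west negative (assumed convention)
    """
    return (magnetic_bearing + declination) % 360.0


print(true_bearing(350.0, 15.0))   # 5.0
print(true_bearing(10.0, -12.0))   # 358.0
```

Recording both the magnetic reading and the declination used allows the correction to be checked or redone later.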
Logistical considerations for repeat photography include the following:
• Photo-documentation teams should consist of two people for both safety and logistical concerns
(Bechtel 2005, Herrick 2005).
• Once at the site, it is estimated that it will take about 3 min per photo from a single camera point
(Herrick 2005).
• Landowner permission may be needed for some monitoring locations, and it is advisable to check
on the legality of taking photos of private property in your jurisdiction before monitoring begins.
There may also be gates for which keys or combinations are needed to gain access to the photo
points. It is important that landowners be notified before photos are taken and that keys or
combinations for gates are in hand.
A recommended standard equipment list for photo-taking events can be found in section 5.3. Larsen
(2006) recommends using GPS Photo Link, a software program that "links" digital photos to the GPS
coordinates. This software program is now marketed as GeoJot+ Core (GeoSpatial Experts 2016). A geo-
location feature is available on some current digital camera models. A wide range of GPS
receivers is now available, with most enabling the user to take precise position coordinate readings and
record details about each position in an attribute table that can be downloaded to a computer (ERS 2010).
In addition, GIS software usually supports display of digital images, and there are numerous options for
property mapping software that can be found on the Internet (ERS 2010).
5.3 Equipment Needs
Methods described by Hall (2001, 2002) are still largely relevant today, but equipment has changed
considerably in the past decade. Most cameras in use today are digital, with resolutions far exceeding the
2 megapixel cameras described by Hall. Storage cards are larger and faster as well, and batteries last far
longer than they did just five years ago. The many improvements in camera technology have increased
the capabilities of photo-point monitoring by increasing the amount and quality of information contained
in each photo, increasing the number of photos that can be taken and stored under a single battery charge,
improving the options for time-lapse and programmed photography, and greatly enhancing the
capabilities for photo interpretation and analysis with computer software.
Because camera technology will continue to improve, it is recommended that an initial step in designing a
photo-point monitoring effort should be to survey currently available cameras and associated hardware
and software to assess the possibilities for photographic data collection and analysis, the potential for
unattended time-lapse photography (e.g., how long will batteries last at various resolutions and frequency
of taking photos), the ability to retrieve photos from a remote location through a computer link or to
rapidly upload images directly from the camera to a remote website, and the cost of various options.
Coordination with others (e.g., USDA) may be an excellent way to obtain access to integrated technology
for photo-point monitoring. For example, software such as GPS Photo Link¹ has been used by NRCS to
link photos to GPS coordinates and create data files that include the photos, coordinates, and other
descriptive information (GeoSpatial Experts 2004). Technology should not drive study objectives but it is
common sense to assess the extent to which available technology can be used to meet or augment study
objectives. With labor the major cost in many monitoring efforts, there may be attractive options for using
more technology and less labor to keep costs down.
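When comparing options for unattended time-lapse photography, a rough estimate of how long a camera can run before the storage card fills or the battery is exhausted can be helpful. The sketch below is simple arithmetic. The function names are hypothetical, and the card size, file size, and shots-per-charge figures are made-up values that should be replaced with manufacturer specifications or field trials.

```python
def photos_per_day(interval_min, hours_per_day=24.0):
    """Number of photos taken per day at a fixed interval (minutes)."""
    return hours_per_day * 60.0 / interval_min


def deployment_days(card_gb, photo_mb, shots_per_charge, interval_min, hours_per_day=24.0):
    """Days of unattended operation before storage or battery runs out.

    card_gb: storage card capacity in gigabytes (1 GB taken as 1000 MB)
    photo_mb: typical file size per photo at the chosen resolution
    shots_per_charge: photos obtainable from one battery charge
    """
    daily = photos_per_day(interval_min, hours_per_day)
    storage_days = card_gb * 1000.0 / (photo_mb * daily)
    battery_days = shots_per_charge / daily
    return min(storage_days, battery_days)


# Hypothetical setup: 32 GB card, 5 MB photos, 3,000 shots per charge,
# one photo every 20 minutes during 12 daylight hours
print(round(deployment_days(32, 5, 3000, 20, hours_per_day=12), 1))  # 83.3
```

In this made-up case the battery, not the storage card, limits the deployment, which is the kind of trade-off the survey of available equipment should reveal.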
The following items should also be considered in standard equipment lists for site establishment and
subsequent photo-taking visits (Bechtel 2005, Hamilton n.d., Herrick et al. 2005a, Larsen 2006):
¹ Now marketed as GeoJot+ Core (GeoSpatial Experts 2016).
Site Establishment
• Camera (and extra batteries)
• GPS unit or map of monitoring areas
• Clipboard, data forms (site description/location, camera location and photo points), and pencils OR
field computer with data entry software (extra battery for field computer if used)
• Compass
• Level (for permanently mounted meter boards)
• Hammer or post driver
• Keys and gate combinations (if needed)
• Measuring tape
• Rebar (3 ft) or other stakes for marking transect ends (if used)
• Shovel
• Whiteboard (and marker), chalkboard (and chalk), or photo-point ID cards
• Fenceposts
• Stakes or posts made of wood, fiberglass, plastic, rebar, or steel (point markers)
• Meter board
• Spray paint
• PVC pole (1.5 m or 5 ft long) or tripod for mounting camera at fixed height
Each Photo-Taking Visit
• Camera (and extra batteries)
• Compass
• Level
• Timepiece
• GPS unit or map of monitoring areas
• Site locator field book or field computer with copies of original photos and site information (extra
battery for field computer if used)
• Clipboard, data forms (site description/location, camera location and photo points), and pencils or
field computer with data entry software (e.g., GPS-photo ID software)
• Whiteboard, chalkboard, or photo-point ID cards
• Thick marking pen
• PVC pole (1.5 m or 5 ft long) or tripod for mounting camera at fixed height
• Keys and gate combinations (if needed)
• Measuring tape
• Metal detector (if needed for stake location)
• Ruler (optional - for scale on close-ups)
• Spray paint
5.4 Applications of Photo-Point Monitoring
5.4.1 Comparison Photos
Comparison photography has been used in a number of applications associated with grazing. In one
example cited by Hall (2001), the height and weight of grasses and forbs were measured, and a height-
weight curve was developed and used to estimate percent utilization based on height measurements
(Kinney and Clary 1994). The utilization level of an individual plant was determined by matching its
residual stubble to a photo in the guide and then assigning the percent utilization value for that photo to
the plant. Average utilization in an area was estimated from a number of individual plants (e.g., 50 to
100). It should be noted that the quality of estimates developed with this method depends substantially on
the level of detail in the photo guide. It may be necessary to develop seasonal or species-specific guides
depending on the level of accuracy and precision needed for the study. The authors concluded that about
25 random plant height measurements should give mean plant height estimates within 5 percent of the
mean at 95 percent confidence.
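The relationship between the number of measurements and the precision of the estimated mean can be illustrated with the standard sample-size approximation n = (z × CV / E)², where CV is the coefficient of variation of plant heights and E is the acceptable relative error. The sketch below uses a normal approximation. The CV values are assumptions chosen for illustration and are not taken from the cited study.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(cv, relative_error=0.05, confidence=0.95):
    """Measurements needed so the sample mean falls within a relative error
    of the true mean at the stated confidence (normal approximation).

    cv: coefficient of variation (standard deviation / mean), as a fraction
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil((z * cv / relative_error) ** 2)


print(required_sample_size(0.125))  # 25, for an assumed CV of 12.5 percent
print(required_sample_size(0.25))   # 97, for an assumed CV of 25 percent
```

Under these assumptions, a CV of about 12.5 percent corresponds to roughly 25 measurements. More variable stands would require considerably more measurements for the same precision.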
Comparison photos have also been used to provide a quick approximation of percent residue under
various conservation tillage practices (Eck and Brown 2004, Hickman and Schoenberger 1989, Shelton et
al. 1995). Percent cover can usually be estimated within 10 to 20 percent of the actual cover when using
the photo-comparison method. When using this method to estimate percent residue it is important to find
a representative area of the field, look straight down at the residue if it is flat or at an angle if it is standing
residue, and compare the observed residue cover with photos of known cover. Interpolation between
photos may be necessary, and it is recommended that the results of three or more observations from
different representative locations on the field be averaged for a better estimate.
The Queensland BioCondition Assessment Framework specifies a quantitative approach to photo-point
monitoring to assess terrestrial biodiversity, incorporating a 100-m vegetation transect and spot (close-up)
and landscape photos taken in accordance with a detailed protocol (Eyre et al. 2015). Despite the attention
to detail regarding the taking of photographs, no analysis of the photographs is described, and photos are
only recommended, not required. The related method for establishing reference sites for biocondition
assessment states only that spot photos can be useful to capture the variability in ground cover within
sample locations (Eyre et al. 2011).
5.4.2 Repeat Photography
Repeat photography has been used for a range of purposes in a large number of NPS projects including
wetland restoration, streambank restoration, and fencing (OEPA n.d., Oregon DEQ 2002, Shaff et al.
2007). The Jordan Cove, CT, Section 319 National Nonpoint Source Monitoring Program (NNPSMP)
project took weekly photos as homes were constructed and documented all development changes in the
suburban lot. Weekly observation of construction activities allowed documentation of water quantity
effects such as storage of water in cellar excavations and rainfall ponding on pavement (Clausen 2011).
The Morro Bay Section 319 NNPSMP project in California documented implementation of BMPs with
photo-point monitoring (CCRWQCB 2012b). In the Maino Ranch study area of the Morro Bay project,
photo-point monitoring failed to document changes in stream channels as a result of fencing and other
practices designed to control cattle movement through pastures (CCRWQCB and CPSU 2003). This
result agreed, however, with the findings from the monitoring of stream channel stability and stream
profiles from fall 1993 through spring 2001.
Photo-monitoring of pre- and post-construction conditions is used to document the success of all erosion
control projects on rural roads in Santa Cruz County, California (CCRWQCB 2012a). A report on Section
319 projects funded in NM from 1998 to 2008 showed that 11 of 127 projects used photo-point
monitoring for project evaluation, and many others used photos to assist in problem documentation
(NMED 2009). Of the 11, nine photographed vegetation to track progress associated with range/grazing
management and/or riparian restoration, one tracked road reclamation, and the other used photo-point
monitoring to document improvement from trail reconstruction.
Photo-point monitoring at Chinamans Beach, Australia, was used to gain understanding of the movement
and accumulation of wrack (piles of seaweed) on the beach (MMC n.d.). Photos collected two times per
week over a 12-week period helped determine the need for and best approach to beach raking.
Supplemental information on tides, weather, and activities in the area was used to help interpret the
photos but all observations were qualitative.
Photo-documentation was a major component of assessment monitoring for the South Fork Palouse River
riparian area restoration project (PCEI 2005). Permanent photo monitoring stations were established
along the restoration site to document both vegetation establishment success and streambank stability.
Using the methods of Hall (2001), bank stability was evaluated with photos taken twice per year (in
March following high-flows and in July under base-flow conditions) at three photo points located along
the restored site. Permanent meter stakes installed at the top of the bank at each location served as visual
reference points for photo monitoring and as references to measure erosion. Vegetation establishment
success (changes in growth and production) was also tracked through photo monitoring, with photos
taken during the first week of August and then yearly for 10 years following restoration.
The NRCS has published guidance on photo-point monitoring as a qualitative method for documenting
short-term and long-term effects of a prescribed grazing plan (Larsen 2006). In support of this guidance,
the Nebraska NRCS developed a field office guide to demonstrate the use of GPS Photo Link², a software
program that "links" digital photos to the GPS coordinates (GeoSpatial Experts 2004).
Kinney and Clary (1998) used time-lapse photography to demonstrate differences in time spent by cattle
on several pastures within a riparian meadow. Cattle location was classified by five broad plant
community-soil groups. Photographs were taken at 20-min intervals during daylight hours, a frequency at
which auto-correlation was observed. Information obtained from the photos was reduced to number of
cattle per unit area, and analysis of variance was performed on number of animals per ha per plant-soil
site per photograph, with pasture and year used as explanatory variables that would account for
differences in animal stocking densities. The authors were able to show statistically significant differences
in cattle densities among site categories overall and for three different animal positions (standing head
down, standing head up, and lying down).
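The general workflow described above, converting counts to densities and then comparing groups with analysis of variance, can be sketched in simplified form. The example below computes a one-way ANOVA F statistic across site categories only. It does not reproduce the authors' model, which also included pasture and year, and it does not address autocorrelation among photographs. The areas and counts are hypothetical.

```python
from statistics import mean


def density(count, area_ha):
    """Convert a cattle count from one photograph to animals per hectare."""
    return count / area_ha


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical counts per photograph in three site categories,
# with a hypothetical area (ha) for each category
counts = {"wet meadow": [6, 8, 7], "dry meadow": [3, 4, 2], "upland": [1, 0, 2]}
areas = {"wet meadow": 2.0, "dry meadow": 2.5, "upland": 4.0}
groups = [[density(c, areas[name]) for c in values] for name, values in counts.items()]
print(one_way_anova_f(groups))
```

The F statistic would then be compared with the F distribution for the reported degrees of freedom to judge significance, preferably with a statistician's input on how to handle the autocorrelation noted above.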
Photo-documentation is very popular among volunteer monitoring groups. For example, the SOLVE
Green Team in Oregon uses photo point monitoring to track progress at watershed restoration sites
(SOLVE 2011). The Missouri Stream Team uses photo-point monitoring to supplement water quality and
other stream monitoring activities (MST n.d.).
² Now marketed as GeoJot+ Core (GeoSpatial Experts 2016).
5.5 Advantages, Limitations, and Opportunities
Photo-point monitoring can potentially be used for a variety of purposes, including problem assessment
and planning, tracking BMP implementation, providing supporting information for traditional water
quality monitoring, discovering unexpected events, serving as surrogates for water quality parameters,
and serving as direct measures of water quality conditions (Figure 5-5).
Assessment and Planning
• Document conditions
• Identify sediment sources
• Document treatment needs
BMP Implementation
• Presence/Absence
• Plant growth
• Percent residue
Supporting Information
• Snow cover
• Grazing
The Unexpected
• Manure spreading
• Stream bank failure
Direct Measures
• Algal blooms
• Flow (requires calibration)
Surrogate Measures
• Percent shade
• Plant growth
Figure 5-5. Various potential applications of photo-point monitoring
5.5.1 Advantages
Every monitoring option has advantages and limitations, and Hamilton (n.d.) identified the following
strengths of photo-point monitoring:
• Uses readily available equipment.
• Is an effective communication tool for public education.
• Is a method of providing landscape context for a study area.
• Is a standardized evaluation procedure for comparing multiple locations.
• Is a method to document rates of change.
In addition to these observations, photo-point monitoring is less expensive than most other watershed
project monitoring options.
5.5.2 Limitations
Some weaknesses of photo-point monitoring were also identified by Hamilton (n.d.):
• Only limited quantitative data can be obtained.
• Bias in photo point placement may occur.
• It may be difficult to use in dense vegetation.
• Photo points can be lost or obscured over time.
An additional limitation of photo-point monitoring for watershed projects is that, in most cases, it cannot
be used to evaluate progress in achieving water quality objectives. Further, statistical approaches to using
photo-derived data remain to be developed for use by those who apply photo-point monitoring
techniques.
5.5.3 Opportunities
Recognizing the inherent advantages and limitations of photo-point monitoring, there are many
opportunities to use this tool for watershed projects. Several of these opportunities have been realized,
while others are suggested only for consideration, with full understanding that any method must be tested
and evaluated before being adopted.
Photo-point monitoring can be very helpful in assessing watershed problems. For example, it was used in
a volunteer-led river continuity assessment of the Ashuelot River watershed in New Hampshire (Bechtel
2005). Photos were taken at each dam site (at the downstream end) and at both the upstream and
downstream ends of stream crossings. The QA officer used the photographs to ensure that information
recorded regarding bridge and culvert type made sense. Photos were also used as part of the permanent
inventory record.
Photo-point monitoring for western grazing lands has been found to be an easy and inexpensive way to
provide an excellent visual representation of conditions at a given point in time. These photographs were
considered only as supplementary data, however, not sufficient alone to evaluate objectives (Bauer and
Burton 1993). Photographs could be used to indicate a trend in woody vegetation, streambank stability,
and streambank cover, but the authors noted that vegetation "expression" as seen in photographs was not
the same as vegetation "succession" needed for stream ecosystem health.
Researchers at the University of Wisconsin-Platteville have applied photo-point monitoring to
farm-scale research, using photos for a variety of applications as described in the sidebar
(Busch and Mentz 2012).
As an example of new applications of photo-point monitoring, it is feasible that photo-point monitoring
could be used to track flow provided that a stage-discharge relationship is first established. While this
may at first seem to offer no advantage over visual observation of a staff gage, tracking stage with
photographs could offer the advantages of 24-hour surveillance and safety during high-flow events.
Cameras would need to be positioned in secure locations, however, and remote transmission of photos
may be required.
The greatest opportunity for photo-point monitoring at the watershed scale, however, may be an
improvement in the quantification of variables of interest and statistical analysis of photo-derived data.
All monitoring is limited by sample size and representativeness but interpretation of water chemistry
monitoring data, for example, is supported by a long history of statistical analysis. Photo-point monitoring
for watershed projects has almost no history of statistical analysis. Numeric data are needed for statistical
analysis. The primary challenge for those who want to pursue low-cost photo-point monitoring options
for project evaluation is to develop more quantitative data and put those data through statistical analyses to
create a record of achievement and potential.
Photographic Data Collected at UW-Platteville Pioneer Farm
Researchers at the University of Wisconsin-Platteville have applied photo-point monitoring to
farm-scale research. Photographs are used to identify areas of concern, record field conditions
within research project areas, monitor the locations of grazing cattle, record unusual or atypical
events, and support QA/QC efforts in the surface-water runoff monitoring program.
Photographs can be especially useful to convey information to off-site researchers.
Time-lapse photos are taken on a 24-hr interval at surface-water gauging stations to create a record of
field conditions within monitored areas. These photographs are useful in determining soil cover, plant
canopy, snow cover, and crop growth throughout the year, especially at times when runoff events occur.
Moreover, photographs of surface water runoff sample bottles are taken after collection and prior to lab
analysis (Figure 1). While bottle photos provide only qualitative information, such as relative sample
color, this information, along with time-lapse photos, can help confirm results when laboratory test
results are in question. Photos of the bottle tops are used as part of the chain of custody record and
project QA/QC, providing an accurate record of samples shipped for analysis.
Daily time-lapse photos have also been used both to identify paddocks where cattle are grazing in
riparian corridors, and to record pasture vegetation height and density. In studies where the location of
grazing cattle needs to be recorded daily, landscape photographs can identify the paddocks in which
cattle are grazing on a daily basis (Figure 2). Plot photos of pasture vegetation have been used to
create a visual record of pasture condition and grass height for runoff studies as well (Figure 3).
Photographs are often taken to record extreme events and unusual field observations. For example,
photographs have been taken of high-flow events where water depth was greater than the flume height
and runoff water flowed over the wing walls holding the flume (Figure 4). Information from these photos
can be used to confirm recorded maximum stage readings, and estimate discharge by providing
information that can be used to calculate cross-section flow area that occurs above the flume.
Figure 1. Sample bottles
Figure 2. Grazing cattle
Figure 3. Pasture vegetation
Figure 4. Flume
5.6 References
Bauer, S.B. and T.A. Burton. 1993. Monitoring Protocols to Evaluate Water Quality Effects of Grazing
Management on Western Rangeland Streams. EPA 910/R-93-017. U.S. Environmental Protection
Agency, Water Division, Seattle, WA. Accessed February 10, 2016.
Bechtel, D.A. 2005. River Continuity Assessment of the Ashuelot River Watershed Quality Assurance
Project Plan. The Nature Conservancy, New Hampshire Chapter, Concord, NH. Accessed
February 10, 2016.
http://des.nh.gov/organization/divisions/water/wmb/was/qapp/documents/ashuelot.pdf.
Bledsoe, B.P., and J.E. Meyer. 2005. Monitoring of the Little Snake River and Tributaries Year 5 - Final
Report. Colorado State University, Department of Civil Engineering, Ft. Collins, CO. Accessed
February 10, 2016.
http://www.wildlandhydrology.com/assets/Monitoring_of_the_Little_Snake_River_Year_5_Final
_Report.pdf.
Busch, D. and R. Mentz. 2012. Photographic Data Collected at UW-Platteville Pioneer Farm. University
of Wisconsin-Platteville, Platteville, WI.
CCRWQCB (Central Coast Regional Water Quality Control Board). 2012a. Water Quality Success
Stories: Rural Roads Erosion Control Assistance in Santa Cruz County, California. Central Coast
Regional Water Quality Control Board, San Luis Obispo, CA. Accessed February 10, 2016.
http: //www. waterboards .ca. gov/centralcoast/about_us/docs/succe ss/rural_roads .pdf.
CCRWQCB (Central Coast Regional Water Quality Control Board). 2012b. Water Quality Success
Stories: TheMorro Bay National Monitoring Program - a Ten-Year Rangeland Management
Practices Study. Central Coast Regional Water Quality Control Board, San Luis Obispo, CA.
Accessed February 10, 2016.
http: //www .waterboards. ca. go v/centralcoast/about_us/morrobay. shtml.
CCRWQCB (Central Coast Regional Water Quality Control Board) and CPSU (California Polytechnic
State University). 2003. Morro Bay National Monitoring Program: Nonpoint Source Pollution
and Treatment Measure Evaluation for the Morro Bay Watershed 1992-2002 Final Report.
Prepared for U.S. Environmental Protection Agency, by Central Coast Regional Water Quality
Control Board and California Polytechnic State University - San Luis Obispo, CA. Accessed
February 10, 2016.
http://www.elkhornsloughctp.org/uploads/files/1149032089MorroBavEstuaryProgram.pdf.
Clausen, John C., University of Connecticut, Department of Natural Resources and the Environment.
2011. Telephone conversation with Steve Dressing, Tetra Tech, Inc., regarding Jordan Cove
Section 319 National Monitoring Program project. Accessed February 10, 2016. See
http:II]ordancove.uconn.edu/index.html for additional details.
Eck, K.J. and D.E. Brown. 2004. Estimating Corn and Soybean Residue Cover. AY-269-W. Purdue
University Cooperative Extension Service, West Lafayette, IN. Accessed February 10, 2016.
http://www.extension.purdue.edu/extmedia/AY/AY-269-W.pdf.
ERS (Environment and Resource Sciences). 2010. Land Manager's Monitoring Guide -Photopoint
Monitoring. State of Queensland Department of Environment and Resource Management,
5-25
-------
Monitoring and Evaluating Nonpoint Source Watershed Projects Chapter 5
Environment and Resource Sciences, Brisbane. Accessed February 10, 2016.
http: //www. granitenet. com. au/assets/file s/Landcare/rabbits_photopoint indicator aug2010 .pdf.
Eyre, T.J., A.L. Kelly, and V.J. Neldner. 2011. Method for the Establishment and Survey of Reference
Sites for Biocondition Version 2.0. State of Queensland Department of Environment and
Resource Management, Biodiversity and Ecological Sciences Unit, Brisbane. Accessed February
10, 2016. https://www.qld.gov.au/environment/assets/documents/plants-
animals/biodiversity/reference-sites-biocondition.pdf.
Eyre, T.J., A.L. Kelly, V.J. Neldner, B.A. Wilson, D.J. Ferguson, M.J. Laidlaw, and A.J. Franks. 2015.
Biocondition: a Condition Assessment Framework for Terrestrial Biodiversity in Queensland.
Assessment Manual. Version 2.2. State of Queensland Department of Environment and Resource
Management, Biodiversity and Ecosystem Sciences, Brisbane. Accessed February 10, 2016.
https://www. qld.gov.au/environment/assets/documents/plants-animals/biodiversity/biocondition-
assessment-manual .pdf.
Faux, R.N., P. Maus, H. Lachowski, C.E. Torgersen, and M.S. Boyd. 2001. New Approaches for
Monitoring Stream Temperature: Airborne Thermal Infrared Remote Sensing. U.S. Department
of Agriculture, Forest Service, Remote Application Center, Integrating & Monitoring Steering
Committee, San Dimas, CA. Accessed February 10, 2016.
http://www.fs .fed.us/eng/techdev/IM/rsac reports/TIR.pdf.
GeoSpatial Experts. 2004. Introduction to GPS-Photo Link. U.S. Department of Agriculture, Natural
Resources Conservation Service, NE. Accessed February 10, 2016. GPS-
Photo Link step by step .ppt
GeoSpatial Experts. 2016. Geotagging and Photo Mapping Software. Geospatial Experts, Inc., Thornton,
CO. Accessed February 10, 2016. http://www.geospatialexperts.com/geotagging.php.
Hall, Frederick C. 2001. Ground-Based Photographic Monitoring. General Technical Report PNW-GTR-
503. U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station,
Portland, OR. Accessed February 11, 2016. http://www.fs.fed.us/pnw/publications/pnw gtr503/.
Hall, Frederick C. 2002. Photo Point Monitoring Handbook. General Technical Report PNW-GTR-526.
U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station, Portland,
OR. Accessed February 10, 2016. http://www.fs.fed.us/pnw/pubs/gtr526/.
Hamilton, R.M. n.d. Photo Point Monitoring, aWeed Manager's Guide to Remote Sensing and GIS —
Mapping & Monitoring. U.S. Department of Agriculture, Forest Service, Remote Sensing
Applications Center, Salt Lake City, UT. Accessed February 10, 2016.
http: //www. fs .fed .us/eng/rsac/invasivespecie s/documents/Photopoint monitoring .pdf .
Herrick, J.E., J.W. Van Zee, K.M. Havstad, L.M. Burkett, and W.G. Whitford. 2005. Monitoring Manual
for Grassland, Shrubland and Savanna Ecosystems - Volume I: Quick Start. U.S. Department of
Agriculture, Agricultural Research Service Jornada Experimental Range, Las Cruces, NM.
Accessed February 10, 2016.
http: //www .ntc.blm.gov/krc/viewresource .php ?courseID=2 81 &programAreaId= 148.
Herrick, J.E., J.W. Van Zee, K.M. Havstad, L.M. Burkett, and W.G. Whitford. 2005a. Monitoring
Manual for Grassland, Shrubland and Savanna Ecosystems - Volume II: Design, Supplementary
5-26
-------
Monitoring and Evaluating Nonpoint Source Watershed Projects Chapter 5
Methods And Interpretation, U.S. Department of Agriculture, Agricultural Research Service
Jornada Experimental Range, Las Cruces, NM. Accessed February 10, 2016.
http: //www .ntc.blm.gov/krc/viewresource .php ?courseID=2 81 &programAreaId= 148.
Hickman, J.S. and D.L. Schoenberger. 1989. Wheat Residue. Bulletin L-781. Kansas State University,
Cooperative Extension Service, Manhattan, KS. Accessed February 10, 2016.
http://www.ksre.ksu.edu/bookstore/pubs/L781 .pdf.
Hively, W.D., M. Lang, G.W. McCarty, J. Keppler, A. Sadeghi, and L.L. McConnell. 2009a. Using
satellite remote sensing to estimate winter cover crop nutrient uptake efficiency. Journal of Soil
and Water Conservation 64(5):303-313.
Hively, W.D., G.W. McCarty, and J. Keppler. 2009b. Federal-state partnership yields success in remote
sensing analysis of conservation practice effectiveness: results from the Choptank River
Conservation Effects Assessment Project. Journal of Soil and Water Conservation 64(5): 154A.
Key, C. H., D. B. Fagre, and R. K. Menicke. 2002. Glacier Retreat in Glacier National Park, Montana. In
Satellite Image Atlas of Glaciers of the World, Glaciers of North America - Glaciers of the
Western United States. U.S. Geological Survey Professional Paper 1386-J, R.M. Krimmel, pp.
J365-J381. Accessed February 10, 2016.
http://nrmsc.usgs.gov/files/norock/products/GCC/SatelliteAtlas Key 02 .pdf.
Kinney, J.W. and W.P. Clary. 1994. A. Photographic Utilization Guide for Key Riparian Graminoids.
General Technical Report INT-GTR-308. U.S. Department of Agriculture, Forest Service,
Intermountain Research Station, Ogden, UT. Accessed February 10, 2016.
http://www.fs.fed.us/rm/pubs int/int gtr308.html.
Kinney, J.W. and W.P. Clary. 1998. Time-Lapse Photography to Monitor Riparian Meadow Use. USDA
Forest Service Research Note RMRS-RN-5. U.S. Department of Agriculture, Forest Service,
Rocky Mountain Research Station, Boise, ID. Accessed February 10, 2016.
http://www.fs.fed.us/rm/pubs/rmrs_rn005.pdf.
Larsen, D. 2006. Minimum Standards for Nebraska NRCS Photo-Point Monitoring. Range and Pasture
Technical Note No. 16. U.S. Department of Agriculture, Natural Resources Conservation Service.
Accessed February 10, 2016.
http://efotg.sc.egov.usda.gov/references/public/NE/NE TECH NOTE 16.pdf.
MathWorks. 2012. MATLAB Image Processing Toolbox. MathWorks, Natick, MA. Accessed February
10, 2016. http://www.mathworks.com/products/image/.
MMC (Mosman Municipal Council), n.d. Chinamans Beach Photo Point Monitoring Program
Spring/Summer 2004 & Autumn/Winter 2005. Mosman Municipal Council, Mosman, Australia.
MST (Missouri Stream Team), n.d. Stream Team Photo Point Monitoring. Missouri Stream Team,
Jefferson City, MO. Accessed February 10, 2016.
http://www.mostreamteam.org/Documents/STActivities/ppmonitoring.pdf.
NMED (New Mexico Environment Department). 2009. Watershed Protection Section Clean Water Act
§319(H) Projects 1998 - 2008. New Mexico Environment Department, Surface Water Quality
Bureau.
5-27
-------
Monitoring and Evaluating Nonpoint Source Watershed Projects Chapter 5
OEPA (Ohio Environmental Protection Agency), n.d. Ohio Section 319 Success Story: Wetland
Restoration in the Sandusky Watershed. Ohio Environmental Protection Agency, Columbus, OH.
Accessed February 11, 2016.
http://www.epa.ohio.gov/portals/35/nps/319DOCSAVetlands%20Restoration%20in%20Northern
%20Sanduskv%20Watershed.pdf.
Oregon DEQ (Department of Environmental Quality). 2002. Oregon 2002 Nonpoint Source Pollution
Program Annual Report. Oregon Department of Environmental Quality, Water Quality Division,
Portland, OR. Accessed February 11, 2016.
http: //www .deq. state. or .us/wq/nonpoint/docs/annualrpts/rpt02 .pdf.
PCEI (Palouse-Clearwater Environmental Institute). 2005. South ForkPalouse River -Lower Watershed
Restoration Project Final Report. Palouse-Clearwater Environmental Institute, Moscow, ID.
Rasmussen J., M. N0rremark, and B.M. Bibby. 2007. Assessment of leaf cover and crop soil cover in
weed harrowing research using digital images. Weed Research 47, 299-310.
Richardson, M.D., D.E. Karcher, and L.C. Purcell. 2001. Quantifying turfgrass cover using digital image
analysis. Crop Science 41:1884-1888.
Rosgen, David L., 1973. The Use of Color Infrared Photography for the Determination of Sediment
Production. In Fluvial Process and Sedimentation: Proceedings of Hydrology Symposium,
Canadian National Research Council, Edmonton, Alberta, May 8-9, 1973, pp. 381-402.
Rosgen, David L., 1976. The Use of Color Infrared Photography for the Determination of Suspended
Sediment Concentrations and Source Areas. In Proceedings of the 3rd Inter-Agency Sediment
Conference, Water Resources Council, Denver, Colorado, March 22-25, 1976, Chapter 7, pp. 30-
42. Accessed February 11, 2016.
http://www.wildlandhydrologv.com/assets/The Use of Color Infrared Photography-FISC.pdf.
Shaff, C., J. Reiher, and J. Campbell. 2007. OWEB Guide to Photo Point Monitoring. Oregon Watershed
Enhancement Board, Salem, OR. Accessed February 11, 2016.
http://www.Oregon.gov/OWEB/docs/pubs/PhotoPoint_Monitoring_Doc_July2007.pdf?ga=t.
Shelton, D.P., P.J. Jasa, J.A. Smith, and R Kanable. 1995. G95-1132 Estimating Percent Residue Cover.
Paper 784. Historical Materials from University of Nebraska-Lincoln Extension. Accessed
February 11, 2016. http://digitalcommons.unl.edu/extensionhist/784/.
SOLVE. 2011. Adventures in Monitoring. SOLVE Green Team, Oregon. Accessed February 11, 2016.
http://solvgreenteam.wordpress.com/tag/photo-point-monitoring/.
USDA-CES (U.S. Department of Agriculture, Cooperative Extension Service), USDA-NRCS (U.S.
Department of Agriculture, Natural Resources Conservation Service), and USDOI-BLM (U.S.
Department of Interior-Bureau of Land Management). 1999. Utilization Studies and Residual
Measurements. Interagency Technical Reference 1734-3. U.S. Department of Agriculture,
Cooperative Extensive Service and Natural Resources Conservation Service and U.S. Department
of the Interior, Bureau of Land Management, Denver, CO. Accessed February 11, 2016.
http://www.blm.gov/nstc/library/pdf/utilstudies.pdf.
5-28
-------
Monitoring and Evaluating Nonpoint Source Watershed Projects Chapter 5
USDA-FS (U.S. Department of Agriculture, Forest Service). 2007. Removal of Marmot Dam. U.S.
Department of Agriculture, Forest Service, Mt. Hood National Forest, Sandy, OR. Accessed
February 11, 2016. http://www.fs.usda.gov/detail/mthood/home/?cid=STELPRDB5200643 and
http: //or .water .usgs. gov/proj s_dir/marmot/index.html.
USEPA (U.S. Environmental Protection Agency). 2008. Handbook for Developing Watershed Plans to
Restore and Protect Our Waters. EPA 841-B-08-002. U.S. Environmental Protection Agency,
Office of Water, Washington, DC. Accessed February 10, 2016. http://www.epa.gov/polluted-
runoff-nonpoint-source-pollution/handbook-developing-watershed-plans-restore-and-protect.
USSARTF (United States Search and Rescue Task Force), n.d. Compass Basics, United States Search and
Rescue Task Force, Elkins Park, PA. Accessed February 11, 2016.
http://www.ussartf.org/compass_basics.htm.
Wolfram. 20\2.Mathematica. Wolfram Research, Champaign, IL. Accessed February 11, 2016.
http://www.wolfram.com/mathematica/?source=nav.
Monitoring and Evaluating Nonpoint Source Watershed Projects Chapter 6
6 Monitoring Challenges and Opportunities
By D.W. Meals and S.A. Dressing
Monitoring is the foundation of water quality management and provides essential information about the
resource. Carefully done, monitoring can answer important questions and contribute to a successful NPS
watershed project. However, monitoring can also be challenging and present numerous pitfalls.
Sections 6.1 and 6.2 of this chapter highlight some of the problems that can hinder watershed monitoring
efforts from the planning stage through execution. Opportunities to enhance and expand the impact and
utility of monitoring data are discussed in sections 6.3 and 6.4.
6.1 Monitoring Pitfalls
Too many watershed monitoring projects have reported little or no improvement in water quality after
extensive implementation of BMPs in the watershed. Reasons for this outcome are numerous and varied
and may include:
• Mistakes in understanding of pollution sources
• Improper selection of BMPs
• Poor experimental design
• Uncooperative weather
• Lag time between treatment and response
There are numerous ways that a monitoring effort can fail to achieve its objectives. Reid (2001) examined
30 U.S. monitoring programs and classified reasons for failure into design flaws and procedural problems.
Design flaws are errors or shortcomings inherent in the monitoring plan that prevent monitoring from
obtaining appropriate data, answering fundamental questions, or otherwise achieving its goals. Serious
design flaws can doom a monitoring project from the start and no amount of hard work or added
resources can salvage it. Procedural problems are problems in execution of a program that can cause even
the best design to fail. Unlike design problems, procedural problems can be overcome by applying
additional resources, more personnel, better training, or good management.
A list of the top reasons for monitoring failure drawn from Reid (2001) and experience with numerous
NPS monitoring projects includes both design and procedural problems.
6.1.1 Design Flaws
• Inadequate problem identification/analysis. In some cases, the source of NPS problems is
unclear. For example, E. coli bacteria can come from livestock, domestic pets, septic systems, or
wildlife. Without accurate identification of the pollutant source (E. coli in this case), monitoring is
unlikely to be able to document a response to treatment effectively.
• Fundamental misunderstanding of the system. Effective monitoring of pollutant load or delivery
requires an understanding of how the pollutant moves through the watershed. Monitoring in the
wrong place or on the wrong pathway will doom a program to failure. If nitrate-N moves mainly
through ground water, for example, monitoring of surface runoff or streamflow is unlikely to yield
good results. Similarly, if most suspended sediment at a watershed outlet comes from stream
channels and banks, edge-of-field monitoring will not be effective.
• Inability of the monitoring plan to measure what is needed. If a sampling station is mis-located
- upstream of a critical tributary inflow, for example - samples taken cannot record the pollutant
load delivered in that inflow.
• Insufficient study duration. Significant lag time between land treatment and water quality
response is common (see section 6.2, below). No matter how well-executed, a three-year
monitoring program cannot document a response to BMPs if the response takes ten years to become
evident because of legacy pollutants or slow watershed processes.
• Statistically weak design. As discussed in section 2.4, monitoring design must be carefully
selected to achieve program objectives, be they load measurement, change in pollutant
concentration, or response to land treatment, notably in the context of weather-driven variability
characteristic of NPS pollution. A statistically weak design - such as a single watershed before and
after or side-by-side watersheds - cannot control for weather variations and is unlikely to be able to
attribute observed changes in water quality to a specific cause.
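The weather problem described in the last bullet can be illustrated with a small synthetic example. The sketch below is not drawn from any project in this guidance: the annual loads are invented, the names are hypothetical, and the control-adjusted comparison (a regression of treated on control loads fitted in a calibration period) is an assumption brought in only to show why an unadjusted before/after comparison can mislead. A real analysis would follow the designs and statistical tests discussed in section 2.4.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = my - b * mx
    return a, b


# Hypothetical annual loads (kg/yr). The control watershed is never treated.
control_pre = [10.0, 16.0, 24.0, 14.0]   # calibration period
treated_pre = [20.0, 32.0, 48.0, 28.0]   # treated watershed before BMPs
control_post = [20.0, 24.0, 22.0]        # wetter years after BMPs
treated_post = [30.0, 36.0, 33.0]        # treated watershed after BMPs

# Single-watershed before/after comparison: weather is not accounted for.
naive_change = mean(treated_post) / mean(treated_pre) - 1.0

# Control-adjusted comparison: predict what the treated watershed would have
# delivered without BMPs, given what the control watershed did in those years.
a, b = fit_line(control_pre, treated_pre)
expected_post = [a + b * c for c in control_post]
adjusted_change = mean(treated_post) / mean(expected_post) - 1.0

print(f"before/after change:     {naive_change:+.1%}")
print(f"control-adjusted change: {adjusted_change:+.1%}")
```

With these invented numbers, the unadjusted comparison shows a small increase in load because the post-treatment years were wetter, while the control-adjusted comparison shows a reduction of about one quarter.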
6.1.2 Procedural Problems
• Lack of training or enthusiasm of field staff. If a field technician is unable or unwilling to collect
essential data because of lack of knowledge or initiative, critical data may be lost. In extreme cases,
individuals can compromise a data record by cutting corners as illustrated in Figure 6-1. A simple
time plot of recently obtained laboratory results revealed a pattern that indicated a sampling
irregularity, thus triggering an investigation into the cause before further damage could be done.
• Failure to collect collateral information. Often, collateral information is required to properly
interpret monitoring data. Information on stream stage, for example, may be essential to understand
if a water sample was collected on the rising or falling limb of the hydrograph. Failure to record
stage at the time of sample collection will greatly reduce the meaning of the sample result.
• Bad or misunderstood technology. Modern field or laboratory instruments make it easy to collect
a great deal of monitoring data. However, if a field instrument is deployed for long periods without
maintenance or calibration, or if a laboratory instrument is not calibrated and tested regularly, the
resulting bad data will seriously impair a monitoring program.
• Failure to evaluate data regularly. As noted in section 3.10.2 and illustrated in Figure 6-2, it is
essential to examine monitoring data frequently to catch problems early. Two dramatic changes in
the apparent pattern of TKN concentration were caused by laboratory actions. Replacement of a
defective probe in a lab instrument changed the range and sensitivity of the analytical results (point
labeled #1). Later a change in lab method significantly raised the detection limit (point labeled #2).
These two phenomena required rejection of almost a year of TKN data, but if the problems had not
been noted in a data review, serious bias would have been introduced into the monitoring results for
a seven-year monitoring effort (Meals 2001).
• Protocol changes. Whether in field or laboratory settings, consistent operating procedures are
essential to generating consistent monitoring data. Although long-term monitoring programs should
strive for consistency in methods and procedures, sometimes it is necessary to replace or upgrade
instruments or change analytical methods. Without careful documentation and extensive
comparative analysis, changes in monitoring or analytical procedures can introduce spurious
changes in resulting data, changes that do not reflect conditions in the water resource.
• Personnel change. Complex monitoring activities - such as those involving GIS or sophisticated
laboratory instruments - require a high level of expertise and/or training. Frequent personnel
changes can result in loss of such expertise, with a consequent loss of data or of data accuracy,
especially if transitions are not managed properly.
• Lack of institutional integration. Most watershed monitoring projects involve multiple
participants, with responsibility for different activities sometimes spread across several institutions.
If the different departments or agencies do not share information or talk to each other regularly,
critical information may be overlooked and the monitoring program may suffer.
[Figure 6-1 plot not reproduced. Annotation on the plot: Daily samples were manufactured from a
single large sample taken once per week. A volunteer's violation of sampling protocol was detected
after samples were analyzed and data were plotted to reveal a suspicious pattern.]
Figure 6-1. Detection of violation of sampling protocol (R.P. Richards, Heidelberg University,
Tiffin, OH)
Figure 6-2. Effects of changing (1) a defective probe and (2) a laboratory method detection limit
(Meals 2001)
Because design flaws may doom a monitoring project from the start, it is essential to follow the steps in
designing a monitoring program discussed in chapters 2 and 3. Procedural problems can be addressed
with additional resources, training, and good management during the course of a monitoring program, but
such corrections require constant vigilance to identify the problems before they cause too much damage.
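Routine data review of the kind that caught the problems in Figures 6-1 and 6-2 can be partly automated. The sketch below is only an illustration under stated assumptions: the function names, tolerances, window sizes, and sample values are hypothetical, and these are not the checks used by Richards or by Meals (2001). The first function flags long runs of nearly identical consecutive results, as might occur if daily samples were split from one weekly sample. The second flags an upward jump in the lowest reported value, as might follow a change in detection limit. Screening of this kind supplements, and does not replace, plotting and inspecting the data.

```python
def long_runs(values, tol=0.0, min_run=5):
    """Return (start_index, length) for each run of consecutive values that
    stay within `tol` of the first value in the run and last >= `min_run`."""
    runs = []
    start = 0
    for i in range(1, len(values) + 1):
        # Close the current run at the end of the data or when a value departs
        # from the run's first value by more than the tolerance.
        if i == len(values) or abs(values[i] - values[start]) > tol:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = i
    return runs


def floor_shifts(values, window=10, factor=2.0):
    """Split the record into consecutive windows and return the start index of
    each window whose minimum is at least `factor` times the minimum of the
    window before it (a possible change in detection limit)."""
    mins = [
        (i, min(values[i:i + window]))
        for i in range(0, len(values) - window + 1, window)
    ]
    return [
        i
        for (_, prev), (i, cur) in zip(mins, mins[1:])
        if prev > 0 and cur >= factor * prev
    ]


# Hypothetical daily record: two weeks of suspiciously constant results,
# followed by four days of normal variation.
daily = [1.2] * 7 + [0.8] * 7 + [1.5, 1.1, 0.9, 1.3]
suspicious_runs = long_runs(daily, tol=0.01, min_run=5)

# Hypothetical TKN record in which the lowest reported value jumps upward.
tkn = [0.05, 0.3, 0.12, 0.08, 0.4, 0.5, 0.6, 0.5, 0.9, 0.7]
possible_limit_changes = floor_shifts(tkn, window=5, factor=2.0)
```

Any flagged index is a prompt to check field notes and laboratory records, not proof of a problem.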
6.2 Lag Time Issues in Watershed Projects
One important reason NPS watershed projects may fail to meet expectations for water quality
improvement is lag time. Lag time can be thought of as the time elapsed between installation or adoption
of management measures at the level projected to reduce NPS pollution and the first measurable
improvement in water quality in the target waterbody. Even in cases where a program of management
measures is well-designed and fully implemented, water quality monitoring efforts (even those designed
to be "long-term") may not show definitive results if the monitoring period and sampling frequency are
not sufficient to address the lag between treatment and response. Lag time issues have been explored in
detail in a recent review (Meals et al. 2010).
Project management, watershed processes, and components of the monitoring program itself influence the
lag between treatment and response (Figure 6-3). Any or all of these may come into play in a watershed
project.
[Figure 6-3 diagram not reproduced. Labels in the diagram: Project Management Components; System
Components; Time required for practice(s) to produce desired effect; Time required for effect to be
delivered to water resource; Time required for water body to respond to effect; LAG TIME;
Measurement Components]
Figure 6-3. Lag time conceptual model
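As a planning aid, rough estimates of the lag components can be compared with the planned monitoring period. The sketch below is a simplification under stated assumptions: it treats the three time components as sequential and additive, which the conceptual model does not strictly require, and the class name, function name, and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LagEstimate:
    """Rough, user-supplied lag estimates in years."""

    practice_effect: float      # time for practice(s) to produce the desired effect
    delivery: float             # time for the effect to reach the water resource
    waterbody_response: float   # time for the water body to respond

    def total(self) -> float:
        # Assumes the components occur one after another (a simplification).
        return self.practice_effect + self.delivery + self.waterbody_response


def monitoring_covers_lag(lag: LagEstimate, monitoring_years: float) -> bool:
    """True if the planned monitoring period exceeds the estimated total lag."""
    return monitoring_years > lag.total()


# Hypothetical example: a practice that needs about ten years to become
# effective, evaluated with a three-year monitoring program.
buffer_lag = LagEstimate(practice_effect=10.0, delivery=1.0, waterbody_response=0.5)
covered = monitoring_covers_lag(buffer_lag, monitoring_years=3.0)
```

In this example the estimated lag far exceeds the monitoring period, so a lack of measured response after three years would not show that the practice had failed.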
6.2.1 Project Management Components
The time required for planning and implementation of an NPS watershed project often causes the public to
perceive a delay between the decision to act and results of that decision. A project may be funded and
announced today, but it will be some time before that project will be fully planned and implementation
begins. It might even take years, considering the essential time required to identify NPS pollution sources
and critical areas, design management measures, engage landowner participation, and integrate new
practices into cropping and land management cycles. Although such planning delays are not part of the
physical process of lag time, stakeholders will often perceive them as part of the wait for results.
6.2.1.1 Time Required for an Installed or Adopted Practice to Produce an Effect
BMPs are installed in watersheds to provide a wide range of effects to protect or restore the physical,
chemical, and biological condition of waterbodies, including:
• Change hydrology
• Reduce dissolved pollutant concentration or load
• Reduce particulate/adsorbed pollutant concentration or load
• Improve vegetative habitat
• Improve physical habitat
The time required for a BMP to be fully installed and become operational influences how quickly it can
produce an effect. Some NPS control measures may become functional quickly. Installation of livestock
exclusion fencing along several Vermont streams over a three-month period resulted in significant
nutrient concentration and load reductions and reductions of fecal bacteria counts in the first
post-treatment year as the fences immediately prevented manure deposition in the stream (Meals 2001).
However, other NPS management measures, especially vegetative practices where plant communities
need time to become established, may take years to become fully effective. For example, in a
Pennsylvania study of a newly-constructed riparian forest buffer, the influence of tree growth on
nitrate-N removal from groundwater did not become apparent until about ten years after tree planting
(Newbold et al. 2009).
Lag time between BMP implementation and reduction of pollutant losses at the edge-of-field scale varies
by the pollutant type and the behavior of the pollution source. Erosion controls such as cover crops,
contour farming, and conservation tillage tend to have a fairly rapid effect on soil loss from a crop field as
these practices quickly mitigate the forces contributing to detachment and transport of soil particles
(Nearing et al. 1990). However, the response time of runoff P to nutrient management is likely to be much
slower. It may take years to "mine" excess P out of the soil through crop removal to the point where
dissolved P in runoff is effectively reduced (Zhang et al. 2004, Sharpley et al. 2007).
6.2.1.2 Time Required for the Effect to be Delivered to the Water Resource
Practice effects initially occur at or near the practice location, yet managers and stakeholders usually want
and expect the impact of these effects to appear promptly in the water resource of interest in the
watershed. The time required to deliver an effect to a water resource depends on a number of factors,
including:
• The route for delivering the effect
    • Directly in (e.g., streambed restoration) or immediately adjacent to (e.g., shade) the water
      resource
    • Overland flow (particulate pollutants)
    • Overland and subsurface flow (dissolved pollutants)
    • Infiltration groundwater and groundwater flow (e.g., nitrate)
• The path distance
• The path travel rate
    • Fast (e.g., ditches and artificial drainage outlets to surface waters)
    • Moderate (e.g., overland and subsurface flow in porous soils)
    • Slow (e.g., infiltration in absence of macropores and groundwater flow)
    • Very slow (e.g., transport in a regional aquifer)
• Hydrologic patterns during the study period
    • Wet periods generally increase volume and rate of transport
    • Dry periods generally decrease volume and rate of transport
Once in a stream, dissolved pollutants like N and P can move rapidly downstream with flowing water to
reach a receiving body relatively quickly. However, sediment and attached pollutants (e.g., P and some
synthetic chemicals) can take years to move downstream as particles are repeatedly deposited,
resuspended, and redeposited within the drainage network by episodic high flow events. This process can
delay sediment and P transport (when attached P constitutes a large fraction of the P load) from
headwaters to outlet by years or even decades. Substantial lag time could occur between reductions of
sediment and P delivery into the headwaters and measurement of those reductions at the watershed outlet.
Pollutants delivered predominantly in groundwater such as nitrate-N generally move at the rate of
groundwater flow, typically much more slowly than the rate of surface water flow. For example, about
40% of all N reaching the Chesapeake Bay travels through groundwater before reaching the bay. Phillips
and Lindsey (2003) estimated that N loads associated with groundwater in the Chesapeake Bay
Watershed would have a median lag time of ten years for water quality improvements to become evident.
Groundwater nitrate concentrations in upland areas of Iowa were still influenced by the legacy of past
agricultural management conducted more than 25 years earlier (Tomer and Burkart 2003).
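A first-order sense of groundwater lag can be obtained by dividing the flow path distance by the average groundwater velocity. The sketch below uses the standard Darcy-based estimate of average linear velocity (hydraulic conductivity times hydraulic gradient, divided by effective porosity). It is not the method used by Phillips and Lindsey (2003) or by Tomer and Burkart (2003), and the aquifer properties and path length are hypothetical values chosen only for illustration.

```python
def seepage_velocity(k_m_per_day, gradient, effective_porosity):
    """Average linear groundwater velocity (m/day) from Darcy's law:
    v = K * i / n_e."""
    return k_m_per_day * gradient / effective_porosity


def travel_time_years(distance_m, velocity_m_per_day):
    """Time for groundwater to travel `distance_m` at a constant velocity."""
    return distance_m / velocity_m_per_day / 365.25


# Hypothetical aquifer: K = 1 m/day, gradient = 0.01, effective porosity = 0.25.
velocity = seepage_velocity(k_m_per_day=1.0, gradient=0.01, effective_porosity=0.25)

# Hypothetical flow path of 150 m from a treated field to the stream.
lag_years = travel_time_years(distance_m=150.0, velocity_m_per_day=velocity)

print(f"velocity: {velocity:.3f} m/day, travel time: {lag_years:.1f} years")
```

Under these assumptions a flow path of only 150 m implies a lag of roughly ten years, and a longer or slower path would extend it further. Actual flow paths vary widely within a watershed, so a single value of this kind is at best a rough guide.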
6.2.1.3 Time Required for the Waterbody to Respond to the Effect
The speed with which a waterbody responds to the effect(s) produced by and delivered from management
measures in the watershed introduces another increment of lag time. For example, hydraulic residence
time (or the inverse, flushing rate) is an important determinant of how quickly a waterbody may respond
to changes in nutrient loading. Residence times in selected North American waterbodies range from
0.6 year for Chesapeake Bay to 3.3 years for Lake Champlain to 191 years for Lake Superior to more than
650 years for Lake Tahoe. Simply on the basis of dilution, it will likely take considerably longer for water
column nutrient concentrations to respond to a decrease in nutrient loading in Lake Superior than in
Chesapeake Bay.
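The dilution argument can be made concrete with a simple flushing calculation. The sketch below assumes a completely mixed waterbody in which a step reduction in loading is followed by an exponential approach to the new concentration, so the fraction of the eventual change achieved after time t is 1 - exp(-t / residence time). That model is an assumption introduced here for illustration: it ignores internal loading, settling, and stratification, and the function name is hypothetical. Residence times are those given in the text, with 650 years used for Lake Tahoe although the text says "more than 650 years".

```python
import math

# Hydraulic residence times (years) as cited in the text.
RESIDENCE_TIME_YEARS = {
    "Chesapeake Bay": 0.6,
    "Lake Champlain": 3.3,
    "Lake Superior": 191.0,
    "Lake Tahoe": 650.0,  # text says "more than 650 years"
}


def years_to_fraction(residence_time, fraction=0.9):
    """Years for a completely mixed waterbody to achieve `fraction` of its
    eventual concentration change after a step change in loading."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be between 0 and 1 (exclusive)")
    return -residence_time * math.log(1.0 - fraction)


response_years = {
    name: years_to_fraction(tau) for name, tau in RESIDENCE_TIME_YEARS.items()
}

for name, years in response_years.items():
    print(f"{name}: about {years:.1f} years to reach 90% of the change")
```

On dilution alone, this simple model puts 90 percent of the response at a little over a year for Chesapeake Bay and at several centuries for Lake Superior.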
Apparent lag time in water quality response may also depend on the indicator evaluated or the impairment
involved, especially if the focus is on biological water quality. A relatively short lag time might be
expected between reductions of E. coli bacteria inputs and reduction in bacteria levels in the receiving
waters because the bacteria generally do not persist as long in the aquatic environment as do heavy metals
or synthetic organic chemicals. Such response has been demonstrated in estuarine systems where bacterial
contamination of shellfish beds has been reduced or eliminated through improved waste management on
the land in less than a year (BBNEP 2008). Improved sewage treatment in Washington, D.C. led to sharp
reductions in point source P and N loading to the Potomac River Estuary in the early 1970s (Jaworski
1990). The tidal freshwater region of the estuary responded significantly over the next 5 years with
decreased algal biomass, higher water column dissolved oxygen levels, and increased water clarity.
In contrast, lake response to changes in incoming P load is often delayed by recycling of P stored in
aquatic sediments. When P loads to Shagawa Lake (MN) were reduced by 80% through tertiary
wastewater treatment, residence time models predicted new equilibrium P concentrations within
1.5 years, but high in-lake P levels continued to be maintained by recycling of P from lake sediments
(Larsen et al. 1979). Even more than 20 years after the reduction of the external loading, sediment
feedback of P continued to influence the trophic state of the lake (Seo and Canale 1999). Similarly,
St. Albans Bay (VT) in Lake Champlain failed to respond rapidly to reductions in P load from its
watershed. From 1980 through 1991, a combination of wastewater treatment upgrades and intensive
implementation of dairy waste management BMPs through the Rural Clean Water Program brought about
a reduction of P loads to this eutrophic bay. However, water quality in the bay did not improve
significantly. This pattern was attributed to internal loading from sediments highly enriched in P from
decades of point and NPS inputs (Meals 1992). Although researchers at that time believed that the
sediment P would begin to decline over time as the internal supply was depleted, subsequent monitoring
has shown that P levels have not declined over the years as expected (LCBP 2008). Recent research has
confirmed that a substantial reservoir of P continues to exist in the sediments that can be transferred into
the water under certain chemical conditions and nourish algae blooms for many years to come (Druschel
et al. 2005). In effect, this internal loading has become a significant source of P, one that cannot be
addressed by management measures on the land.
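The contrast between a residence-time prediction and the observed persistence of in-lake P can be illustrated with a minimal well-mixed lake model. This is only a sketch: the function name, the constant sediment release term, and every number below are hypothetical and are not taken from the Shagawa Lake or St. Albans Bay studies.

```python
import math


def lake_p(t_years, c0, external_load, internal_load, loss_rate):
    """In-lake P concentration (ug/L) after t_years in a well-mixed lake.

    Loads are volumetric (ug/L per year); loss_rate (1/yr) lumps flushing
    and settling. This is the analytic solution of
    dC/dt = external_load + internal_load - loss_rate * C.
    """
    c_eq = (external_load + internal_load) / loss_rate
    return c_eq + (c0 - c_eq) * math.exp(-loss_rate * t_years)


# Hypothetical lake: external load cut by 80% (100 -> 20 ug/L per year).
c0, loss_rate = 50.0, 2.0
for years in (0.5, 1.5, 5.0):
    flushing_only = lake_p(years, c0, external_load=20.0,
                           internal_load=0.0, loss_rate=loss_rate)
    with_release = lake_p(years, c0, external_load=20.0,
                          internal_load=60.0, loss_rate=loss_rate)
    print(f"{years:>4} yr  flushing only: {flushing_only:5.1f}"
          f"  with sediment release: {with_release:5.1f}")
```

With no sediment release, the modeled concentration approaches its new equilibrium within a few residence times; with a persistent internal load, it levels off well above that value no matter how long the simulation runs.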
Macroinvertebrate or fish response to improved water quality and habitat conditions in stream systems
requires time for the organisms to migrate into the system and occupy newly improved habitat.
Significant lag times have been observed in the response of benthic invertebrates and fish to management
measures implemented on land, including in the Middle Fork Holston River project (Virginia), where IBI
scores and Ephemeroptera-Plecoptera-Trichoptera (EPT) scores did not improve, even though the project
accomplished substantial reduction in the sediment, N, and P loadings (VADCR 1997). The lack of
increase in the biological indicator scores indicates a system lag time between the actual BMP
implementation and positive changes in the biological community structure. This lag could depend in part
on the amount of ecological connectivity with neighboring healthier aquatic systems that could provide
sources of appropriate organisms to repopulate the restored habitats. In several Vermont streams, the
benthic invertebrate community improved within 3 years in response to reductions of sediment, nutrient,
and organic matter inputs from the land (Meals 2001). However, despite observed improvement in stream
physical habitat and water temperature, no improvements in the fish community were documented. The
project attributed this at least partially to a lag time in community response exceeding the monitoring
period.
6.2.2 Effects Measurement Components of Lag Time
Watershed project managers are routinely pressed for results by a wide range of stakeholders. The
fundamental temporal components of lag time control how long it will take for a response to occur, but
the effectiveness of measuring the response may cause a further delay in recognizing it. The design of the
monitoring program is a major determinant of our ability to discern a response against the background of
the variability of natural systems.
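As a rough illustration of how monitoring design can add its own delay, the sketch below estimates how many years of sampling are needed before and after a step change in order to detect it. It assumes AR(1) autocorrelation, the common effective-sample-size approximation n(1 - rho)/(1 + rho), and a two-sample normal approximation. The function names and all numeric inputs are hypothetical.

```python
from statistics import NormalDist


def effective_sample_size(n, rho):
    """Approximate number of independent values among n AR(1)-correlated samples."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1.0 - rho) / (1.0 + rho)


def years_to_detect(step, sd, samples_per_year, rho, alpha=0.05, power=0.80):
    """Years of sampling needed in EACH period (before, after) to detect a step.

    step and sd share the same units (e.g., log10 concentration).
    rho is the lag-1 autocorrelation at the chosen sampling interval.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    n_independent = 2.0 * ((z_alpha + z_power) * sd / step) ** 2
    return n_independent / effective_sample_size(samples_per_year, rho)


# Hypothetical comparison: weekly sampling (more samples, stronger
# autocorrelation) versus monthly sampling (fewer, less correlated samples).
weekly = years_to_detect(step=0.2, sd=0.5, samples_per_year=52, rho=0.5)
monthly = years_to_detect(step=0.2, sd=0.5, samples_per_year=12, rho=0.2)
print(f"weekly: {weekly:.1f} yr per period, monthly: {monthly:.1f} yr per period")
```

Under these assumed inputs the monthly program needs roughly twice as long as the weekly one, even though the underlying system response is identical in both cases.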
In the context of lag time, sampling frequency with respect to background variability is a key determinant
of how long it will take to document change. In a given system, taking n samples per year provides a
certain statistical power to detect a trend. If the number of samples per year is reduced, statistical power is
reduced (by an amount that depends on the degree of autocorrelation), and it may take longer
to document a significant trend or to state with confidence that a concentration has dropped below a water
quality standard. Simply stated, taking fewer samples a year is likely to introduce an additional
"statistical" lag time before a change can be effectively documented.
6.2.2.1 The Magnitude of Lag Time
The magnitude of lag time is difficult to predict in specific cases and generalizations are difficult to make.
A few examples, summarized in Table 6-1, illustrate some possible time frames for several categories of
lag times.
Table 6-1. Examples of lag times reported in response to environmental impact or treatment

Parameter(s)        Scale               Impact/Treatment                Response lag  Reference
Sediment            Large watershed     Extreme storm events            8-25 yr       Marutani et al. 1999
Sediment            Large watershed     Cropland erosion control        19 yr         Newson 2007
Chloride            Large aquifer       Road salt                       >50 yr        Bester et al. 2006
NO3-N               Small watershed     N fertilizer rates              >30 yr        Tomer and Burkart 2003
NO3-N               River basin         N fertilizer rates              >50 yr        Bratton et al. 2004
NO3-N               Large watershed     Nutrient management             >5 yr         STAC 2005
NO3-N               Small watershed     Nutrient management             15-39 yr      Galeone 2005
NO3-N               Small watershed     Prairie restoration             10 yr         Schilling and Spooner 2006
NO3-N               Small watershed     Riparian forest buffer          10 yr         Newbold et al. 2009
Soil test P         Field               P fertilizer rates              8-14 yr       McCollum 1991
Soil test P         Field               P fertilizer rates              10-14 yr      Giroux and Royer 2007
Soil and runoff P   Plot/field          Poultry litter management       >5 yr         Sharpley et al. 2007
P                   Lake                WWTP upgrade                    >20 yr        Larsen et al. 1979
P                   Lake                WWTP upgrade/agricultural BMPs  >20 yr        LCBP 2008
P, N, E. coli       Small watershed     Livestock exclusion             <1 yr         Meals 2001
Fecal bacteria      Estuary             Waste management                <1 yr         BBNEP 2008
Fecal bacteria      Estuary             Waste management                1 yr          Spooner et al. 2011
Macroinvertebrates  Small watershed     Livestock exclusion             3 yr          Meals 2001
Macroinvertebrates  Small watershed     Mine waste treatment            10 yr         Chadwick et al. 1986
Fish                First order stream  Habitat restoration             2 yr          Whitney and Hafele 2006
Fish                Small watershed     Acid mine drainage treatment    3-9 yr        Cravotta et al. 2009

6.2.3 How to Deal with Lag Time
In most situations, some lag time between implementation of BMPs and water quality response is
inevitable. Although the exact duration of the lag can rarely be predicted, in many cases the lag time will
exceed the length of typical monitoring periods, making it problematic to document a water quality
response. Several possible approaches are proposed to deal with this challenge.
6.2.3.1 Recognize Lag Time and Adjust Expectations
It usually takes time for a waterbody to become impaired and it will take time to accomplish the clean-up.
Failure to meet quick-fix expectations may cause frustration, pessimism, and a reluctance to pursue
further action. It is up to scientists, investigators, and project managers to do a better job explaining to all
stakeholders in realistic terms that current water quality impairments usually result from historically poor
land management and that immediate solutions should not be expected.
6.2.3.1.1 Characterize the Watershed
Before designing an NPS management program and an associated monitoring program, investigate
important watershed characteristics likely to influence lag time. Determining the time of travel for
groundwater movement is an obvious example. Watershed characterization is an important step in the
project planning process (USEPA 2008) and such characterization should especially address important
aspects of the hydrologic and geologic setting, as well as documentation of NPS pollution sources and the
nature of the water quality impairment, all of which can influence observed lag time in system response.
6.2.3.1.2 Consider Lag Time Issues in Selection, Siting, and Monitoring of Best
Management Practices
First and foremost, proper BMP selection must be based on solving the problem and ensuring that
landowners have the capability and willingness to implement and maintain the BMPs. Lag time can be an
important factor in the final design of BMP systems by ensuring that when down-gradient BMPs are
installed, they are ready to handle the anticipated runoff or pollutant load from up-gradient sources. In
addition, when projects include targeted BMP monitoring to document interim water quality
improvements, recognition of lag time may require an adjustment of the approach to targeting the
management program. When designing a program for projects that include BMP-specific monitoring,
potential BMPs should be evaluated to determine which practices might provide the most rapid
improvement in water quality, given watershed characteristics. For example, practices such as barnyard
runoff management that affect direct delivery of nutrients into surface runoff and streamflow may yield
more rapid reductions in nutrient loading to the receiving water than practices that reduce nutrient
leaching to groundwater, when groundwater time of travel is measured in years. Fencing livestock out of
streams may give an immediate water quality improvement, compared to waiting for riparian forest
buffers to grow. Such considerations, combined with application of other criteria such as cost
effectiveness, can help determine priorities for BMP implementation in a watershed project.
Lag time should also be considered in locating management practices within a watershed. Managers
should consider the need to demonstrate results to the public, which may be easier at small scales, along
with the need to achieve water quality targets and consequently wider benefits at the large watershed
scale. Where sediment and sediment-bound pollutants from cropland erosion are primary concerns,
implementing practices that target the largest sediment sources closest to the receiving water may provide
a more rapid water quality benefit than erosion controls in the upper reaches of the watershed. Where
groundwater transport is a key determinant of response, application of a groundwater travel time model
before application of management changes could help managers understand when to anticipate a water
quality response and communicate this issue to the public. At best, the model will support targeting the
application of an initial round of management measures to land areas where the effects are expected to be
transmitted to receiving waters quickly. An example of this can be found in Walnut Creek, Iowa
(Schilling and Wolter 2007).
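A first-order estimate of groundwater travel time can be made from Darcy's law before any detailed model is applied. The sketch below computes advective travel time from a field to the stream and ranks candidate fields by it. It ignores dispersion, retardation, and spatial variation in aquifer properties; the function name, field names, and parameter values are hypothetical and are not drawn from the Walnut Creek study.

```python
def travel_time_years(distance_m, conductivity_m_per_day, gradient, effective_porosity):
    """Advective groundwater travel time from Darcy's law (no dispersion or retardation)."""
    if min(distance_m, conductivity_m_per_day, gradient, effective_porosity) <= 0:
        raise ValueError("all inputs must be positive")
    # Average linear (seepage) velocity in m/day: v = K * i / n_e
    seepage_velocity = conductivity_m_per_day * gradient / effective_porosity
    return distance_m / seepage_velocity / 365.25


# Hypothetical fields: (name, flow-path distance to the stream in m).
# Aquifer properties are assumed uniform across the watershed.
fields = [("upland field", 1500.0), ("mid-slope field", 500.0), ("riparian field", 50.0)]
ranked = sorted(
    ((name, travel_time_years(d, conductivity_m_per_day=1.0,
                              gradient=0.01, effective_porosity=0.25))
     for name, d in fields),
    key=lambda item: item[1],
)
for name, years in ranked:
    print(f"{name}: about {years:.1f} yr")
```

Fields at the top of the ranked list are the ones where a change in management would be expected to reach the stream soonest, which is the targeting logic described above.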
It is important to point out that factoring lag time into BMP selection and targeting is not to say that long-
term management improvements like riparian forest buffer restoration should be discounted or that upland
sediment sources should be ignored. Rather, it is suggested that planners and managers may want to
consider implementing BMPs and treating sources likely to exhibit short lag times first to increase the
probability of demonstrating some water quality improvement as quickly as possible. "Quick-fix"
practices with minimum lag time must be complemented by other needed practices to ultimately yield
permanent reductions in pollutant loads.
6.2.3.1.3 Monitor Small Watersheds Close to Sources
In cases where documentation of the effects of a management program on water quality is a critical goal,
lag time can sometimes be minimized by focusing monitoring on small watersheds, close to pollution
sources. Lag times introduced by transport phenomena (e.g., groundwater travel, sediment flux through
stream networks) will likely be shorter in small watersheds than in larger basins. In the extreme, this
principle implies monitoring at the edge of field or above/below a limited treated area, but small
watersheds (e.g., < 1500 ha) can also yield good results. In the NNPSMP, projects monitoring BMP
programs in small watersheds (e.g., the Morro Bay Watershed Project in California, the Jordan Cove
Project in Connecticut, the Pequea/Mill Creek Watershed Project in Pennsylvania, and the Lake
Champlain Basin Watersheds Project in Vermont) were more successful in documenting improvements in
water quality in response to change than were projects that took place in large watersheds (e.g., the
Lightwood Knot Creek Project in Alabama and the Sny Magill Watershed Project in Iowa) in the 7- to
10-year time frame of the NNPSMP (Spooner et al. 2011).
Monitoring programs can be designed to get a better handle on lag time issues. Monitoring indicators at
all points along the pathway from source to response or conducting periodic synoptic surveys over the
course of a project will identify changes as they occur and document progress toward the end response.
Supplementing a stream monitoring program with special studies can help project managers understand
watershed processes, predict potential lag times, and help explain delays in water quality improvement to
stakeholders. In the Walnut Creek (IA) watershed, no changes in stream suspended sediment loads were
documented, despite extensive conversion of row crop land to prairie and reductions in field erosion
predicted by RUSLE (Revised Universal Soil Loss Equation). This was explained largely by a 22-mile
stream survey showing that streambank erosion contributed more than 50% of Walnut Creek sediment
export (Spooner et al. 2011).
6.2.3.1.4 Select Indicators Carefully
Some water quality variables can be expected to change more quickly than others in response to
management changes. As documented in the Jordan Cove (CT) NNPSMP Project (1996-2005), peak
storm flows from a developing watershed can be reduced quickly through application of stormwater
infiltration practices (Clausen 2007). NNPSMP projects in California, North Carolina, Pennsylvania, and
Vermont demonstrated rapid reductions in nutrients and bacteria by reducing direct deposition of
livestock waste in surface waters through fencing livestock out of streams (Spooner et al. 2011).
Improvements in stream biota, however, often come beyond the time frame of many watershed-scale
monitoring efforts, but a number of NNPSMP projects have documented success with biological
monitoring. As noted in section 6.2.1, Meals (2001) found that the benthic invertebrate community in
Vermont streams improved within 3 years in response to livestock exclusion practices, but improvements
in the fish community were not documented. Whitney and Hafele (2006) noted improvements in the fish
community within two years of a habitat restoration effort, and Cravotta et al. (2009) documented the
gradual return of fish to streams within a few years after treatment to neutralize acid mine drainage.
Despite these successes, many other watershed-scale projects have failed to document improvements by
monitoring macroinvertebrates and fish. This may simply argue for a more sustained monitoring effort to
document a biological response to land treatment. Failing that, however, selection of indicators that have
relatively short lag times where possible will make it easier (and quicker) to demonstrate success. Simple
numbers of macroinvertebrates, for example, may respond before more complex community indices show
change. See chapter 4 for additional details and illustrative case studies on biological monitoring
approaches.
6.2.3.1.5 Design Monitoring Programs to Detect Change Effectively
Monitor at locations and at a frequency sufficient to detect change with reasonable sensitivity. Assess
background variability before the project begins and conduct a minimum detectable change analysis as
described in section 3.4.2 to determine a sampling frequency sufficient to document the anticipated
magnitude of change with statistical confidence (Spooner et al. 1987, Richards and Grabow 2003).
Although lag time will still be a factor in actual system response, a paired-watershed design (Clausen and
Spooner 1993, King et al. 2008), where data from an untreated watershed are used to control for weather
and other sources of variability, is one of the most effective ways to document water quality changes in
response to improvements in land management. If a monitoring program is intended to detect trends,
evaluate statistical power to determine the best sampling frequency for the project. See Meals et al. (2011)
and section 7.8.2.4 for additional information on trend analysis.
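The logic of the paired-watershed design can be sketched in a few lines. The example below fits a calibration-period regression of treated-watershed loads on control-watershed loads (both log10-transformed) and then expresses the treatment-period departure from that relationship as a percent change. It is a simplification: a full analysis would use ANCOVA to test for differences in slope and intercept. The function names and data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def paired_change_percent(cal_control, cal_treated, trt_control, trt_treated):
    """Percent change in the treated watershed relative to the calibration relationship."""
    log = math.log10
    intercept, slope = fit_line([log(v) for v in cal_control],
                                [log(v) for v in cal_treated])
    # Residual = observed treated value minus the value predicted from the control.
    residuals = [log(t) - (intercept + slope * log(c))
                 for c, t in zip(trt_control, trt_treated)]
    mean_residual = sum(residuals) / len(residuals)
    return (10.0 ** mean_residual - 1.0) * 100.0


# Hypothetical paired weekly loads (kg/week).
cal_control = [1.0, 2.0, 4.0, 8.0]
cal_treated = [2.0, 4.0, 8.0, 16.0]
trt_control = [1.5, 3.0, 6.0]
trt_treated = [2.1, 4.2, 8.4]
change = paired_change_percent(cal_control, cal_treated, trt_control, trt_treated)
print(f"Change relative to calibration relationship: {change:.1f}%")
```

Because the control watershed experiences the same weather as the treated one, the regression removes much of the year-to-year variability that would otherwise mask the treatment effect.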
Target monitoring to the effects expected from the BMPs implemented, in the sequence that those effects
are anticipated. For example, when the ultimate goal is habitat/biota restoration in an urban stream, if
BMPs are implemented first that will alter peak stormflows, design the monitoring program to track
changes in hydrology. After the needed hydrologic restoration is achieved, monitoring can be redirected
to track expected changes in channel morphology. Once changes in channel morphology are documented,
monitoring can then focus on assessment of habitat and biological community response. Response of
stream hydrology is likely to be quicker than restoration of stream biota and would therefore be a
valuable—and more prompt—indicator of progress.
6.3 Integrating Monitoring and Modeling
Monitoring and modeling are the primary tools for assessment of NPS watershed projects. By providing
essential data about the resource, water quality monitoring has long been the foundation of water quality
management. Monitoring can, however, be expensive and technically challenging and requires careful
design and execution to achieve objectives. Modeling, on the other hand, is indispensable in evaluating
alternative scenarios and in forecasting water quality over time. Modeling is also technically demanding,
and application of a model in the absence of observed data can contribute to legitimate skepticism and
uncertainty about model results that can compromise the utility of modeling for watershed management.
To meet the demands of future watershed programs, it is essential that we integrate the strengths of both
tools.
6.3.1 The Roles of Monitoring and Modeling
Both monitoring and modeling have distinctive roles to play in watershed projects. In many cases these
roles are complementary, but in some cases one tool is used as a substitute for the other for various
reasons including budgetary constraints.
6.3.1.1 Monitoring
Monitoring plays many key roles in watershed projects:
• Identify and document water quality problems and impairments
• Assess compliance with water quality standards and other regulations
• Establish baseline conditions
• Provide credibility to project planning
• Provide data to support modeling
• Document water quality change
• Assess program or project effectiveness
• Provide information for adaptive management
• Inform stakeholders
• Contribute to behavior change by documenting actual watershed conditions
Monitoring can provide fundamental knowledge about the generation, fate, and transport of NPS
pollutants. Monitoring data provide hard evidence of water quality impairment and represent the best
evidence of water quality restoration. When successful, monitoring can effectively document water
quality response to land treatment, e.g., reductions in nutrient and sediment loads resulting from livestock
exclusion in Vermont (Meals 2004) and reductions in nitrate loading to streams from prairie restoration in
Iowa (Schilling and Spooner 2006).
Water quality monitoring also presents important challenges in watershed projects. Over the past decades,
many projects have failed to show water quality response through monitoring. Such failure can be
attributed to shortcomings in both design (e.g., failure to measure what is needed, inadequate sampling
frequency) and execution (e.g., failure to evaluate data regularly, inadequate staff training, poor
institutional integration) (Reid 2001). As noted throughout this guidance, monitoring must be conducted
under appropriate objectives with a statistical design that can meet those objectives. Monitoring must be
conducted at a frequency adequate to meet objectives (e.g., to document change) and for an adequate
duration (e.g., to overcome lag time). Water quality monitoring must be executed effectively, with careful
attention to procedural issues like collection of collateral information, regular data evaluation, and
institutional coordination.
6.3.1.2 Modeling
Modeling also plays a number of critical roles in watershed projects:
• Provide initial estimates of flow and pollutant loads
• Link sources to impacts and evaluate relative magnitudes of sources
• Identify critical areas for management
• Predict pollutant reductions and waterbody response to management actions
• Support informed choices among alternative actions
• Analyze cost-effectiveness of alternatives
• Address issues of lag time in system response to treatment
• Guide monitoring design
• Help build knowledge of natural processes and response to treatment
• Provide opportunities for collaborative learning and stakeholder involvement
Modeling can forecast future response to alternatives too numerous or time-consuming to monitor
effectively. Modeling provides the means to assemble, express, and test the current state of knowledge
and point the way for future investigations. Model applications for watershed evaluation range from the
simple to the very complex. An Oklahoma project used SIMPLE (Spatially Integrated Models for
Phosphorus Loading and Erosion) to identify high-risk P sources in the Peacheater Creek watershed to
design a land treatment plan (Storm et al. 1996). A recent Vermont project used SWAT (Soil and Water
Assessment Tool) to identify critical source areas for NPS P in a large agricultural watershed (Winchell et
al. 2011). National CEAP Cropland Studies in the Upper Mississippi River Basin (USDA-NRCS 2012),
the Chesapeake Bay region (USDA-NRCS 2011a), and the Great Lakes system (USDA-NRCS 2011b)
used SWAT and other models to quantify the effects of conservation practices currently present on the
landscape in the regions and to project potential benefits that could be gained by implementation of
additional conservation treatment in under-treated agricultural acres.
Modeling also presents significant challenges in watershed projects. Some data are always required - for
model parameterization, calibration, and validation - and inadequate supporting data can significantly
degrade model performance. Technical and financial resources are required for modeling that may be
difficult to assemble and sustain. Modeling may be impaired by inappropriate or outdated information
(e.g., soil surveys, use of Curve Numbers), or by lack of fundamental understanding of how
agroecosystems or urban stormwater processes function. The credibility of model application may be
threatened by lack of appropriate algorithms for simulating conservation or urban stormwater
management practices and by failure to adequately analyze uncertainties associated with model results.
Model results nearly always require analysis and interpretation to be useful; failure to provide such
support can lead to justifiable skepticism about model results. The Chesapeake Bay model, for example,
has been criticized for overstating environmental achievements in contradiction to monitoring data (GAO
2005, Powledge 2005). Disputes or misunderstandings over pollutant loads simulated by the SPARROW
model in the Mississippi River Basin have generated economic and political conflict over source
identification and choices of alternatives for remediation (Robertson et al. 2009).
6.3.2 Using Monitoring and Modeling Together
Clearly, monitoring and modeling are not mutually exclusive and can be better integrated in watershed
protection and restoration projects. Each tool has its own strengths and weaknesses and neither can by
itself provide all the information needed for water quality decision-making or program accountability.
Integration of monitoring and modeling should address these elements:
Use the strengths of both tools.
• Monitoring is the best tool for project evaluation, but modeling simulations and extrapolations can
play an important role in projecting whether project success is likely.
• Modeling can provide guidance on where and how the on-the-ground monitoring is best conducted.
• Modeling is better than monitoring for comparing numerous scenarios and extrapolating effects into
the future.
• Data collected through monitoring are essential for calibration and validation of models, and for
establishing credibility for modeling-derived information.
• The validity of model application and the type of questions that are addressed must be corroborated
by watershed stakeholders.
• Models are underutilized for collaborative learning purposes. Their use within collaborative
frameworks must be promoted to incorporate feedback from stakeholders while demonstrating how
decisions at the field-scale affect the environment.
Begin with project objectives and design the monitoring-modeling program to do what can be done well
to meet those objectives.
• Begin with a clear set of objectives. Determine if the objectives need to be quantitative (e.g., reduce
N load by 40%), if they need to incorporate time frames and scales for which accountability is
needed (e.g., reduced N load at a tributary mouth or at each HUC-12), and if there is a need to
attribute changes to activities on the land (e.g., in response to implementing specific management
measures at a specified level).
• Establish a clear set of evaluation objectives. Define the specific questions to be answered with
monitoring (measure N load reductions with a minimum detectable change of 20%) and with
modeling (estimate and project N load reductions within ±15% of actual loads). Incorporate the
needed time frames and scales within the objectives, and ensure that monitoring and modeling
objectives are complementary. For example, the monitoring objective might be to measure N load
reductions with a minimum detectable change of 20% in select smaller watersheds within 10 years
and assess with an MDC of 30% long-term N load trends at mouths of larger watersheds and the
state line. The evaluation objective for modeling might be to estimate and project N load reductions
within 15% of actual loads in select smaller watersheds within 10 years and estimate and project
within 15% of actual long-term N load trends at mouths of larger watersheds and the state line.
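A quantitative modeling objective such as "within 15% of actual loads" can be checked directly once monitored loads are available. The sketch below shows one way to do so; the function names, station names, and load values are hypothetical.

```python
def relative_error_percent(modeled, observed):
    """Signed error of a modeled load as a percentage of the observed load."""
    if observed == 0:
        raise ValueError("observed load must be nonzero")
    return 100.0 * (modeled - observed) / observed


def within_tolerance(modeled, observed, tolerance_percent=15.0):
    """True when the modeled load falls within the stated tolerance of the observed load."""
    return abs(relative_error_percent(modeled, observed)) <= tolerance_percent


# Hypothetical annual N loads (kg/yr): (station, modeled, observed).
loads = [
    ("small watershed A", 9200.0, 10000.0),
    ("small watershed B", 5600.0, 4500.0),
    ("state line", 410000.0, 385000.0),
]
for station, modeled, observed in loads:
    error = relative_error_percent(modeled, observed)
    status = "meets" if within_tolerance(modeled, observed) else "misses"
    print(f"{station}: {error:+.1f}% ({status} the 15% objective)")
```

The same comparison is only meaningful if the monitored load is itself known with enough precision, which is why the monitoring and modeling objectives need to be set together.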
Address uncertainty at the outset and include uncertainty in all monitoring and modeling reporting.
• Select a model based on project needs - models selected solely by cost or convenience before
setting objectives are unlikely to be satisfactory.
• Create a monitoring program that will collect the number and frequency of samples that are
required to provide useful information - monitoring designs based solely on budget may yield data
that cannot serve project objectives.
Select the appropriate designs.
• Establish the monitoring design(s). Address overall experimental design (e.g., long-term trend,
upstream-downstream) and specify the elements of monitoring scale, sample type, station locations,
sampling frequency, collection and analysis methods, land use/land treatment monitoring, and data
management (see chapters 2 and 3).
• Select the modeling approach. Determine which model(s) to use, input data requirements and
availability, model testing locations and procedures, and procedures for output analysis. Make
certain that adequate technical skill and support are available for the selected approach.
Pay attention to source data.
• Availability of data at consistent scales and of known quality is essential to an integrated
monitoring-modeling effort.
• Spatially- and temporally-explicit land treatment and agricultural management data are necessary
for both water quality monitoring and watershed modeling.
• Identify common needs of monitoring and modeling. Share precipitation, land use, land treatment,
and other data. Use monitored flow and water quality data to calibrate and validate the model(s).
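When monitored flow or load data are used to calibrate and validate a model, agreement is often summarized with goodness-of-fit statistics. The sketch below computes two widely used ones, the Nash-Sutcliffe efficiency and percent bias, for separate calibration and validation periods. The choice of these statistics, the function names, and the data are assumptions made for illustration; the text above does not prescribe particular metrics.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def percent_bias(observed, simulated):
    """Percent bias; positive values mean the model underestimates on average."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


# Hypothetical monthly loads (kg), split into calibration and validation periods.
periods = {
    "calibration": ([12.0, 30.0, 55.0, 41.0, 18.0, 9.0],
                    [14.0, 27.0, 50.0, 44.0, 20.0, 8.0]),
    "validation": ([10.0, 35.0, 62.0, 38.0, 15.0, 7.0],
                   [15.0, 28.0, 48.0, 45.0, 22.0, 11.0]),
}
for name, (observed, simulated) in periods.items():
    print(f"{name}: NSE = {nash_sutcliffe(observed, simulated):.2f}, "
          f"PBIAS = {percent_bias(observed, simulated):+.1f}%")
```

Reporting the statistics separately for the validation period guards against a model that fits the calibration data well but performs poorly on data it has not seen.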
Evaluate the suitability of both monitoring data/programs and proposed model(s) for the project in the
project planning stage, before a project is funded and underway.
• Evaluate existing and planned monitoring data for quality, consistency, and suitability for project
purposes.
• Evaluate candidate watershed models for applicability to watershed characteristics, technical
competence, and resources necessary to apply and support modeling in the project.
• Verify that important watershed characteristics (e.g., claypan soils) and conservation and
stormwater management practice functions can be adequately represented in the selected model.
Integrate data analysis and reporting.
• Combine systems for discharge calculations, loads calculated from monitoring data, and land
use/land treatment data.
• Link monitoring data to a GIS framework used for modeling.
• Provide for compatibility between monitoring data and model(s) to permit efficient use of
monitoring data for model calibration and validation.
• Facilitate analysis of small-scale monitoring and modeling to develop input parameters for large-
scale model application(s).
Include a documentation plan for both monitoring and modeling.
• Use a formal Quality Assurance Project Plan (QAPP) to guide and document all aspects of the
monitoring and modeling efforts.
• Lay out the purpose of model application and the justification for the selection of a particular
model.
• Document the model name and version and the source of the model.
• Identify and document model assumptions.
• Document data requirements and sources of data sets to be used.
• Provide estimates of the uncertainty associated with modeling and monitoring results, particularly
when they are used to quantify the environmental benefits of practices.
Develop a communication strategy. Control expectations from the beginning by addressing monitoring
and modeling uncertainty explicitly. Avoid overly optimistic projections.
Be aware of potential differences in precision and accuracy of modeling results vs. monitoring data.
Monitoring data may be used to identify trends or changes in water quality (see sections 7.7 and 7.8);
such trends are identified in the context of statistical confidence, based largely on the characteristics of
the monitoring program (see MDC, section 3.4.2). Model predictions, however, may show changes in
water quality without the benefit of statistical trend analysis and thus suggest very small trends that
cannot be verified by monitoring data. Monitoring data may, for example, support an MDC of 20% for
phosphorus concentration, while a model may predict a 7% reduction. This situation is not necessarily
contradictory, but calls for a bit of realistic caution in application and interpretation of model results.
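To see why a modeled change can fall below what monitoring is able to confirm, the sketch below computes an approximate MDC for a step change in log10-transformed data and compares a model-predicted reduction against it. It uses one common formulation with a normal quantile in place of Student's t and ignores autocorrelation; these are simplifying assumptions made here, and section 3.4.2 remains the reference for the actual procedure. Function names and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def mdc_percent(log10_sd, n_pre, n_post, alpha=0.05):
    """Approximate minimum detectable change (%) for a step change in log10 data.

    Uses a one-sided normal quantile as a large-sample stand-in for Student's t.
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    mdc_log = z * log10_sd * math.sqrt(1.0 / n_pre + 1.0 / n_post)
    return (1.0 - 10.0 ** (-mdc_log)) * 100.0


def is_verifiable(predicted_reduction_percent, mdc):
    """True when a predicted reduction is at least as large as the MDC."""
    return predicted_reduction_percent >= mdc


# Hypothetical program: 60 samples before and 60 after, SD of 0.3 log10 units.
mdc = mdc_percent(log10_sd=0.3, n_pre=60, n_post=60)
predicted = 7.0  # model-predicted reduction (%)
print(f"MDC is about {mdc:.0f}%; a predicted {predicted:.0f}% reduction "
      f"{'can' if is_verifiable(predicted, mdc) else 'cannot'} be confirmed by monitoring")
```

Increasing the number of samples or reducing unexplained variability lowers the MDC, which narrows the gap between what a model can predict and what monitoring can verify.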
Finally, in practical terms, project water quality monitoring and watershed modeling activities must be
closely coordinated so that information from each effort can be collected, shared, and combined at
appropriate times to meet project goals. Preliminary model runs to identify critical subwatersheds, for
example, can also be used to help select monitoring station locations. Similarly, water quality data that are
analyzed in a timely fashion as described in section 3.10.2 are more likely to be available at the right time
for model calibration and validation.
6.4 Supporting BMP and Other Databases
6.4.1 General Considerations
Monitoring is often performed to develop a better understanding of BMP effectiveness, characterize
reference conditions over broad geographic areas, determine effluent characteristics, or address other
purposes not directly related to problem assessment or watershed project evaluation. In some cases this
monitoring can be done in conjunction with problem assessment or project evaluation to maximize the
return on resources expended, but this monitoring is often done separately.
The basic steps presented in chapters 2 and 3 should also be applied to development of monitoring plans
in support of BMP and other databases. Some of the specifics may not apply, however, such as watershed
characterization or monitoring of meteorological variables in cases where urban stormwater BMPs are
assessed in a laboratory setting. Pollutant transport mechanisms and pollutant source activities may be of
little interest in monitoring designed to establish reference conditions. Still, the focus on objectives must
be the driving force behind all monitoring design.
For new databases, decisions need to be made regarding the types and quality of data that will be
included. Development of a QAPP (see chapter 8) is an important first step in defining data needs and
data quality expectations for the database.
When monitoring to support existing databases, it is essential that data requirements are reviewed and
understood before the monitoring plan is developed to ensure that suitable data will be collected. For
example, those managing the International Stormwater BMP Database have developed guidance with
recommended BMP monitoring protocols that are directly related to requirements of the database, and
have established a recommended protocol for evaluating BMP performance (Geosyntec and WWE 2009).
This database is described in section 6.4.2.
Databases may have specific requirements for monitoring designs (e.g., above/below), sampling type
(grab or composite), sampling frequencies, specific variables (e.g., EPA Method 365.4 for total P), and
other monitoring details, as well as requirements for reporting information on the study conditions and
features. For example, it may be required that designs for BMPs are reported in accordance with industry
standards, or that a specific level of detail be reported for soils or crops. All of these requirements need to
be reviewed and understood before monitoring begins.
Data format, approaches to data analysis, and data transmittal requirements may also be specified.
Questions and issues associated with these requirements need to be addressed up front to prevent
problems later.
The single most important step to take when monitoring in support of database development is for those
performing the monitoring to communicate with those managing the databases to ensure that monitoring,
data analysis, reporting, and data management requirements are understood and that the proposed
monitoring plan is suitable before monitoring begins.
6.4.2 International Urban Stormwater BMP Database
The International Stormwater BMP Database (www.bmpdatabase.org/) is a database of over 530 BMP
studies, performance analysis results, tools for use in BMP performance studies, monitoring guidance,
and other study-related publications. The overall purpose of the project is to provide scientifically sound
information to improve the design, selection, and performance of BMPs. Data obtained from BMP studies
are expected to help create a better understanding of factors influencing BMP performance.
The database is focused on field studies of post-construction, permanent BMPs (International Stormwater
BMP Database 2013). Data entry requirements are specified in a user's guide (WWE and Geosyntec
2010). Options for BMPs include structural BMPs, non-structural BMPs, low-impact development sites,
and composite BMPs. Monitoring results may include precipitation, flow, water quality, and settling
velocity.
Guidance is provided on approaches to determining BMP performance using concentrations, loads, and
volume reductions (Geosyntec and WWE 2009). The guidance emphasizes comparing the average Event Mean
Concentration (EMC) or storm load at the outlet with that at the inlet. Examining
the cumulative distribution of each of the outlet and inlet storm EMCs allows for more detailed
examination of the efficiency at different inlet loadings. This approach, the Effluent Probability Method
(Strecker et al. 2003, Erickson et al. 2010), is described in more detail in section 7.7.2.
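The two comparisons described above can be illustrated with a minimal sketch. It assumes paired lists of storm EMCs for the inlet and outlet; the function names, the example values, and the choice of the Weibull plotting position i/(n+1) are illustrative assumptions, not part of the database guidance.

```python
from statistics import mean


def emc_percent_reduction(inlet_emcs, outlet_emcs):
    """Percent reduction in the average EMC from inlet to outlet."""
    mean_in = mean(inlet_emcs)
    mean_out = mean(outlet_emcs)
    return 100.0 * (mean_in - mean_out) / mean_in


def empirical_cdf(values):
    """Return (value, cumulative probability) pairs, sorted by value.

    Uses the Weibull plotting position i / (n + 1), an assumed convention.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / (n + 1)) for i, v in enumerate(ordered)]


# Hypothetical storm EMCs (mg/L suspended solids), for illustration only
inlet = [120.0, 85.0, 200.0, 60.0, 150.0]
outlet = [40.0, 35.0, 55.0, 30.0, 45.0]

reduction = emc_percent_reduction(inlet, outlet)
inlet_cdf = empirical_cdf(inlet)
outlet_cdf = empirical_cdf(outlet)
```

Plotting the two cumulative distributions on the same axes shows whether the outlet concentrations are lower across the whole range of events or only for the larger inlet concentrations, which a single average cannot reveal.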
The database structure and contents may be downloaded from the project website and used solely for the
following purposes (International Stormwater BMP Database 2013):
• Research and analysis related to BMP performance and costs, characterization of urban runoff,
characterization of receiving water impacts, and characterization of the ability of BMPs to meet
water quality goals or criteria.
• Use of database structure and/or data entry spreadsheets to track performance data for regional,
state, watershed or local purposes or for subsequent upload to the International Stormwater BMP
Database.
6.5 References
BBNEP (Buzzards Bay National Estuary Program). 2008. Turning the Tide on Shellfish Bed Closures in
Buzzards Bay During the 1990s. Buzzards Bay National Estuary Program, East Wareham, MA.
Accessed February 12, 2016. http://www.buzzardsbay.org/shellclssuccess.htm.
Bester, M.L., E.O. Frind, J.W. Molson, and D.L. Rudolph. 2006. Numerical investigation of road salt
impact on an urban wellfield. Ground Water 44:165-175.
Bratton, J.F., J.K. Bohlke, P.M. Manheim, and D.E. Krantz. 2004. Submarine ground water in Delmarva
Peninsula coastal bays: ages and nutrients. Ground Water 42:1021-1034.
Chadwick, J.W., S.P. Canton, and R.L. Dent. 1986. Recovery of benthic invertebrate communities in
Silver Bow Creek, Montana following improved metal mine wastewater treatment. Water, Air, &
Soil Pollution 28:427-438.
Clausen, J.C. 2007. Jordan Cove Watershed Project Final Report. University of Connecticut, College of
Agriculture and Natural Resources, Department of Natural Resources Management and
Engineering. Accessed January 8, 2016.
http://jordancove.uconn.edu/jordan_cove/publications/final_report.pdf.
Clausen, J.C. and J. Spooner. 1993. Paired Watershed Study Design. 841-F-93-009. Prepared for
S. Dressing, U.S. Environmental Protection Agency, Office of Water, Washington, DC. Accessed
February 12, 2016.
Cravotta, C.A., III., R.A. Brightbill, and M.J. Langland. 2009. Abandoned mine drainage in the Swatara
Creek basin, southern anthracite coalfield, Pennsylvania, USA: 1. streamwater-quality trends
coinciding with the return of fish. Mine Water and the Environment 29(3):176-199.
Druschel, G., A. Hartmann, R. Lomonaco, and K. Oldrid. 2005. Determination of Sediment Phosphorus
Concentrations in St. Albans Bay, Lake Champlain: Assessment of Internal Loading and
Seasonal Variations of Phosphorus Sediment-Water Column Cycling. Final Report prepared for
Vermont Agency of Natural Resources, by University of Vermont, Department of Geology,
Burlington.
Erickson, A.J., P.T. Weiss, J.S. Gulliver, and R.M. Hozalski. 2010. Analysis of Long-Term Performance.
In Stormwater Treatment: Assessment and Maintenance. ed. J.S. Gulliver, A.J. Erickson, and
P.T. Weiss. University of Minnesota, St. Anthony Falls Laboratory. Minneapolis, MN. Accessed
February 12, 2016. http://stormwaterbook.safl.umn.edu/pollutant-removal/analysis-long-term-
performance.
Galeone, D.G. 2005. Pequea and Mill Creek watersheds section 319 NMP project: effects of streambank
fencing on surface-water quality. NWQEP Notes 118:1-6, 9. North Carolina State University
Cooperative Extension, Raleigh. Accessed March 15, 2016.
http://www.bae.ncsu.edu/programs/extension/wqg/issues/notes118.pdf.
GAO (Government Accountability Office). 2005. Chesapeake Bay Program: Improved Strategies are
Needed to Better Assess, Report, and Manage Restoration Progress. GAO-06-96. US Government
Accountability Office, Washington, D.C.
Geosyntec and WWE (Wright Water Engineers, Inc.). 2009. Urban Stormwater BMP Performance
Monitoring. Prepared for U.S. Environmental Protection Agency, Water Environment Research
Foundation, Federal Highway Administration, and Environmental and Water Resources Institute of
the American Society of Civil Engineers, by Geosyntec Consultants and Wright Water Engineers,
Inc., Washington, DC. Accessed February 12, 2016.
http://www.bmpdatabase.org/Docs/2009%20Stormwater%20BMP%20Monitoring%20Manual.pdf.
Giroux, M., and R. Royer. 2007. Long term effects of phosphate applications on yields, evolution of P
soil test, saturation, and solubility in two very rich soils. Agrosolutions 18:17-24. Inst. de
recherche et de developpement en agroenvironnement (IRDA), Quebec, Canada. (In French, with
English abstract.)
International Stormwater BMP Database. 2013. Version 03 24 2013. Prepared for Water Environment
Research Foundation (WERF), American Society of Civil Engineers (ASCE)/Environmental and
Water Resources Institute (EWRI), American Public Works Association (APWA), Federal
Highway Administration (FHWA), and U.S. Environmental Protection Agency (EPA), by Wright
Water Engineers, Inc. and Geosyntec Consultants, Washington, DC. Accessed February 17, 2016.
http://www.bmpdatabase.org/.
Jaworski, N. 1990. Retrospective of the water quality issues of the upper Potomac estuary. Aquatic
Sciences 3:11-40.
King, K.W., P.C. Smiley, B. Baker, and N.R. Fausey. 2008. Validation of paired watersheds for assessing
conservation practices in Upper Big Walnut Creek Watershed, Ohio. Journal of Soil and Water
Conservation. 63(6):380-395.
Larsen, D.P., J. Van Sickle, K.W. Malueg, and P.O. Smith. 1979. The effect of wastewater phosphorus
removal on Shagawa Lake, Minnesota: phosphorus supplies, lake phosphorus, and chlorophyll a.
Water Resources 13:1259-1272.
LCBP (Lake Champlain Basin Program). 2008. State of the Lake and Ecosystem Indicators Report-2008.
Lake Champlain Basin Program, Grand Isle, VT. Accessed February 17, 2016.
http://www.lcbp.org/media-center/publications-library/state-of-the-lake/.
Marutani, T., M. Kasai, L.M. Reid, and N.A. Trustrum. 1999. Influence of storm-related sediment storage
on the sediment delivery from tributary catchments in the Upper Waipaoa River, New Zealand.
Earth Surface Processes and Landforms 24:881-896.
McCollum, R.E. 1991. Buildup and decline of soil phosphorus: 30-year trends on a Typic Umprabuult.
Agronomy Journal 83:77-85.
Meals, D.W. 1992. Water Quality Trends in the St. Albans Bay, Vermont Watershed Following RCWP
Land Treatment. In The National Rural Clean Water Program Symposium, Orlando, FL,
September 1992. ORD EPA/625/R-92/006. U.S. Environmental Protection Agency, Office of
Research and Development, Washington, DC. pp. 47-58.
Meals, D.W. 2001. Lake Champlain Basin Agricultural Watersheds Section 319 National Monitoring
Program Project, Final Project Report: May, 1994-September, 2000. Vermont Department of
Environmental Conservation, Waterbury, VT.
Meals, D.W. 2004. Water Quality Improvements Following Riparian Restoration In Two Vermont
Agricultural Watersheds. In Lake Champlain: Partnership and Research in the New Millennium,
ed. T. Manley et al., pp. 81-96. Kluwer Academic/Plenum Publishers.
Meals, D.W., S.A. Dressing, and T.E. Davenport. 2010. Lag time in water quality response to best
management practices. Journal of Environmental Quality 39:85-96.
Meals, D.W., J. Spooner, S.A. Dressing, and J.B. Harcum. 2011. Statistical Analysis for Monotonic
Trends, Tech Notes #6, September 2011. Prepared for U.S. Environmental Protection Agency, by
Tetra Tech, Inc., Fairfax, VA. Accessed March 16, 2016. https://www.epa.gov/polluted-runoff-
nonpoint-source-pollution/nonpoint-source-monitoring-technical-notes.
Nearing, M.A., L.J. Lane, E.E. Alberts, and J.M. Laflen. 1990. Prediction technology for soil erosion by
water: status and research needs. Soil Science Society of America Journal 54:1702-1711.
Newbold, J.D., S. Herbert, B.W. Sweeney, and P. Kiry. 2009. Water quality functions of a 15-year-old
riparian forest buffer system. NWQEP Notes 130:1-9. North Carolina State University
Cooperative Extension, Raleigh. Accessed March 15, 2016.
http://www.bae.ncsu.edu/programs/extension/wqg/issues/notes130.pdf.
Newson, J. 2007. Measurement and modeling of sediment transport in a northern Idaho stream. Masters
thesis, University of Idaho, College of Graduate Studies, Biological and Agricultural
Engineering, Moscow, ID.
Phillips, S.W. and B.D. Lindsey. 2003. The Influence of Ground Water on Nitrogen Delivery to the
Chesapeake Bay. Fact Sheet FS-091-03. U.S. Geological Survey, MD-DE-DC Water Science
Center, Baltimore, MD. Accessed March 15, 2016. http://md.water.usgs.gov/publications/fs-091-
03/index.html.
Powledge, F. 2005. Chesapeake Bay restoration: a model of what? BioScience 55(12):1032-1038.
Reid, L.M. 2001. The epidemiology of monitoring. Journal of American Water Resources Association
37(4):815-820.
Richards, R.P., and G.L. Grabow. 2003. Detecting reductions in sediment loads associated with Ohio's
conservation reserve enhancement program. Journal of American Water Resources Association
39:1261-1268.
Robertson, D.M., G.E. Schwarz, D.A. Saad, and R.B. Alexander. 2009. Incorporating uncertainty into the
ranking of SPARROW model nutrient yields from Mississippi/Atchafalaya River Basin
watersheds. Journal of American Water Resources Association 45(2):534-549.
Schilling, K.E., and J. Spooner. 2006. Effects of watershed-scale land use change on stream nitrate
concentrations. Journal of Environmental Quality 35:2132-2145.
Schilling, K.E., and C.F. Wolter. 2007. A GIS-based groundwater travel time model to evaluate stream
nitrate concentration reductions from land use change. Environmental Geology 53:433-443.
Seo, D. and R. Canale. 1999. Analysis of sediment characteristics and total phosphorus models for
Shagawa Lake. Journal of Environmental Engineering 125(4):346-350.
Sharpley, A.N., S. Herron, and T. Daniel. 2007. Overcoming the challenges of phosphorus-based
management in poultry farming. Journal of Soil and Water Conservation 62:375-389.
Spooner, J., C.J. Jamieson, R.P. Maas, and M.D. Smolen. 1987. Determining statistically significant
changes in water quality pollutant concentrations. Lake Reservoir Management 3:195-201.
Spooner, J., L.A. Szpir, D.E. Line, D. Meals, G.L. Grabow, D.L. Osmond and C. Smith. 2011. 2011
Summary Report: Section 319 National Monitoring Program Projects, National Nonpoint Source
Watershed Project Studies. North Carolina State University, Biological and Agricultural
Engineering Department, NCSU Water Quality Group, Raleigh, NC. Accessed March 15, 2016.
http://www.bae.ncsu.edu/programs/extension/wqg/319monitoring/toc.html.
STAC (Scientific and Technical Advisory Committee). 2005. Understanding "Lag Times" Affecting the
Improvement of Water Quality in the Chesapeake Bay. STAC Publ. 05-001. Chesapeake Bay
Program, Edgewater, MD. Accessed March 15, 2016.
Storm, D.E., G.J. Sabbagh, M.S. Gregory, M.D. Smolen, D. Toetz, D.R. Gade, C.T. Haan, and T.
Kornecki. 1996. Basin-Wide Pollution Inventory for the Illinois River Comprehensive Basin
Management Program-Final Report. Prepared for U.S. Environmental Protection Agency and
Oklahoma Conservation Commission, by Oklahoma State University, Departments of Biosystems
and Agricultural Engineering, Agronomy, and Zoology, Stillwater, OK. Accessed March 15,
2016.
Strecker, E.W., M.M. Quigley, and B. Urbonas. 2003. A Reassessment of the Expanded EPA/ASCE
National BMP Database. In Proceedings of the World Water and Environmental Congress
2003, June 23-26, 2003, Philadelphia, PA. ed. P. Bizier and P. DeBarry, ISBN 0-7844-0685-5,
American Society of Civil Engineers, Reston VA. pp. 555-574. Accessed March 15, 2016.
Tomer, M.D., and M.R. Burkart. 2003. Long-term effects of nitrogen fertilizer use on ground water
nitrate in two small watersheds. Journal of Environmental Quality 32:2158-2171.
USDA-NRCS (U.S. Department of Agriculture-Natural Resources Conservation Service). 2011a.
Assessment of the Effects of Conservation Practices on Cultivated Cropland in the Chesapeake
Bay Region. U.S. Department of Agriculture, Natural Resources Conservation Service. Accessed
February 16, 2016.
http://www.nrcs.usda.gov/wps/portal/nrcs/detail/national/technical/nra/ceap/na/?&cid=stelprdb1041684.
USDA-NRCS (U.S. Department of Agriculture-Natural Resources Conservation Service). 2011b.
Assessment of the Effects of Conservation Practices on Cultivated Cropland in the Great Lakes
Region. U.S. Department of Agriculture, Natural Resources Conservation Service. Accessed
February 16, 2016.
http://www.nrcs.usda.gov/wps/portal/nrcs/detail/national/technical/nra/ceap/na/?&cid=stelprdb1045403.
USDA-NRCS (U.S. Department of Agriculture-Natural Resources Conservation Service). 2012.
Assessment of the Effects of Conservation Practices on Cultivated Cropland in the Upper
Mississippi River Basin. U.S. Department of Agriculture, Natural Resources Conservation
Service. Accessed February 16, 2016.
http://www.nrcs.usda.gov/wps/portal/nrcs/detail/national/technical/nra/ceap/na/?&cid=nrcs143_014161.
USEPA (U.S. Environmental Protection Agency). 2008. Handbook for Developing Watershed Plans to
Restore and Protect Our Waters. EPA 841-B-08-002. U.S. Environmental Protection Agency,
Office of Water, Washington, DC. Accessed March 15, 2016. http://www.epa.gov/polluted-
runoff-nonpoint-source-pollution/handbook-developing-watershed-plans-restore-and-protect.
VADCR (Virginia Department of Conservation and Recreation). 1997. Alternative Watering Systems for
Livestock-the Middle Fork Holston River Builds on Success. In Section 319 Success Stories:
Volume II - Highlights of State and Tribal Nonpoint Source Programs. EPA 841-R-97-001. U.S.
Environmental Protection Agency, Office of Water, Washington, DC. Accessed March 15, 2016.
Whitney, L., and R. Hafele. 2006. Stream restoration and fish in Oregon's Upper Grand Ronde river
system. NWQEP Notes 123:1-6, 9-12. North Carolina State University Cooperative Extension,
Raleigh. Accessed March 15, 2016.
http://www.bae.ncsu.edu/programs/extension/wqg/issues/notes123.pdf.
Winchell, M., D. Meals, S. Folle, J. Moore, D. Braun, C. DeLeo, K. Budreski, and R. Schiff. 2011.
Identification of Critical Source Areas of Phosphorus with the Vermont Sector of the Missisquoi
Bay Basin. Final Report to Lake Champlain Basin Program. Accessed March 15, 2016.
http://www.lcbp.org/techreportPDF/63B Missisquoi CSA.pdf.
WWE (Wright Water Engineers, Inc.) and Geosyntec. 2010. International Stormwater Best Management
Practices (BMP) Database User's Guide for BMP Data Entry Spreadsheets - Release Version
3.2. Prepared for Water Environment Research Foundation, Federal Highway Administration,
Environmental and Water Resources Institute of the American Society of Civil Engineers, U.S.
Environmental Protection Agency, and American Public Works Association, by Wright Water
Engineers, Inc. and Geosyntec Consultants, Denver, CO. Accessed February 12, 2016.
http://www.bmpdatabase.org/Docs/2010%20BMP%20Database%20User's%20Guide.pdf.
Zhang, T.Q., A.F. MacKenzie, B.C. Liang, and C.F. Drury. 2004. Soil test phosphorus and phosphorus
fractions with long-term phosphorus addition and depletion. Soil Science Society of America
Journal 68:519-528.
-------
Monitoring and Evaluating Nonpoint Source Watershed Projects Chapter 7
7 Data Analysis
By J. Spooner, J.B. Harcum, D.W. Meals, S.A. Dressing, and R.P. Richards
7.1 Introduction
This chapter of the guidance examines options for planning and analyzing data collected in nonpoint
source watershed studies. The emphasis of this chapter is on projects at the watershed or subwatershed
level, although evaluation of individual BMPs is also addressed. These analysis approaches complement
the watershed project design considerations discussed in section 2.4 of this guidance.
Specifically, this chapter discusses the following topics:
• Exploratory data analysis
• Data transformations that might be necessary to prepare data for valid statistical analysis
• Methods to deal with extreme values, censored data, and missing data
• Data analysis methods for water quality problem assessment
• Data analysis methods for project planning
• Data analysis methods for assessing BMP or watershed project effectiveness
• Techniques for load estimation
The reader may wish to refer to chapter 4 (Data Analysis) of the 1997 guidance (USEPA 1997b), which
was written largely to provide a primer on statistical methods for analysis of data generated by nonpoint
source watershed projects. The 1997 guidance addresses various topics on statistical analysis in
considerable detail, including estimation and hypothesis testing, characteristics of environmental data, and
basic descriptive statistics. In addition, the 1997 guidance compares parametric and nonparametric tests,
recommends appropriate methods for routine analyses, and provides numerous examples of the
application of various statistical tests. Additional resources for data analysis approaches are also available
in various Tech Notes and other publications (see References).
7.2 Overview of Statistical Methods
A wide range of parametric and nonparametric methods exists for analyzing environmental data. In some
cases, graphical methods will be suitable to meet analysis objectives; more rigorous statistical analysis
approaches may be best otherwise. This section provides a brief overview and summary of key features of
these various methods. Readers should consult the 1997 guidance (USEPA 1997b) and additional sources
(e.g., statistics textbooks and software packages) for greater detail.
Recommended statistical methods are summarized in Table 7-1 through Table 7-6 based on watershed
project phase or need because experience indicates that this type of grouping will be practical for many
involved in such efforts. Methods in these tables are recommended, but the tables do not include all
possible alternative approaches. Additional discussion and illustrative examples follow in sections 7.3
through 7.8. Because of its importance to many watershed projects, especially those addressing TMDLs,
pollutant load estimation is addressed separately in section 7.9. While most of the methods described in
this chapter are more commonly applied to water chemistry, flow, and precipitation data, many can also
be applied to biological data. Recommended approaches for analyzing biological data are
described in detail in chapter 4, and some examples are also provided in this chapter.
7.2.1 Exploratory Data Analysis and Data Transformations
It is often necessary to work with a mix of information and data during the initial stages of watershed
projects. A major first task involves gathering and organizing available information and data, followed by
an initial examination of the data to help identify water quality problems, pollutants, sources, and
pathways. Exploratory data analysis techniques are well suited to this project phase, and should also be
applied as a first step to all data subsequently collected by the project. Exploratory data analysis is also a
critical first step in beginning to analyze water quality data from watershed projects that are underway,
before undertaking more complex analysis.
Exploratory data analysis (EDA) provides basic information about the data record, including the data distribution
and an assessment of missing and extreme values. The presence of autocorrelation and seasonal cycles
should also be evaluated. EDA can also be useful to examine clusters in the data or relationships between
variables and/or sample locations.
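As a minimal sketch of this first pass, the example below computes basic univariate statistics, counts missing values, and estimates the lag-1 autocorrelation coefficient. The function names and the data values are hypothetical, and the autocorrelation estimate assumes an equally spaced series with no gaps.

```python
from statistics import mean, median, quantiles, variance


def summarize(values):
    """Basic univariate statistics for a list of observations (None = missing)."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # quartiles of the non-missing values
    return {
        "n": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
        "iqr": q3 - q1,
        "variance": variance(present),
    }


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation coefficient of an equally spaced, gap-free series."""
    m = mean(series)
    num = sum((series[i] - m) * (series[i + 1] - m) for i in range(len(series) - 1))
    den = sum((x - m) ** 2 for x in series)
    return num / den


# Hypothetical weekly total P concentrations (mg/L); None marks a missed sample
tp = [0.12, 0.15, None, 0.11, 0.45, 0.13, 0.14, 0.10]
summary = summarize(tp)
r1 = lag1_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0])
```

In this made-up record the single high value (0.45) pulls the mean above the median, the kind of feature that a summary of this type is meant to expose before any formal test is chosen.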
Table 7-1 summarizes exploratory data analysis methods by analytical objective. The type of method
(parametric, nonparametric, graphical), basic data requirements (e.g., distribution, independence), and
major cautions and concerns are also included in the table.
Table 7-1. Exploratory data analysis methods (see discussion, section 7.3)
| Analytical Objective | Recommended Method | Method Type* | Data Requirements | Major Cautions and Concerns |
|---|---|---|---|---|
| Describe behavior of variable(s) | Univariate statistics (e.g., range, mean, median, interquartile range, variance) | P, N | Minimal | Mean is sensitive to extreme values; median may be preferred measure of central tendency. |
| Evaluate distribution and assumptions of independence and constant variance | Plots (histogram, probability, lag-n autocorrelation, cumulative distribution functions); skewness, kurtosis; Durbin-Watson statistic to detect presence of autoregressive lag 1 pattern; Shapiro-Wilk test; Kolmogorov-Smirnov test | P, N, G | Minimal to moderate | Data transformations to satisfy likely statistical testing assumptions should be examined. Autocorrelation functions (ACF) which examine autocorrelation at each lag require equal time-space data and appropriate software. |
| Identify extreme values and anomalies | Plots (e.g., time series, boxplots); compute frequency or proportion of observations exceeding threshold value; cumulative frequency or duration plots | G, P, N | Minimal | Outliers should not be deleted if error cannot be confirmed. |
| Observe seasonal or other cycles | Plots (time series, seasonal boxplots) | G | Minimal | More intensive techniques are generally required to confirm and quantify trends. |
| | Examination of autocorrelation pattern | | | Use software that can generate autocorrelation function (ACF) graphs (see section 7.3.6). |
| Find clusters or groupings | Cluster analysis, principal components analysis, canonical correspondence analysis, discriminant function analysis | P, N, G | Minimal | Factors determining groupings may be difficult to discern or interpret. |
| Preliminary comparison of two or more locations or time periods | Boxplots | G | | Visual comparisons should be confirmed by numerical tests. |
| Examine relationships between variables | Correlation, regression | P | Data must be normally distributed to apply parametric analysis | Graphical analysis should be used to confirm and understand numerical correlation coefficient. Correlation does not guarantee causation. |
| | Spearman's rho or rank correlation coefficient | N | Can be used when both independent and dependent variables are ordinal or when one variable is ordinal and the other is continuous | |
| | Bivariate scatterplots; LOWESS smoother | G | Minimal | Visual comparisons should be confirmed by numerical tests. |
*Key to Method Type: G = Graphical, N = Nonparametric, P = Parametric
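Two of the numerical checks listed in Table 7-1, Spearman's rho and the Durbin-Watson statistic, are simple enough to sketch directly. The helper names and example values below are hypothetical; a statistical package would normally be used and would also report significance levels, which this sketch does not attempt.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no lag-1 autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical paired observations: flow and concentration rise together
flow = [1.0, 2.0, 3.0, 4.0, 5.0]
conc = [0.1, 0.4, 0.9, 1.6, 2.5]
rho = spearman_rho(flow, conc)
dw = durbin_watson([1.0, -1.0, 1.0, -1.0])
```

Because rho is computed on ranks, the curved but strictly increasing relationship in this example still yields a coefficient of 1, which is why the rank-based measure suits monotonic, nonlinear associations.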
Table 7-2 summarizes methods that can be applied to adjust (e.g., transform) data based on the
requirements of methods (e.g., normal distribution required for parametric analyses) to be used in the next
phase of data analysis. This table also identifies methods that can be used to address problems caused by
unexpected events, including washed out monitoring equipment, floods, droughts, ice, failed BMP
implementation plans, and equipment and laboratory errors.
Table 7-2. Methods for adjusting data for subsequent analysis (see discussion, section 7.3)
| Analytical Objective | Recommended Method | Method Type* | Data Requirements | Major Cautions and Concerns |
|---|---|---|---|---|
| Obtain a normal distribution (for parametric approaches) | log10 and loge (ln) are the most commonly used transformations in water resources | P | Original data values must be positive and non-zero. | Other transformations (e.g., Box-Cox) may be required to achieve normal distribution. Very small numbers and legitimate zero values may require a different transformation (e.g., log10(value + n)). Transformations will not correct issues of independence. Back-transformations may be difficult to interpret. |
| | Arc-sine square root transformation | P | Used for proportions | |
| | If distribution assumptions cannot be met, adopt methods resistant to errors in results caused by deviations from the assumption of normality | N | Minimal | Nonparametric procedures may still have other assumptions that must be met for usage. |
| Accommodate extreme values | Use methods resistant to errors in results caused by extreme values, such as nonparametric trend tests or frequency analyses | N, G | Moderate | If distributional assumptions can be met, then parametric tools tend to be more powerful. If the data are missing due to right censoring (too high to measure), techniques discussed in section 7.4 should be considered. |
| | Data stratification (e.g., by seasons, base flow, storm, and floods) | | Moderate | |
| | Use covariate/explanatory variable such as flow to help 'explain' the influence of extreme values | P | Minimal | |
| | Utilize log transformed data to minimize skewness caused by the extreme values | P | Minimal | |
| Manage missing data | Data aggregation to create uniform time intervals by averaging or using the median value | | | Missing values are ignored in most nonparametric and parametric tests; however, some tests require equal spacing of observations. Data aggregation to accommodate missing data or changes in data frequency must be done with care. |
| | Estimate missing values based upon regression relationship from other sites or events | | Regression relationship with data from similar basin (e.g., flow). Sometimes it may also be appropriate to use the flow/concentration relationship at the same station to estimate missing concentration data | Only use when the data meet the assumptions for regression analysis and the sample size is large enough that the regression relationship is reliable. |
| Adjust for autocorrelation | Aggregation of data to less frequent observations | | Minimal | Aggregation must be consistent (e.g., monthly mean of n daily observations), not mix of different sample frequencies. |
| | Use of parametric time series analysis techniques available in many statistical software tests | | Generally equally time-spaced data observations | |
| | Adjust the standard error for the trend (difference or slope) to accommodate for the reduced effective degrees of freedom | | Need to calculate the autocorrelation coefficient at lag 1 for this adjustment (see section 7.3.6) | |
| Adjust for seasonality or other cycles | Use non-parametric trend tests that adjust for seasonality | N | Generally the month of year is needed for the input data set | Software may correct for both autocorrelation and seasonality. |
| | Add explanatory variables that 'explain' the seasonal effect | P | e.g., add data columns representing seasonal components for seasonal cycle (e.g., sin/cos terms) or monthly indicator variables | |
| | Use time series models that include lag term(s) to incorporate seasonal cycles into statistical models | | | |
*Key to Method Type: G = Graphical, N = Nonparametric, P = Parametric
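The autocorrelation and seasonality adjustments in Table 7-2 can be sketched as follows. This is an illustration under stated assumptions: it uses a common large-sample approximation for a lag-1 autoregressive series, which may differ in detail from the procedure in section 7.3.6, and the function names are hypothetical.

```python
import math


def effective_sample_size(n, rho1):
    """Approximate number of independent observations in an AR(1) series.

    Assumes the large-sample approximation n * (1 - rho1) / (1 + rho1).
    """
    return n * (1.0 - rho1) / (1.0 + rho1)


def adjusted_standard_error(se, rho1):
    """Inflate a standard error computed under independence for lag-1 autocorrelation."""
    return se * math.sqrt((1.0 + rho1) / (1.0 - rho1))


def seasonal_terms(month):
    """Sine and cosine covariates for an annual cycle, given month of year (1-12)."""
    angle = 2.0 * math.pi * month / 12.0
    return math.sin(angle), math.cos(angle)


# Hypothetical case: 100 weekly samples with a lag-1 autocorrelation of 0.5
n_eff = effective_sample_size(100, 0.5)
se_adj = adjusted_standard_error(1.0, 0.6)
sin_march, cos_march = seasonal_terms(3)
```

With a lag-1 coefficient of 0.5, 100 weekly samples carry roughly the information of 33 independent ones under this approximation, which is why an unadjusted standard error overstates the significance of a trend.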
7.2.2 Dealing with Censored Data
Censored values are usually associated with limitations of measurement or sample analysis, and are
commonly reported as results below or above measurement capacity of the available analytical
equipment. Table 7-3 summarizes techniques to use when dealing with censored data.
Table 7-3. Methods to deal with censored data (see discussion, section 7.4)
| Analytical Objective | Recommended Method | Method Type* | Data Requirements | Major Cautions and Concerns |
|---|---|---|---|---|
| Accommodate censored data (i.e., values less than detection or reporting limits) | Use parametric (e.g., maximum likelihood estimation (MLE) and robust regression on order statistics (ROS)) or nonparametric procedures designed to accommodate censored data. | P, N, G | Knowledge about analytical detection limits, practical quantitation limits, and data reporting conventions is required to interpret the meaning of censored data. | Although common, substitution of half the detection limit is not recommended as more robust tools are readily available. |
*Key to Method Type: G = Graphical, N = Nonparametric, P = Parametric
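To make the maximum likelihood approach in Table 7-3 concrete, the sketch below fits a lognormal distribution to data containing values below a detection limit. It rests on several assumptions not taken from this guidance: the data are lognormal, each censored result is known only to lie below its detection limit, and a crude direct search stands in for the optimizer of a statistical package. All names and values are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_likelihood(mu, sigma, detects, limits):
    """Log-likelihood of log-scale mean and sd, up to an additive constant."""
    ll = 0.0
    for x in detects:  # detected values contribute the normal density
        z = (math.log(x) - mu) / sigma
        ll += -math.log(sigma) - 0.5 * z * z
    for d in limits:  # censored values contribute P(value < detection limit)
        p = normal_cdf((math.log(d) - mu) / sigma)
        ll += math.log(max(p, 1e-300))
    return ll


def fit_censored_lognormal(detects, limits, iterations=200):
    """Estimate (mu, sigma) on the log scale by a simple coordinate search."""
    logs = [math.log(x) for x in detects]
    mu = sum(logs) / len(logs)
    sigma = max((sum((y - mu) ** 2 for y in logs) / len(logs)) ** 0.5, 1e-3)
    best = log_likelihood(mu, sigma, detects, limits)
    step = 1.0
    for _ in range(iterations):
        improved = False
        for dmu, dsig in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand_mu, cand_sigma = mu + dmu, sigma + dsig
            if cand_sigma <= 0:
                continue
            ll = log_likelihood(cand_mu, cand_sigma, detects, limits)
            if ll > best:
                mu, sigma, best, improved = cand_mu, cand_sigma, ll, True
        if not improved:
            step /= 2.0  # shrink the search step when no neighbor is better
    return mu, sigma


# Hypothetical data: five detected values and three results reported as <0.02 mg/L
detects = [0.05, 0.08, 0.12, 0.20, 0.30]
limits = [0.02, 0.02, 0.02]
mu_hat, sigma_hat = fit_censored_lognormal(detects, limits)
```

The censored results pull the estimated log-scale mean below the mean of the detected values alone, without assigning any arbitrary substitute value to the nondetects.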
7.2.3 Data Analysis for Water Quality Problem Assessment
Problem assessment is generally considered the first phase of a watershed project. Data analysis at this
stage typically involves using historical data to assess whether water quality standards are being met or
whether designated beneficial uses of waters are threatened, and the causes (e.g., pollutants) and sources
of identified problems. More refined problem assessment will include determination of pollutant
pathways and critical areas needing restoration or BMPs. Methods to support these types of analyses are
summarized in Table 7-4.
Table 7-4. Data analysis methods for problem assessment (see discussion, section 7.5)
Analytical Objective: Summarize existing conditions
  Recommended Method (Method Type*):
    • Univariate statistics (e.g., mean, median, range, variance, interquartile range) for different sampling locations, time series analysis for long-term trends and seasonality, and regression analysis comparing pollutant concentrations or loads to hydraulic variables (P, N)
    • Boxplots and/or time series plots for different sampling locations (G)
  Data Requirements: Minimal to moderate
  Major Cautions and Concerns: To compare locations within or across watersheds, data from different locations must be consistent and comparable (e.g., synoptic survey, multiple sampling stations).

Analytical Objective: Assess compliance with water quality standards
  Recommended Method (Method Type*):
    • Identification of extreme values with boxplots or time series plots; calculation of means (arithmetic or geometric) over specific time period(s) (P)
    • Frequency or probability plots, duration curves (G)
  Data Requirements: Minimal to moderate
  Major Cautions and Concerns: Criteria for determining impairment vary (e.g., single observation exceedance vs. geometric mean over n observations); both monitoring program and data analysis must be tailored to regulatory requirements.

Analytical Objective: Identify major pollutant sources
  Recommended Method (Method Type*):
    • Correlation or regression analysis or Kendall's Tau for monotonic association of water quality constituent(s) vs. subwatershed characteristic(s) (e.g., total P concentration vs. manured acres) (P, N, G)
    • Compare boxplots or bivariate scatterplots from monitored subwatersheds with distinctive land use and/or management; ANCOVA analysis (G, P)
  Data Requirements: Concurrent data from monitored subwatersheds: subwatershed land use and/or management data
  Major Cautions and Concerns: Correlation does not guarantee causation; consider transport and other pollutant delivery mechanisms.

Analytical Objective: Define critical areas
  Recommended Method (Method Type*):
    • t-Test, ANOVA, Kruskal-Wallis, cluster analysis to identify significant differences in pollutant concentration/load among multiple sampling points (P, N)
  Data Requirements: Concurrent data from monitored subwatersheds: parametric or nonparametric analysis can be used depending on data distribution
  Major Cautions and Concerns: Conditions determining pollutant generation (e.g., storm event, season, management schedule) must be considered in drawing conclusions about critical areas. Modeling may be useful.
*Key to Method Type: G = Graphical, N = Nonparametric, P = Parametric
7.2.4 Project Planning Data Analysis
Project planning involves both land treatment and monitoring design. Decisions regarding project
duration, BMP and restoration needs and scheduling, and implementation tracking and monitoring should
all be supported by information and appropriate analysis. The quality of information available will vary
from project to project. In many cases, the analysis and decisions will have to rely on historical data
(perhaps collected for other purposes) or on data from other sites in the region. The methods summarized
in Table 7-5 are recommended to assist with various aspects of project planning.
Table 7-5. Data analysis methods for project planning (see discussion, section 7.6)
Analytical Objective: Determine pollutant reductions needed to meet water quality objectives
  Recommended Method (Method Type*):
    • Mass balance/TMDL; receiving waterbody relationships; load-duration curves; reference watershed (P, G)

Analytical Objective: Estimate BMP treatment needs
  Recommended Method (Method Type*):
    • Compare estimated pollutant reduction efficiencies of planned BMPs with reductions needed (P)
  Data Requirements: Appropriate local or published values on BMP pollutant reduction efficiencies
  Major Cautions and Concerns: Published efficiencies do not generally account for interactions in multiple-BMP systems or pollutant transport or delivery issues beyond edge of field/BMP site. Modeling may be a better approach.

Analytical Objective: Estimate minimum detectable change (MDC)
  Recommended Method (Method Type*):
    • MDC calculation (Spooner et al. 2011a) (P)
  Data Requirements: Mean and variance of water quality variable(s) of interest; parameters of planned monitoring program (e.g., sampling frequency)
  Major Cautions and Concerns: If MDC is larger than anticipated response to treatment, may need to re-evaluate extent of planned land treatment and/or duration of water quality monitoring. If data are unavailable from subject watershed, data from elsewhere must be used.

Analytical Objective: Locate monitoring stations
  Recommended Method (Method Type*) and Data Requirements:
    • Identify major pollutant sources, critical areas as in Table 7-4 if data are available (P). Data: concurrent data from subwatersheds (e.g., from a synoptic survey)
    • Target land areas of particular land use/management and/or expected treatment implementation (G). Data: land use and management data, estimates of treatment adoption
  Major Cautions and Concerns: Conditions determining pollutant generation (e.g., storm event, season, management schedule) must be considered. Station location depends on many other factors, including project objectives, monitoring design, and site requirements.
*Key to Method Type: G = Graphical, N = Nonparametric, P = Parametric
7.2.5 BMP and Project Effectiveness Data Analysis
Table 7-6 includes recommended methods for assessing the effectiveness of BMPs and watershed
projects. In general, the analytical objective of both kinds of efforts is to document change in pollutant
concentrations or loads or both in response to BMP implementation. These methods are linked to
monitoring designs that are described in section 2.4. Methods for assessing BMP and project
effectiveness using biological data are presented in chapter 4.
Table 7-6. Data analysis methods for assessing BMP or watershed project effectiveness
(see discussion, sections 7.7 and 7.8)
Analytical Objective: BMP efficiency

Monitoring Design Used: Plot
  Recommended Method (Method Type*):
    • ANOVA (P)
    • Kruskal-Wallis (N)
  Data Requirements: Data must meet assumptions for parametric statistics to apply; otherwise use nonparametric test
  Major Cautions and Concerns: Plot data may not easily extrapolate to field or watershed scale.

Monitoring Design Used: Input/output
  Recommended Method (Method Type*):
    • Paired t-Test, Wilcoxon, or Mann-Whitney tests of input vs. output EMCs (Event Mean Concentrations) or loads (P, N)
    • Effluent probability (N)
  Data Requirements: Data must meet assumptions for parametric statistics to apply; otherwise use nonparametric test
  Major Cautions and Concerns: Representing change in load or concentrations as a percent reduction may not be representative for low input concentrations or loads.

Analytical Objective: Watershed project effectiveness

Monitoring Design Used: Paired watershed
  Recommended Method (Method Type*):
    • ANCOVA, paired t-Test, Wilcoxon Rank Sum, Mann-Whitney (P, N)
  Data Requirements: Data from control and treatment watersheds must exhibit significant linear relationship. Conditions (e.g., precipitation, discharge) must be in similar range during calibration and treatment periods.
  Major Cautions and Concerns: Quality of relationship between control and treatment watersheds determines level of change that can be detected. Addition of covariates to paired regression model may improve ability to document response to treatment.

Monitoring Design Used: Above/below-Before/after
  Recommended Method (Method Type*):
    • t-Test of input vs. output EMCs or loads, ANCOVA, Wilcoxon Rank Sum, Mann-Whitney (P, N)
  Data Requirements: Data must meet assumptions for parametric statistics to apply; otherwise use nonparametric test
  Major Cautions and Concerns: Change in pollutant concentration or load measured at the below station may be difficult to detect if concentrations or loads at the above station are high.

Monitoring Design Used: Single Watershed Monotonic Trend
  Recommended Method (Method Type*):
    • Linear regression on time; multiple linear regression on time and covariates; linear regression on time, covariates, and periodic functions (P)
    • Mann-Kendall; Mann-Kendall on residuals from regression on covariates; Seasonal Kendall (N)
  Data Requirements: Numerous techniques are available, depending on objectives, available data on covariates, seasonality
  Major Cautions and Concerns: Trend analysis is most effective with data sampled consistently at fixed locations and fixed time intervals for period sufficient to overlap seasonal or management cycles that do not represent real trends. Covariates such as stream flow, season, etc. are essential to assist with isolating trends due to BMPs.
Monitoring Design Used: Single Watershed Step Trend
  Recommended Method (Method Type*):
    • t-Test before and after step, Wilcoxon Rank Sum, Mann-Whitney (P, N)
  Data Requirements: Data must meet assumptions for parametric statistics to apply; otherwise use nonparametric test
  Major Cautions and Concerns: Selection of step change point in time must be made a priori and related to watershed activities, e.g., onset of treatment. Covariates such as stream flow, season, etc. are essential to assist with isolating trends due to BMPs.

Monitoring Design Used: Multiple watersheds
  Recommended Method (Method Type*):
    • t-Test or Wilcoxon Rank-Sum test; ANOVA or Kruskal-Wallis test; regression analysis (P, N)
      Data Requirements: Data must meet assumptions for parametric statistics to apply; otherwise use nonparametric test
      Major Cautions and Concerns: Watersheds need to fall into 2 groups (e.g., treated and untreated) for t-Test or Wilcoxon Rank-Sum test. For more than two groups use ANOVA or Kruskal-Wallis.
    • Boxplots of results from watershed groupings (e.g., treated/untreated) (G)
      Data Requirements: Minimal
      Major Cautions and Concerns: Visual comparisons should be confirmed by numerical tests.

Analytical Objective: Linking land treatment to water quality changes
  Recommended Method (Method Type*):
    • Correlation, regression of pollutant concentration or load on land treatment metric(s) (P, N)
  Data Requirements: Requires quantitative monitoring data on land treatment. Use of explanatory variables (e.g., precipitation, animal populations) may strengthen analysis.
  Major Cautions and Concerns: Water quality and land treatment data must be collected on comparable spatial and temporal scales. Monitored pollutants must match pollutants addressed by implemented BMPs.
*Key to Method Type: G = Graphical, N = Nonparametric, P = Parametric
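As an illustration of one nonparametric trend method listed in Table 7-6, the following is a minimal sketch of the Mann-Kendall test. It uses the standard normal approximation without a tie correction and makes no adjustment for seasonality or autocorrelation. The function name and the annual mean TP values are hypothetical.

```python
import math

def mann_kendall(series):
    """Return the Mann-Kendall S statistic and its approximate Z score.

    Assumes observations are equally spaced in time, independent,
    and contain no tied values (no tie correction is applied).
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)   # +1, 0, or -1 for each pair
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)     # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# Hypothetical annual mean TP concentrations (mg/L) over ten years
annual_tp = [0.21, 0.19, 0.20, 0.17, 0.16, 0.18, 0.14, 0.13, 0.12, 0.11]
s, z = mann_kendall(annual_tp)
print("S =", s, " Z =", round(z, 2))
```

A Z value beyond about ±1.96 suggests a monotonic trend at the 5 percent significance level (two-sided). Statistical packages typically add tie and seasonal corrections that this sketch omits.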
7.2.6 Practice Datasets
This chapter presents a wide range of parametric and nonparametric methods, including several
illustrative examples. Because practice is the best way to learn how to apply these methods, example
datasets and eight problems are provided to allow readers to test their skills. Using their own statistics
software, readers are encouraged to apply the tests indicated in Table 7-7 to the example datasets listed in
the fourth column. The objective and statistical tests are listed in the second and third columns of the
table. The specific problems and the answers are given in the files identified in the last column.
Table 7-7. Practice datasets
Problem Number | Objective | Test | Dataset in Sampledata.xlsx | Problem and Answer File
1 | Test for conformance to normal distribution | Graphical, skewness, kurtosis, Shapiro-Wilk, Kolmogorov | 1 | normality.pdf
2 | Characterize data | Descriptive statistics | 1 | description.pdf
3 | Compare two groups | t-Test; Wilcoxon/Kruskal-Wallis | 1 | 2groups.pdf
4 | Compare input/output for a BMP | Paired t-Test; Wilcoxon Rank Sum Test | 2 | pairedtests.pdf
5 | Compare three groups | ANOVA; Kruskal-Wallis | 1 | 3groups.pdf
6 | Examine relationships between variables/stations | Correlation; simple linear regression | 1 | correlationregress.pdf
7 | Assess change due to treatment in paired-watershed design | ANCOVA | 1 | pairedancova.pdf
8 | Calculate MDC for a single station | Minimum detectable change | 3 | mdc.pdf
All files are available at: https://www.epa.gov/polluted-runoff-nonpoint-source-pollution/monitoring-and-evaluating-nonpoint-source-watershed
7.3 Exploratory Data Analysis (EDA) and Data Adjustment
After a monitoring program is up and running, it is never too soon to begin to evaluate the data. Basic
data evaluation should not wait until the end of the project or when a report is due; regular examination of
the data should be part of ongoing project activities. A carefully designed monitoring program will have
the right kind of data, collected at appropriate times and locations to achieve the objectives, and a plan for
analyzing the data.
Describing and summarizing the data in a way that conveys their important characteristics is one purpose
of EDA. When deciding how to analyze any data set, it is essential to consider the characteristics of the
data themselves. Evaluation of characteristics like non-normal distribution and autocorrelation will help
determine the appropriate statistical analysis. Some common characteristics of water quantity and quality
data (Helsel and Hirsch 2002) include:
• A lower bound of zero - no negative values are possible.
• Presence of outliers, extreme low or high values that occur infrequently, but usually somewhere in
the data set (outliers on the high side are common).
• Skewed distribution, due to outliers or influential data.
• Non-normal distribution.
• Censored data - concentration data reported below some detection limit or above a certain value.
• Strong seasonal patterns.
• Autocorrelation - consecutive observations strongly correlated with each other.
• Dependence on other uncontrolled or unmeasured variables - values strongly co-vary with such
variables as streamflow, precipitation, or sediment grain size.
As such, the overall goal of data exploration is to uncover the underlying structure of a data set and set the
stage for more detailed analysis, including hypothesis testing. Specific objectives for data exploration
might include:
• To find potential problems with data quality such as data entry error, lab or collection errors
• To find extreme values and potential anomalies
• To describe the behavior of one or more variables
• To test distribution and assumptions of independence and constant variance
• To see cycles and trends
• To find clusters or groupings
• To make preliminary comparisons of two or more locations or time periods
• To examine relationships between variables
At the start, check the data for conformity with original plans and QA/QC procedures. Use the approved
project Quality Assurance Project Plan (QAPP) as a guide; see section 8.3 for details on preparing a
QAPP. A key part of EDA is to verify the data entered in the data sets are valid and not anomalies due to
data entry, lab, or collection errors.
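A minimal sketch of such a screening pass is shown below. The record layout, the function name `screen_records`, and the 5.0 mg/L plausibility limit are assumptions for illustration; actual checks should follow the project QAPP. Flagged records are listed for review against lab and field logs, and nothing is deleted.

```python
def screen_records(records, lower_bound=0.0, upper_limit=None):
    """Flag records that should be checked against lab and field logs.

    records: list of (sample_id, value) pairs; value is None when missing.
    Returns a list of (sample_id, reason) pairs.
    """
    flags = []
    seen_ids = set()
    for sample_id, value in records:
        if sample_id in seen_ids:
            flags.append((sample_id, "duplicate sample id"))
        seen_ids.add(sample_id)
        if value is None:
            flags.append((sample_id, "missing value"))
        elif value < lower_bound:
            flags.append((sample_id, "below lower bound"))
        elif upper_limit is not None and value > upper_limit:
            flags.append((sample_id, "above plausibility limit"))
    return flags

# Hypothetical weekly total phosphorus results (mg/L)
weekly_tp = [
    ("1995-05-01", 0.08),
    ("1995-05-08", -0.02),  # negative concentration: likely entry error
    ("1995-05-15", None),   # sample not analyzed
    ("1995-05-15", 0.11),   # same sample id entered twice
    ("1995-05-22", 11.0),   # possible misplaced decimal point
]
for sample_id, reason in screen_records(weekly_tp, upper_limit=5.0):
    print(sample_id, reason)
```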
Understanding how the data behave with respect to such features as distribution(s), cycles, clusters,
seasonality, and autocorrelation assists with selecting the appropriate statistical tests to evaluate
achievement of project goals. Data analysis to address project goals will involve more thorough statistical
analysis that will be guided by understanding of the data set through EDA.
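As one example of such a check, the lag-1 autocorrelation coefficient indicates how strongly each observation resembles the one before it. The sketch below assumes evenly spaced observations with no gaps; the function name and the weekly flow values are hypothetical.

```python
from statistics import mean

def lag1_autocorrelation(x):
    """Lag-1 autocorrelation coefficient of an evenly spaced series."""
    x_bar = mean(x)
    numerator = sum((x[t] - x_bar) * (x[t + 1] - x_bar) for t in range(len(x) - 1))
    denominator = sum((v - x_bar) ** 2 for v in x)
    return numerator / denominator

# Hypothetical weekly mean streamflow (ft3/sec) during a recession period
weekly_flow = [42.0, 38.5, 35.0, 33.2, 30.1, 29.4, 27.8, 26.0, 25.5, 24.1]
print("Lag-1 autocorrelation:", round(lag1_autocorrelation(weekly_flow), 2))
```

Values near zero suggest consecutive observations are close to independent; values well above zero indicate that tests assuming independence may overstate significance.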
A secondary reason for doing exploratory data analysis is to start to make sense of the data actually
collected. The purpose of EDA is to get a feel for the data, develop ideas about what they can reveal, and
draw some preliminary conclusions. EDA is similar to detective work - sifting through all the facts,
looking for clues, and putting the pieces together to find suggestions of meaning in the data.
This process of data exploration differs from traditional hypothesis testing. Testing of hypotheses always
requires some initial assumption or prediction about the data, such as "The BMP will reduce phosphorus
loads." Although formulating and testing hypotheses is the foundation of good data analysis, the first pass
through of the data should not be too narrowly focused on testing a single idea. Hypothesis testing is
discussed in section 7.6.1, which focuses on data analysis for project planning. EDA is an approach to
data analysis that postpones the usual assumptions about what kind of model the data follow in favor of
the more direct approach of allowing the data themselves to reveal their underlying structure. EDA uses a
variety of techniques, both numerical and graphical, to open-mindedly search for new, perhaps
unexpected, insights into the data. Approaches to EDA for aquatic system biological data have been
described by EPA as part of the Causal Analysis/Diagnosis - Decision Information System (CADDIS)
(USEPA 2010).
Data exploration is a necessary first step in analyzing monitoring data. Unless initial exploration reveals
indications of patterns and relationships, there is unlikely to be something for further analysis to confirm.
J. W. Tukey (1977), the founder of exploratory data analysis, said, "EDA can never be the whole story,
but nothing else can serve as the ... first step."
For more information, refer to Tech Notes 1: Monitoring Data - Exploring Your Data, The First Step
(Meals and Dressing 2005).
7.3.1 Steps in Data Exploration
Data exploration is a process of probing more deeply into the dataset, while being careful to stay
organized and avoid errors. Here are some typical steps in the process of EDA (modified from Jambu
1991), although not all of them may apply to every situation.
1. Data management. In the process of working with the data, files will be created. These files
should be updated, checked, and validated at regular intervals. The importance of data screening
and validation cannot be overemphasized. This should always be done before embarking on
specific analyses, plotting, or other procedures. Be as sure as possible that the data are free from
entry errors, typos, and other mistakes before proceeding.
2. One-dimensional analysis. The first step in really exploring the data is often to simply describe
or summarize the information one variable at a time, independent of other variables. This can be
done using basic statistics on range, central tendency, and variability, or with simple graphs like
histograms, pie charts, or time plots. This kind of information is always useful to put data in
context, even though more intensive statistical analysis will be pursued later.
3. Two-dimensional analysis. Relationships between two variables are often of great interest,
especially if there is a meaningful connection suspected (such as between suspended sediment
and phosphorus) or cause and effect process (such as between rainfall and streamflow).
Relationships between two sampling locations (such as treatment and control watersheds) or
between two time periods (like spring snowmelt and summer) are often of interest. Graphical
techniques like scatter plots and numerical techniques like correlation are often used for this
purpose.
Because graphs summarize data in ways that describe essential information more quickly and completely
than do tables of numbers, graphics are important diagnostic tools for exploring the data. There is no
single statistical tool that is as powerful as a well-chosen graph (Chambers et al. 1983). Enormous
amounts of quantitative information can be conveyed by graphs and the human eye-brain system is
capable of quickly summarizing information, simultaneously appreciating overall patterns and minute
details. Graphs will also be essential in ultimately conveying project results to others. With computers and
software available today, there are no real constraints to graphing data as part of EDA. Graphical display
options are described in section 4.3 of the 1997 guidance (USEPA 1997b).
There are more advanced steps in data exploration including analysis of multiple variables and cluster
analysis (section 7.3.8). Also, see chapter 4 of the 1997 guidance (USEPA 1997b) for background on
some of these methods.
The project goals and the type of monitoring should guide exploration. If monitoring occurs at a single
point while upstream BMPs are implemented gradually, trends may be of the greatest interest. If sampling
for phosphorus above and below a land treatment area, a comparison of phosphorus concentrations at the
two stations might be necessary. For an erosion problem, a relationship between streamflow and
suspended solids concentrations before and after land treatment might be of interest.
The following sections present some specific techniques for exploring data.
7.3.2 Describe Key Variable Characteristics
In most cases, the data should be examined to summarize key characteristics and to determine if the data
satisfy statistical assumptions required for parametric statistical analyses. Data that do not meet
parametric statistical assumptions should be transformed or nonparametric tests should be used. Key
characteristics that are meaningful include central tendency, variability, and distribution.
7.3.2.1 Central Tendency
• The mean is computed as the sum of all values divided by the number of values. The mean is
probably the most common data summary technique in use; however, an extreme value (either high
or low) has much greater influence on the mean than does a more 'typical' value. Because of this
sensitivity to extremes, the mean may not be the best summary of the central tendency of the data.
• The median, or 50th percentile, is the central value of the distribution when the data are ranked in
numerical order. The median is the data value for which half of the observations are higher and half
are lower. Because it is determined by the order of observations, the median is only slightly
affected by the magnitude of a single extreme value. When a summary value is desired that is not
strongly influenced by a few extremes, the median is preferable to the mean.
Both the mean and median should be calculated for comparison.
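The effect of a single extreme value on the two measures can be seen in a short sketch. The suspended sediment values are hypothetical, with one storm sample included.

```python
from statistics import mean, median

# Hypothetical suspended sediment concentrations (mg/L); the last is a storm sample
tss = [12, 15, 9, 14, 11, 13, 10, 240]

tss_mean = mean(tss)
tss_median = median(tss)
print("Mean:", tss_mean)      # pulled upward by the storm sample
print("Median:", tss_median)  # close to the typical values
```

Here the mean is more than three times the median, which signals that the mean alone would misrepresent typical conditions.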
7.3.2.2 Variability
• The sample variance, and its square root the standard deviation, are the most common measures
of the spread (dispersion) of a set of data. These statistics are computed using the squares of the
difference between each data point and the mean, so that outliers influence their magnitudes
dramatically. In data sets with major outliers, the variance and standard deviation may suggest a
much greater spread than exists for the majority of the data. This is a good reason to supplement
numerical statistics with graphical analysis.
• The coefficient of variation (CV), defined as the standard deviation divided by the mean, is a
relative measure of the variability (spread) of the data. The CV is sometimes expressed as a percent,
with larger values indicating higher variability around the mean. Comparing the CV of two data
groups can suggest their relative variability.
• The interquartile range (IQR) is defined as the 75th percentile minus the 25th percentile. Because it
measures the range of the central 50 percent of the data, it is not influenced at all by the 25 percent
of the data on either end and is relatively insensitive to outliers.
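The following sketch computes these measures of spread for a small hypothetical data set containing one outlier. Note that statistical packages use slightly different rules for estimating percentiles, so IQR values may differ somewhat among programs; the default method of Python's `statistics.quantiles` is assumed here.

```python
from statistics import mean, stdev, variance, quantiles

def spread_summary(values):
    """Return variance, standard deviation, CV, and IQR for a list of values."""
    q1, _, q3 = quantiles(values, n=4)  # 25th, 50th, and 75th percentiles
    sd = stdev(values)
    return {
        "variance": variance(values),
        "std_dev": sd,
        "cv": sd / mean(values),
        "iqr": q3 - q1,
    }

# Hypothetical suspended sediment concentrations (mg/L) with one storm sample
tss = [12, 15, 9, 14, 11, 13, 10, 240]
summary = spread_summary(tss)
for name, value in summary.items():
    print(name, round(value, 2))
```

The standard deviation is dominated by the single storm sample, while the IQR reflects the spread of the central half of the data.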
7.3.2.3 Skewness
Water resources data are usually skewed, meaning that the data values are not symmetric around the mean
or median, as extreme values extend out farther in one direction. Streamflow data, for example, are
typically right-skewed because of occasional high-flow events (Figure 7-1). When data are skewed, the
mean is not equal to the median, but is pulled toward the long tail of the distribution by the effects of the
extreme values. The standard deviation is also inflated by the extreme values. Because highly skewed
data restrict the ability to use hypothesis tests that assume the data have a normal distribution, it is useful
to evaluate the skewness of the data. The coefficient of skewness (g) is a common measure of skewness;
a right-skewed distribution has a positive g and a left-skewed distribution has a negative g. There are
multiple measures of skewness with varying possible ranges. Interpretation of skewness values calculated
by Excel, for example, is aided by estimating the standard error of skewness with the following
simplified¹ equation for large samples (<5 percent difference from true value for n>30) (Elliott 2012):
$$\text{Standard Error} = \sqrt{\frac{6}{n}}$$
where n is the sample size. For n=24, the standard error of skewness is 0.5 using the simplified equation.
A skewness value of more than twice this amount (i.e., less than -1 or greater than 1 in this case) indicates
a skewed distribution, but a value between -1 and 1 is not proof that the data are normally distributed.
Other tests such as goodness-of-fit tests (below) must also be performed to determine if the distribution is
normal.
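The calculation can be sketched as follows. The skewness formula shown is the adjusted form used by Excel's SKEW function, and the exact standard error is the standard textbook expression; the function names and flow values are hypothetical.

```python
import math
from statistics import mean, stdev

def skewness(x):
    """Adjusted coefficient of skewness (same form as Excel's SKEW)."""
    n = len(x)
    m, s = mean(x), stdev(x)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in x)

def se_skew_simplified(n):
    """Large-sample approximation: sqrt(6/n)."""
    return math.sqrt(6 / n)

def se_skew_exact(n):
    """Exact standard error of skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

# Hypothetical daily streamflow values (ft3/sec) with occasional high flows
flows = [3, 4, 4, 5, 5, 6, 7, 8, 12, 40]
g = skewness(flows)
se = se_skew_simplified(len(flows))
print("g =", round(g, 2), " twice the standard error =", round(2 * se, 2))
print("Skewed" if abs(g) > 2 * se else "No strong evidence of skewness")
```

With only ten values the simplified standard error is a rough guide, since the approximation is intended for larger samples.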
Figure 7-1. Right-skewed distribution (x-axis: streamflow, ft3/sec)
¹ The true standard error of skewness is calculated as: $$\sqrt{\frac{6n(n-1)}{(n-2)(n+1)(n+3)}}$$
7.3.2.4 Data Distribution
Many common statistical techniques for hypothesis-testing (parametric tests) require, among other
characteristics, that the data be normally distributed. It is common practice to apply tests such as the
Shapiro-Wilk test or the Kolmogorov-Smirnov (KS) test to evaluate the normality of the data; both of
these tests are commonly available in statistical software. The probability plot correlation coefficient
(PPCC) can also be used to test for normality. PPCC is essentially a correlation coefficient between the
data values and their normal score (i.e., data on probability paper) and the interpretation of the PPCC is
similar to that for the correlation coefficient r. This procedure is outlined by Helsel and Hirsch (2002) in
section 4.4 and in Appendix Table B.3 which gives critical values for accepting/rejecting the normal
assumption.
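A minimal sketch of the PPCC calculation follows. It assumes the Blom plotting position for computing normal scores, which is one common choice; the function names and data are hypothetical. The resulting coefficient must still be compared with published critical values for the sample size.

```python
import math
from statistics import NormalDist, mean

def normal_scores(n):
    """Normal scores for ranks 1..n using the Blom plotting position."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def ppcc(values):
    """Correlation between the sorted data and their normal scores."""
    data = sorted(values)
    scores = normal_scores(len(data))
    d_bar, s_bar = mean(data), mean(scores)
    sxy = sum((d - d_bar) * (s - s_bar) for d, s in zip(data, scores))
    sxx = sum((d - d_bar) ** 2 for d in data)
    syy = sum((s - s_bar) ** 2 for s in scores)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical concentrations with one extreme value
raw = [1, 2, 3, 4, 100]
logged = [math.log(v) for v in raw]
print("PPCC, raw data:", round(ppcc(raw), 3))
print("PPCC, log-transformed:", round(ppcc(logged), 3))
```

A PPCC closer to 1 indicates closer agreement with a normal distribution; in this example the log transformation moves the data toward normality.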
Histograms are familiar graphs, where bars are drawn whose height represents the number or fraction of
observations falling into one of several categories or intervals (see Figure 7-1). Histograms are useful for
depicting the shape or symmetry of a data set, especially whether the data appear to be skewed. However,
histogram appearance depends strongly on the number of categories selected for the plot. For this reason,
histograms are most useful to show data that have natural categories or groupings, such as fish numbers
by species, but are more problematic for data measured on a continuous scale such as streamflow or
phosphorus concentration.
Quantile plots (also called cumulative frequency plots) show the percentiles of the data distribution. Many
statistics packages calculate and plot frequency distributions; instructions for manually constructing a
quantile plot can be found in Helsel and Hirsch (2002) and other statistics textbooks. Quantile plots show
many important data characteristics, such as the median or the percent of observations less than or greater
than some critical threshold or frequency. With experience, an analyst can discern information about the
spread and skewness of the data. Figure 7-2 shows a quantile plot of E. coli bacteria in a stream; the
frequency of violation of the Vermont water quality standard can be easily seen (the standard was
exceeded about 65 percent of the time). Flow and load duration curves (see section 7.9.3) are useful tools for
visualizing the distribution of streamflows or pollutant loads across a full range of conditions.
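The values behind a quantile plot can be computed as in the sketch below. It assumes the Cunnane plotting position, which is one of several in common use. The E. coli counts are hypothetical, and the threshold of 77 per 100 ml is used only as an example criterion.

```python
def quantile_positions(values):
    """Pair each sorted value with its cumulative frequency (Cunnane position)."""
    data = sorted(values)
    n = len(data)
    return [(v, (i - 0.4) / (n + 0.2)) for i, v in enumerate(data, start=1)]

def fraction_exceeding(values, threshold):
    """Fraction of observations greater than a threshold."""
    return sum(v > threshold for v in values) / len(values)

# Hypothetical E. coli counts (#/100 ml)
counts = [20, 45, 60, 90, 120, 150, 300, 800, 1500, 4000]
positions = quantile_positions(counts)
for value, freq in positions:
    print(value, round(freq, 3))
print("Fraction above 77/100 ml:", fraction_exceeding(counts, 77))
```

Plotting the cumulative frequency against the value (often on a log scale for bacteria counts) gives the quantile plot; the exceedance fraction corresponds to the portion of the curve lying beyond the threshold.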
Figure 7-2. Quantile plot or cumulative frequency plot of E. coli data, Berry Brook, 1996 (Meals 2001). Axes: cumulative frequency versus E. coli count (#/100 ml), log scale from 10 to 1,000,000; reference line at the Vermont water quality standard (VT WQS) of 77 E. coli/100 ml.
A boxplot presents a schematic of essential data characteristics in a simple and direct way: central
tendency (median), spread (interquartile range), skewness (relative size of the box halves), and the
presence of outliers are all indicated in a simple picture. There are many variations and styles of boxplots,
but the standard boxplot (Figure 7-3) consists of a rectangle spanning the 25th and 75th percentiles, split by
a line representing the median. Whiskers extend vertically to encompass the range of most of the data
(e.g., the 5th and 95th percentiles), and outliers beyond this range are shown by dots or other symbols. The
definition of whiskers and outliers may differ among graphing programs; standard definitions can be
found in statistics textbooks (e.g., Cleveland 1993; Helsel and Hirsch 2002). When boxplots are
presented, the definitions of the rectangle, whiskers, and outlier symbols should be clearly specified.
Figure 7-3. Boxplot of weekly TP concentration, Samsonville Brook, 1995 (Meals 2001)
7.3.2.5 Transformations to Handle Non-normal Data with Parametric Statistical Tests
Evaluations conducted thus far may suggest that the data do not conform to a normal distribution. In cases
where it is desirable or convenient to use statistical tools that require data sets that are normally distributed or
have constant variance, transformation may reduce skewness and result in a data set that is more
normally distributed. Transformation is simply defined as applying the same mathematical operation to all
records in the dataset. Helsel and Hirsch (2002) provide a summary of common transformations.
Statistical software packages will often come with Box-Cox transformation tools that allow the analyst to
identify the best transformation for achieving normality, although logarithmic (e.g., log10 or loge)
transformation is certainly the most common strategy (Box and Cox 1964). Regardless of which
transformation is used, the data analyst should verify that the transformation results in a dataset that
satisfies applicable assumptions.
Subsequent analysis of log-transformed data must be done with care, as quantities such as mean and
variance calculated on the transformed scale are often biased when transformed back to the original scale.
The geometric mean (the mean of the log-transformed data back-transformed to the arithmetic scale), for
example, differs from the mean of the untransformed distribution. Furthermore, results of statistical
analysis may be more difficult to understand or interpret when expressed on the transformed scale.
Typically, when analysis is performed on the log transformed data, the final statistical results are
converted to express the results as a percentage change (see Spooner et al. 2011a for additional details on
this approach).
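The following sketch illustrates both points, assuming natural-log transformation. The function names and values are hypothetical, and the conversion shown is the generic back-transformation of a difference in log means rather than a specific procedure from the cited reference.

```python
import math
from statistics import mean

def geometric_mean(values):
    """Mean of the natural logs, back-transformed to the original scale."""
    return math.exp(mean([math.log(v) for v in values]))

def percent_change(log_difference):
    """Convert a difference in natural-log means to a percent change."""
    return (math.exp(log_difference) - 1) * 100

# Hypothetical concentrations spanning two orders of magnitude
conc = [1, 10, 100]
print("Arithmetic mean:", mean(conc))
print("Geometric mean:", round(geometric_mean(conc), 2))

# A post-treatment minus pre-treatment difference in log means of ln(0.8)
print("Percent change:", round(percent_change(math.log(0.8)), 1))
```

If base-10 logarithms are used instead, the back-transformation uses powers of 10 rather than the exponential function.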
Do not assume that a transformation will solve all the problems with the data distribution. Always test the
characteristics of the transformed data set again. Violations of the assumption of a normal distribution can
lead to incorrect conclusions about the data when parametric tests are used in subsequent hypothesis
testing. With that said, some parametric trend tests are robust to some deviation from normality. From a
practical standpoint it is best to be consistent. For example, if a log transformation is merited for TP
concentrations at most locations in a particular data set, then log transforming all TP for all site locations
is a practical course of action.
If transformed data cannot satisfy the assumptions of parametric statistical analysis, consider
nonparametric techniques for data analysis. With regard to hypothesis testing, there are a host of
nonparametric tests that are robust against non-normality. These tests are often based on the ranks of the
data and the influence of a few extreme values is reduced. However, keep in mind that while the
normality assumption is relaxed, nonparametric tests have other assumptions (constant variance and
independence of data observations) that must be met for their results to be valid. If distributional
assumptions can be met, then parametric tools tend to be more powerful. Many nonparametric procedures
are described in section 4.11.3 in the 1997 guidance and recommended in Table 7-1 through Table 7-6.
7.3.3 Examination for Extreme, Outlier, Missing, or Anomalous Values
7.3.3.1 Extremes and Outliers
Extreme values are frequently encountered in NPS monitoring efforts and include the exceptionally high
and low flow values associated with floods and droughts, respectively. Suspended sediment
concentrations may be exceptionally high during spring runoff when cropland fields are bare or when
streambank slumping occurs. Very low pesticide levels may be observed with increasing time elapsed
since application on cropland. In some cases, the extremes may be more important for water quality than
are typical conditions. For example, the extreme values in some lake variables (e.g., Secchi disc readings,
turbidity, and pH), the duration of the extreme values, and the season may be the dominant influence on
the extent to which lakes support designated beneficial uses. In streams, it is often the extreme low
dissolved oxygen condition that determines the character of the biological community the stream can
support. Extreme concentrations in toxic contaminants such as pesticides may also be more important
than the mean values with respect to acute toxicity to aquatic biota. Nevertheless, extreme concentrations
can have an inordinate effect on some statistical analyses, and the analyst must consider these issues when
selecting data analysis tools.
On the other hand, outliers can result from measurement or recording errors and this should be the first
thing checked (e.g., check lab and field logs). If no error can be found, an outlier should never be rejected
just because it appears unusual or extreme. All samples considered valid after exploratory analysis
contain information that should be considered when analyzing monitoring data. Different subsets of the
same dataset may reveal varying aspects of the condition of the water resource. For example, extreme
conditions may be most important when considering violations of water quality standards or load
allocations from a watershed. Annual or monthly loads may not completely illuminate the severity of a
problem, whereas high loads during extreme flow conditions may account for most of the pollutant load.
It is commonly observed that the majority of annual pollutant export occurs during a small proportion of
the time. Identifying these extremes and understanding the conditions under which they occur may be a
key to understanding and interpreting watershed monitoring results.
One approach for identifying and summarizing extreme values is to describe the situation by computing
the frequency or proportion of observations exceeding some threshold value (e.g., a water quality
criterion). Cumulative frequency or duration plots are also useful to visualize the influence of extreme
values on a dataset. In addition, determine whether most or all of the extreme values can be attributed to
certain conditions in the watershed (e.g., spring runoff, cropland tillage). In these cases, it might be more
useful to stratify the dataset by season or management condition. In this way, monitoring results can be
analyzed by season, and values that were "extreme" in the dataset as a whole may be more easily
interpreted in their respective season(s).
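The exceedance summary and seasonal stratification described above might be sketched as follows. The sample values are made up; the threshold of 77 is borrowed from the Vermont E. coli standard cited later in this chapter, and the function names are hypothetical.

```python
from collections import defaultdict


def exceedance_proportion(values, threshold):
    """Fraction of observations strictly above the threshold."""
    if not values:
        raise ValueError("no observations")
    return sum(v > threshold for v in values) / len(values)


def exceedance_by_stratum(records, threshold):
    """records: iterable of (stratum_label, value) pairs, e.g. season and count."""
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)
    return {label: exceedance_proportion(vals, threshold)
            for label, vals in groups.items()}


# Made-up E. coli counts (organisms/100 mL) tagged by season.
samples = [("winter", 12), ("winter", 30), ("winter", 95),
           ("summer", 400), ("summer", 60), ("summer", 1500), ("summer", 210)]

print(exceedance_proportion([v for _, v in samples], 77))  # whole record
print(exceedance_by_stratum(samples, 77))                  # by season
```

In this made-up record, 4 of 7 observations exceed the threshold overall, but the proportion is 3 of 4 in summer and 1 of 3 in winter, which is the kind of contrast stratification is meant to expose.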
Histograms can be useful to illustrate exceedances of standards, targets, and goals by setting categories or
classes that are outside the standard or target. Quartile plots and boxplots are also useful tools to evaluate
the presence of extreme values.
Boxplots can be a useful visual tool for highlighting extreme values in environmental data. They show
both the spread and the range of the data. Important values visualized by boxplots include the mean (or
the median), and standard error limits (or 25th and 75th percentiles). Values falling outside these 'limits'
depict values that are from the tails of the data distribution.
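A sketch of the percentile-based version of these boxplot values is shown below. The rule of flagging values more than 1.5 times the interquartile range beyond the quartiles is a common convention assumed here, not one stated in this guidance, and the data are made up.

```python
import statistics


def boxplot_summary(values, whisker=1.5):
    """Quartiles plus the values falling outside the whisker limits.

    whisker=1.5 (times the interquartile range) is an assumed convention.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit, high_limit = q1 - whisker * iqr, q3 + whisker * iqr
    outside = [v for v in data if v < low_limit or v > high_limit]
    return {"q1": q1, "median": median, "q3": q3, "outside": outside}


# Made-up concentrations with one extreme value.
print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

Values listed under "outside" are candidates for checking against lab and field logs; they are not automatically errors.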
Time series plots display the data in sequence, with date on the horizontal axis. Figure 7-4 shows a time
series plot of weekly phosphorus concentration data from three stream stations. It is clear that around the
middle of the year, something occurred that led to dramatic spikes in P concentration at Station 2, a
phenomenon demanding further investigation. Field investigation revealed concentrated overland flow
from a new CAFO upstream.
To analyze data sets with extreme values, consider using non-parametric trend tests. If documenting the
number or occurrence of extreme values is an objective (e.g., for evaluation of violations of water quality
standards or pesticide spikes), frequency analyses are useful. Stratifying the data by seasons or flow
conditions (e.g., base flow, storm flows, and flooding) may be helpful in evaluating conditions and trends
within each flow regime. Using flow as an explanatory variable/covariate in trend analysis may be helpful
to explain the influence/importance of the extreme values. Using the log transformation often minimizes
the skewness caused by the extreme values and enables the use of parametric trend techniques. If the data
are missing due to right censoring (too high to measure), techniques discussed in section 7.4 should be
considered.
[Time series plot; legend: Sta. 1, Sta. 2, Sta. 3]
Figure 7-4. Time plot of weekly TP concentration, Godin Brook, 1999 (Meals 2001)
7.3.3.2 Anomalous Values
Plotting the data can also reveal data errors or anomalies. Figure 7-5 shows a time series plot of total
Kjeldahl nitrogen (TKN) data collected from three Vermont streams. Something happened around May,
1996 that caused a major shift in TKN concentrations in all three streams. In addition, it is clear that after
October, no values less than 0.5 mg/L were recorded. In this case, this shift was not the result of some
occurrence in the watersheds, but an artifact of a faulty laboratory instrument, followed by the
establishment of a lower detection limit of 0.50 mg/L. Discovery of this fault, while it invalidated a
considerable amount of prior data, led to correction of the problem in the lab and saved the project major
headaches down the road.
[Time series plot; legend: Series 1, Series 2, Series 3]
Figure 7-5. Time plot of TKN data from three stream stations, 1995-1996 (Meals 2001)
7.3.3.3 Missing Data
The reality of any watershed project monitoring program is that samples will be missed, equipment will
fail or be overwhelmed, droughts and floods will occur, and sample analysis limitations will be exposed,
resulting in missing and extreme (both high and low) values. However, if data are missing because of
extreme conditions (e.g., streamflow was too high to obtain a measurement or water was so low a sample
could not be drawn), then missing data may also represent extreme conditions.
The presence of a few missing values in a data series is not generally a major cause for concern, although
some parametric tests (e.g., trend analyses that include autocorrelation errors using time series) require
equal spacing of observations.² One way to cope with extensive missing data is to aggregate data to a
longer, uniform time interval by averaging or using the median value of a group of data points. Daily
observations, for example, could be aggregated to weekly means or medians. Such an operation would
have an additional potential benefit of reducing autocorrelation (see section 7.3.6). A downside to this
approach, however, is a reduced significance level due to fewer degrees of freedom. Do not aggregate
data when there is a systematic change in sampling. For example, if the early data were collected as
monthly observations and the more recent data were collected as quarterly data, it is not correct to
aggregate the monthly data to quarterly averages and then perform analyses. This is because the averaging
calculation changes the variability of that portion of the record in comparison to the remainder of the
record, resulting in a violation of the "identically distributed" assumption of most (including nonparametric)
hypothesis tests. In these cases, the analyst will need to subsample from the more intensely monitored
data set to best mimic the sampling from the less sampled portion of the data.
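The aggregation of daily observations to a uniform weekly interval might look like the sketch below. It assumes ISO calendar weeks as the grouping unit and uses `None` to mark a missed sample; both choices, and the data, are illustrative.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median


def aggregate_weekly(daily, how=median):
    """daily: iterable of (date, value); value is None when missing.

    Returns {(iso_year, iso_week): aggregate} from the non-missing values.
    Pass how=statistics.mean for weekly means instead of medians.
    """
    weeks = defaultdict(list)
    for day, value in daily:
        if value is not None:
            iso = day.isocalendar()
            weeks[(iso[0], iso[1])].append(value)
    return {wk: how(vals) for wk, vals in sorted(weeks.items())}


start = date(2024, 1, 1)  # a Monday, so the first 7 days fall in ISO week 1
daily = [(start + timedelta(days=i), float(i + 1)) for i in range(14)]
daily[3] = (daily[3][0], None)  # one missed sample
weekly = aggregate_weekly(daily)
print(weekly)
```

The same aggregation rule should be applied to every week in the record, consistent with the caution above about systematic changes in sampling.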
For loading analyses that require flow data, it is expected that the missing flow data due to equipment
failure could be estimated by evaluating regression relationships with flow from nearby basins. On the
other hand, flows that exceed the weir capacity or reach a stage so high that the technician cannot access
the site are exceptional events. Certainly one approach to addressing this data gap is to apply the
previously mentioned regression relationship with a nearby station. Another approach might be to treat
these observations as "greater than the maximum flow" and apply methods appropriate for censored data
described in section 7.4.
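One way the regression approach for filling flow gaps could be sketched is shown below. It assumes a simple linear regression on log10-transformed flows from one nearby station; the log-log form, the function names, and the data are assumptions for illustration rather than a method prescribed here.

```python
import math


def fit_line(x, y):
    """Ordinary least squares slope and intercept for y on x.

    Requires at least two distinct x values.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_missing_flow(site, nearby):
    """Replace None entries in site with estimates from the nearby station.

    Flows must be positive because the regression uses log10 values.
    """
    pairs = [(math.log10(q_near), math.log10(q_site))
             for q_site, q_near in zip(site, nearby) if q_site is not None]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return [q_site if q_site is not None
            else 10 ** (intercept + slope * math.log10(q_near))
            for q_site, q_near in zip(site, nearby)]


# Made-up flows (ft3/sec): the monitored site runs at twice the nearby flow.
filled = fill_missing_flow([2.0, 4.0, None, 16.0], [1.0, 2.0, 4.0, 8.0])
print(filled)
```

Estimates for flows beyond the range used to fit the regression are extrapolations and deserve the same caution as the exceptional events discussed above.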
7.3.4 Examination for Frequencies
For categorical data such as watershed area in different land uses or number of aquatic macroinvertebrates
in certain taxonomic groups, data can be effectively summarized as frequencies in histograms or pie
charts. Figure 7-6 shows a pie chart of the percent composition of orders of macroinvertebrates in a
Vermont stream, clearly indicating that dipterans dominate the community.
2 Some statistical software, such as PROC AUTOREG in SAS, yields valid trend results for autocorrelated data with
missing data points, as long as the input record contains equally spaced time intervals (e.g., weekly).
[Pie chart; legend: Coleoptera, Trichoptera, Diptera, Plecoptera, Ephemeroptera, Oligochaeta; slice labels: 4%, 19%, 11%, 57%]
Figure 7-6. Percent composition of the orders of macroinvertebrates, Godin Brook, 2000 (Meals 2001)
7.3.5 Examination for Seasonality or Other Cycles
Monitoring data often consist of a series of observations in time, e.g., weekly samples over a year. One of
the first, and the most useful, things to do with any time series data is to plot it. Plotting time series data
can provide insight into seasonal patterns, trends, changes, and unexpected events more quickly and
easily than tables of numbers.
Figure 7-7 shows a time series plot of E. coli counts in a Vermont stream. The extreme range of the
counts (five orders of magnitude) and the pronounced seasonal cycle are readily apparent, with the lowest
counts occurring during the winter. It is easy to see the times of year when the stream violates the water
quality standard for bacteria.
[Time series plot; horizontal axis: Time]
Figure 7-7. Time series plot of weekly E. coli counts, Godin Brook, 1995-1999 (Meals 2001).
Red line indicates Vermont WQS of 77 E. coli/100 mL.
7.3.6 Autocorrelation
Because many hypothesis-testing statistical techniques require that residuals from the statistical tests be
independent, it is useful to check the data set for autocorrelation during EDA. Typically, if the data points
exhibit autocorrelation, so will the residuals from a statistical test which does not correct for
autocorrelation.
Time series data collected through monitoring of water resources often exhibit autocorrelation (also called
serial correlation or dependent observations) where the value of an observation is closely related to a
previous observation (usually the one immediately before it). Autocorrelation in water quality
observations is usually positive in that high values are followed by high values and low values are
followed by low values. For example, streamflow data often show autocorrelation, as numerous high wet-
weather flows tend to occur in sequence, while low values follow low values during dry periods.
Terms Used in this Section
Lag: the difference in time steps by which one observation comes after another. The lag value is the
number of time steps.
Autocorrelation: the correlation between lagged values in a time series (data collected over equal
intervals of time; the concept also applies to spatial distances).
Correlation Coefficients, ρ_j: a set of correlations for each lag. The autocorrelation coefficient for lag 1
is the correlation between each datum in a time series and its previous (lag 1) observation. The
autocorrelation coefficient, ρ_j, for lag j is the correlation between each datum in a time series and the
observation that lags by j time steps.
Autoregressive: situation where past values (or nearby values for spatial analyses) have an effect on
current values. For example, when most of the correlation between the lag variables is between each
current value and the immediately preceding value, it is a first-order autoregressive process denoted
as AR(1). AR(2) is second order, where the previous two values affect the current value, etc.
Autoregressive, order 1, AR(1) is common for weekly and monthly water quality samples.
Moving Average: an averaging of a fixed number of consecutive observations, with or without weights.
Moving average models are denoted MA(1), MA(2), ...MA(q) to indicate the order or maximum lag for
consecutive observations that are averaged.
ARIMA (autoregressive integrated moving average) models: time series models that include
autoregressive terms, moving average terms, or both.
Autocorrelation Function (ACF): the set of correlations (e.g., autocorrelation coefficients) between
each value in a series of values (e.g., x_t) and the lagged values within the same series (e.g., x_{t-1}, x_{t-2},
etc.). Alternatively stated, this is the pattern of correlation coefficients vs. lag value. This is generally
depicted as a graph of each lag and its autocorrelation coefficient with a standard error bar to help
determine the statistical significance of each of the correlation coefficients for each lag. The
pattern/shape of the ACF, along with the PACF, is used to assist in determining if the data follow an
AR, MA, or ARIMA pattern, and by what order (lag). For example, a seasonal AR(1) series has a large
ρ_1, with subsequent ρ_j's trailing off, and a strong seasonal lag correlation.
Partial Autocorrelation Function (PACF): the correlation between two variables, taking into account the
relationships of other variables to these two variables. The PACF for an AR(1) series drops to 0 after
lag 1.
Autocorrelation usually results in a reduction of the effective sample size (degrees of freedom). It
therefore affects statistical trend analyses and their interpretations. As the magnitude of autocorrelation
increases, the effective sample size decreases, and the true standard error is therefore greater than the
standard error computed when autocorrelation is ignored. Adjustment for autocorrelation is needed so that the power of
detecting a difference or trend is not incorrectly inflated. For data sets with high autocorrelation, a larger
sample size (e.g., longer monitoring duration) than would be necessary in the absence of autocorrelation
may be required to correctly detect significant changes or trends.
Autocorrelation is often significant in very frequent data collection, such as that done with recording
sensors (e.g., temperature, turbidity). Daily, weekly, and monthly samples also exhibit autocorrelation,
but usually to a lesser extent. The time interval between independent samples differs with the water
resource and variable. The magnitude of autocorrelation in surface water quality concentrations is usually
quite large for samples collected more frequently than monthly (Loftis and Ward 1980a and 1980b,
Lettenmaier 1976, Lettenmaier 1978, Whitfield and Woods 1984). Loftis and Ward (1980a and 1980b)
verified that some surface water quality samples collected less frequently than once a month may be
considered independent if the seasonal variation is removed, although Whitfield (1983) found significant
autocorrelation between stream discharge samples taken as much as 60 days apart. Compared to surface
water data series, ground water data series tend to retain significant autocorrelation, even with longer
sample intervals. Similarly, a ground water data series tends to have greater autocorrelation when
compared to surface water data series taken at the same time intervals. This may be due to slower water
movement and mixing in ground water as compared to surface waters.
There are numerical techniques to test for autocorrelation, but a simple graphical method can suggest
whether data have significant autocorrelation: the lag plot. A lag plot is a graph where each data point is
plotted against its predecessor in the time series, i.e., the value for day two and the value for day one are
plotted as an x, y pair, then day three, day two, and so on. Different time lags can be examined. A "lag-1"
plot uses each data value paired with its immediate predecessor (t2, t1), a "lag-2" plot uses each data
value paired with the value observed two steps previously (t3, t1), and so on. Random (independent) data
should not exhibit any identifiable structure or pattern in the lag plot. Non-random structure in the lag plot
indicates that the underlying data are not random and that autocorrelation may exist. Figure 7-8 shows a
lag-1 plot of weekly streamflow data, suggesting that autocorrelation needs to be addressed.
[Lag plot; horizontal axis: Streamflow (time t-1) (ft3/sec)]
Figure 7-8. Lag-one plot of streamflow observations, Samsonville Brook, 1994 (Meals 2001)
Autocorrelation can be expressed numerically by calculating the correlation between observations
separated by j lag time periods. The autocorrelation corresponding to the jth lag is the correlation between
the observation at a given time and the observation taken j observation periods earlier. It is denoted by ρ_j
or ρ(j). For example, for the first lag (j=1), ρ_1 represents the autocorrelation of data points one time period
removed. The time period is a function of the sample frequency and corresponds to the length of time
between samples (e.g., daily, weekly, monthly). The range of values for ρ_j is -1 to +1, where +1 represents
a perfect positive autocorrelation and -1 is a perfect negative autocorrelation. The sample estimate of
autocorrelation is given by r_j (in practice, ρ_j is often used to depict the sample autocorrelation
coefficients).
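The sketch below computes a sample autocorrelation coefficient at a given lag, together with the (previous value, current value) pairs used for a lag plot. The estimator shown, lagged cross-products about the overall mean divided by the total sum of squares, is one common choice and is an assumption here; statistical packages may differ in detail. The limit of 2 divided by the square root of n is an assumed large-sample approximation for independent data.

```python
import math


def lag_pairs(series, lag=1):
    """(value at time t-lag, value at time t) pairs for a lag plot."""
    return [(series[t - lag], series[t]) for t in range(lag, len(series))]


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (one common estimator)."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    total = sum((x - mean) ** 2 for x in series)
    lagged = sum((series[t] - mean) * (series[t - lag] - mean)
                 for t in range(lag, n))
    return lagged / total


def approx_significance_limit(n):
    """Assumed approximate two-standard-error limit for independent data."""
    return 2.0 / math.sqrt(n)


# Made-up series that alternates high and low, i.e. negative autocorrelation.
flows = [1.0, -1.0] * 5
print(autocorrelation(flows, 1), approx_significance_limit(len(flows)))
```

A coefficient whose absolute value exceeds the limit suggests that autocorrelation should be addressed before hypothesis testing.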
Time series generally exhibit patterns indicated by the pattern of autocorrelation coefficients at various
lags. These patterns reveal key characteristics about the data that should be incorporated into subsequent
trend analyses. For weekly and less frequent water quality sample collection, the autoregressive, lag 1 or
AR(1) data structure is usually appropriate. In this case, most of the autocorrelation can be explained by
the correlation between each observation and its previous observation. Moving Average (MA) data
structures occur when an observation is only related to the observations up to the lag value (q) and not
observations before.³ An MA structure alone is rarely useful with water quality samples. However, for
some daily or more frequent sampling, a combination of AR and MA data structures becomes appropriate,
known as ARIMA (AutoRegressive Integrated Moving-Average) models.
One common test for autocorrelation is the Durbin-Watson (DW) test. The DW test is appropriately used
when the data exhibit first order (lag 1) autoregressive (AR(1)) behavior. AR(1) is common with water
quality data collected weekly, biweekly, or monthly. Samples collected daily or more frequently usually
exhibit ARIMA autocorrelation structures. Even so, the DW test can be useful to indicate the presence of
autocorrelation with such samples as well. The DW test may also be used to test for independence
(i.e., the absence of autocorrelation) in the residuals from regression models.
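For reference, the Durbin-Watson statistic itself is simple to compute from a sequence of residuals, as in the sketch below. Judging significance requires tabulated critical values or software output, which this sketch does not attempt; the residuals shown are made up.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals.

    Values near 2 suggest little lag-1 autocorrelation, values well below 2
    suggest positive autocorrelation, and values well above 2 suggest
    negative autocorrelation.
    """
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


print(durbin_watson([1.0, 1.0, -1.0, -1.0]))  # prints 1.0 (runs of like signs)
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # prints 3.0 (alternating signs)
```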
Many statistical software packages offer tools for examining autocorrelation. For example, the
Autocorrelation Function (ACF) is the set of all the lag j autocorrelations and is usually depicted as a plot
of each lag autocorrelation versus the lag number (Figure 7-9 from Minitab (2016) and Figure 7-10 from
JMP (SAS Institute 2016b)) for the same data set. Visual inspection of the ACF is useful to detect the
presence of autocorrelation and define the structure of the autocorrelation. Typically, the lag
autocorrelation confidence limits (approximately two-standard deviation errors) are also shown on the
ACF graphs. This helps analysts determine if the autocorrelation coefficient at lag j is significant.
Seasonal patterns show up as cycles in the ACF. As a point of comparison, Figure 7-11 shows a time
series plot of independent data (i.e., zero correlation) together with its ACF graph.
Another useful graph is the Partial Autocorrelation Function (PACF) which is included as the last chart in
Figure 7-9 and in the last column of Figure 7-10. The PACF is the partial amount of R-square
(i.e., correlation) gained due to the additional lag term added to the right hand side of the model (Box and
Jenkins 1976). A PACF that decreases sharply to non-significant values after lag j
indicates an autoregressive series of order (lag) j. For a qth order moving average model, MA(q), the
theoretical ACF drops off to 0 after lag q with an exponentially decaying PACF value between
lag 0 and lag q.
3 j and q both refer to the number of lags, j for AR and q for MA.
[Figure panels: A) time series plot, horizontal axis Date (1/1/2007 to 1/1/2011); B) ACF for Lflow_total_weekly (with 5% significance limits for the autocorrelations), horizontal axis Lag; C) PACF for Lflow_total_weekly (with 5% significance limits for the partial autocorrelations), horizontal axis Lag]
Figure 7-9. A) Time series plot, B) autocorrelation function (ACF) graph, and C) partial
autocorrelation function (PACF) graph of Log(10) weekly flow from the Corsica River National
Nonpoint Source Monitoring Program Project generated by Minitab. The steps are: Stat > Time
Series > Autocorrelation (or Partial Autocorrelation). Identify the time series variable and enter
number of lags. Select options for storing ACF, PACF, t statistics, and Ljung-Box Q statistics as
desired. Press ok.
[JMP output, Where(:Site == "CMS"): "Time Series Log flow" plotted against Date, followed by "Time Series Basic Diagnostics" listing autocorrelation and partial autocorrelation by lag; all p-values shown are < .0001]
Figure 7-10. Autocorrelation Function (ACF) graph of weekly flow from the Corsica River
National Nonpoint Source Monitoring Program Project generated by JMP. The steps are:
Click "Analyze" tab, select "Modeling" followed by "Time Series." Select Y time series
(LFLOW) and X time series (Date).
[Figure panels: A) time series plot, horizontal axis Date (1/1/2007 to 1/1/2011); B) Autocorrelation Function (with 5% significance limits for the autocorrelations), horizontal axis Lag; C) Partial Autocorrelation Function (with 5% significance limits for the partial autocorrelations), horizontal axis Lag]
Figure 7-11. A) Time series plot, B) autocorrelation function (ACF) graph, and C) partial
autocorrelation function (PACF) graph of data with zero autocorrelation (i.e., independent data
with respect to time)
An autoregressive error pattern of order 1, AR(1) means that an observation is correlated with the
previous observation. And, because each previous observation is related to the observation prior to it,
each observation is related to all past values, but the highest correlation is with the most recent
observation. A theoretical⁴ AR(1) time series structure is identified by an ACF pattern that trails off
bounded by an exponential decay after the first lag and the PACF dropping to 0 after lag 1 (or lag j for
higher order AR series).
The patterns for both ACF and PACF in Figure 7-9B and Figure 7-9C are typical of a water quality data
set with AR(1) and a strong seasonal pattern (some might argue for an AR(2) in this case, which speaks to
the fact that interpretation of the patterns is required, with analysts often relying on the preponderance of
evidence across monitoring sites). The lag autocorrelations for weekly flow data from the Corsica River
(MD) NNPSMP project in these figures do show some significant autocorrelation coefficients. The ρ_j values
falling outside of the red/blue lines are significant at the 95 percent confidence level. Significant
autocorrelation for lag 1, as well as a strong seasonal autocorrelation pattern, is evident.
Readers should consult statistics textbooks and software packages for greater detail on this and other
methods to test for autocorrelation.
7.3.6.1 Methods to Handle Autocorrelation
Autocorrelation in analysis of time series data can sometimes be reduced by aggregating data over
different time periods, such as weekly means rather than daily values. Use of weekly means preserves
much of the original information of a daily data series, but separates data points far enough in time so that
autocorrelation is reduced. When aggregating data, it is important to use a consistent procedure, e.g.,
using the weekly mean of 7 daily values for each week in the year, rather than mixing weekly means for
some weeks with single grab samples for other weeks. Aggregation has disadvantages including: reducing
the degrees of freedom and potential power of a statistical test and dampening out the potentially
important high or low data.
Several statistical packages can incorporate a time series error term in the statistical model to address
autocorrelation. For example, PROC AUTOREG in SAS (SAS Institute 2016d) can be used for linear
regression when the error terms are autoregressive. Similar tools are available in Minitab's time series
tools (i.e., Stat > Time Series) or R's statistics package.
Alternatively, if the data exhibit AR(1), which is typical for water quality data collected weekly,
biweekly, or monthly, an adjustment can be made to the standard error of the trend (step or slope) terms.
The correction factor was derived by Matalas and Langbein (1962) and simplified with a large sample
size approximation by Fuller (1976):⁵

$$\text{std. dev.}_{\text{corrected}} = \text{std. dev.}_{\text{uncorrected}} \sqrt{\frac{1+\rho}{1-\rho}}$$

Where ρ = autocorrelation coefficient at lag 1
Std. dev. = the standard deviation of the trend term (e.g., standard error of the difference
between mean values between two time periods or standard error of the slope of a linear
regression).
See Spooner et al. (2011a) for additional details on this approach.
4 Patterns from water quality sampling data will resemble theoretical patterns but will usually deviate in some way,
requiring that the analyst develop a feel for interpreting such graphics.
5 The exact formula is given by

$$\text{std. dev.}_{\text{corrected}} = \text{std. dev.}_{\text{uncorrected}} \sqrt{\frac{1+\rho}{1-\rho} - \frac{2}{n}\,\frac{\rho\,(1-\rho^{n})}{(1-\rho)^{2}}}$$

where n is the sample size.
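The AR(1) standard error adjustment can be sketched numerically as follows, using the large-sample factor sqrt((1 + ρ)/(1 - ρ)) and the exact factor that also depends on the sample size n. The function names and the example values (ρ = 0.5, n = 52, uncorrected standard error of 0.04) are illustrative assumptions.

```python
import math


def ar1_factor_large_sample(rho):
    """Large-sample multiplier for the standard error under AR(1)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return math.sqrt((1.0 + rho) / (1.0 - rho))


def ar1_factor_exact(rho, n):
    """Exact multiplier, which approaches the large-sample value as n grows."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    first = (1.0 + rho) / (1.0 - rho)
    second = (2.0 / n) * rho * (1.0 - rho ** n) / (1.0 - rho) ** 2
    return math.sqrt(first - second)


# Made-up example: weekly data for one year with lag-1 autocorrelation 0.5.
uncorrected_se = 0.04
print(uncorrected_se * ar1_factor_large_sample(0.5))  # about 0.069
print(uncorrected_se * ar1_factor_exact(0.5, 52))     # about 0.068
```

With ρ = 0.5 the standard error grows by roughly 70 percent, which illustrates why ignoring autocorrelation overstates the significance of a trend.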
7.3.6.2 Methods to Handle Autocorrelation Caused by Seasonality
When the data exhibit seasonal cycles, explanatory variables can be added to parametric
methods to adjust for season. Four common approaches are used. One is to add 1 or
2 cycles by using sine and cosine terms to a linear regression model, for example, as described in
Tech Notes 6: Statistical Analysis for Monotonic Trends (Meals et al. 2011). This approach assumes that
the sine or cosine terms realistically simulate annual or semiannual seasonal cycles.
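A sketch of how the sine and cosine regressors for this first approach might be generated is given below, assuming a time index t counted in sampling periods (for example, months) and a cycle length of 12 periods. The function name and parameters are hypothetical, and fitting the regression itself is left to the analyst's statistical software.

```python
import math


def seasonal_terms(t, period=12.0, harmonics=1):
    """Return [sin, cos] regressors for each harmonic at time index t.

    harmonics=1 gives one annual cycle; harmonics=2 adds a semiannual cycle.
    """
    terms = []
    for k in range(1, harmonics + 1):
        angle = 2.0 * math.pi * k * t / period
        terms.extend([math.sin(angle), math.cos(angle)])
    return terms


# Regressor rows for 24 monthly observations with annual and semiannual cycles.
rows = [seasonal_terms(t, period=12.0, harmonics=2) for t in range(24)]
print(len(rows), len(rows[0]))  # prints 24 4
```

These columns would be included alongside the time (trend) variable in the regression model.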
A second approach is to incorporate seasonality into the time series model. An ARIMA time series model
could be used with a seasonal lag value ("differencing value" or "d"⁶
in an ARIMA model, ARIMA(p,d,q)) corresponding to the length of the seasonal cycle. For example, an
annual cycle will appear as a strong positive autocorrelation at lag 12 when the data series consists of
monthly values or at lag 4 for quarterly values. As noted above, readers should consult statistics textbooks
and software packages for greater detail on ARIMA models.
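To illustrate why differencing at the seasonal lag is useful, the sketch below subtracts from each value the value observed one full cycle earlier. This uses the common backward-difference convention, which is an assumption here (indexing conventions vary), and the quarterly data are made up.

```python
def seasonal_difference(series, lag):
    """New series of x[t] - x[t - lag] for t = lag, ..., len(series) - 1."""
    if not 0 < lag < len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Made-up quarterly series: a repeating seasonal pattern plus a steady trend.
pattern = [10.0, 4.0, 2.0, 6.0]
quarterly = [pattern[t % 4] + 0.5 * t for t in range(12)]

print(seasonal_difference(quarterly, lag=4))  # every value is 2.0
```

The repeating seasonal pattern cancels, leaving only the change accumulated over one cycle (4 quarters at 0.5 per quarter in this example).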
A third approach is to simply add monthly (or other seasonal) indicators to each observation in the dataset
and incorporate these indicator variables in a regression model. The number of indicator variables needed is
S-1.⁷ For example, S-1 would be 11 when the cycle is annual and the same months behave similarly
over the years. Each indicator variable (X1 through X11) is assigned a value of 0 or 1, as indicated below:
X1 = "1" for "January" but "0" otherwise
X2 = "1" for "February" but "0" otherwise
...
X11 = "1" for "November" but "0" otherwise
Note: December values would all be depicted by "0" values for X1-X11
After the indicator variables are added to the dataset, regress Yt on the indicator variables and other
independent variables (e.g., time).
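The indicator coding described above can be sketched as a small helper that returns the 11 indicator values for a given month, with December as the all-zero reference level. The function name is hypothetical.

```python
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def month_indicators(month_name):
    """Return the 11 indicator values (January through November) for a month.

    December is the reference level, so all 11 indicators are 0.
    """
    position = MONTHS.index(month_name)  # raises ValueError for unknown names
    return [1 if i == position else 0 for i in range(len(MONTHS) - 1)]


print(month_indicators("February"))  # 1 in the second position only
print(month_indicators("December"))  # all zeros
```

Each observation's row of indicators is then appended to the dataset before the regression is run.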
A fourth approach to address seasonality is to use non-parametric tests that can handle monthly
seasonality. The Seasonal Wilcoxon Rank Sum Test or Seasonal Mann-Whitney Rank Sum Test
compares two or more groupings (e.g., seasonal t-test or analysis of variance). The Seasonal Kendall Test
incorporates seasonal components when testing monotonic trends. Both parametric and non-parametric
trend tests are featured in section 7.8.2.4. There is also a variant of the Kendall tau test (seasonal Kendall
tau test with serial correlation correction (Hirsch and Slack 1984)) that can handle seasonality while also
adjusting for autocorrelation.
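The core of the Seasonal Kendall Test can be sketched as a Mann-Kendall statistic computed within each season and then summed across seasons. The sketch below is a simplified illustration: it assumes one value per season per year, uses the variance formula for data without ties, applies a continuity correction, and does not include the serial correlation correction of Hirsch and Slack (1984). The data are made up.

```python
import math
from collections import defaultdict


def mann_kendall_s(values):
    """Number of later-minus-earlier increases minus decreases."""
    s = 0
    for i in range(len(values) - 1):
        for j in range(i + 1, len(values)):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    return s


def seasonal_kendall(records):
    """records: iterable of (year, season, value). Returns (S, Z).

    Assumes no tied values and independence between seasons.
    """
    by_season = defaultdict(list)
    for year, season, value in sorted(records):
        by_season[season].append(value)  # values end up in year order
    s_total, variance = 0, 0.0
    for values in by_season.values():
        n = len(values)
        s_total += mann_kendall_s(values)
        variance += n * (n - 1) * (2 * n + 5) / 18.0
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(variance)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(variance)
    else:
        z = 0.0
    return s_total, z


# Made-up record: two seasons per year, both declining over five years.
records = [(2001 + k, season, 10.0 - k + season)
           for k in range(5) for season in (1, 2)]
print(seasonal_kendall(records))
```

A strongly negative Z, as in this made-up record, points to a decreasing monotonic trend; comparing only like seasons keeps the seasonal cycle from masking or mimicking that trend.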
6 Differencing is a term used in time series analyses, where d is the order of differencing which creates a new time
series, Wt, whose value at time t is the difference between x(t) and x(t+d). Wt then becomes the series used in the
time series analysis.
7 Where S would represent the number of time periods (e.g., months, seasons).
7.3.7 Examination of Two or More Locations or Time Periods
Comparison of two or more variables with EDA can mean comparing different data sets, such as stream
nitrogen concentrations above and below a feedlot or phosphorus concentrations from a control and a
treatment watershed, or comparing data from the same site over two different time periods, such as
phosphorus loads from calibration vs. treatment periods.
The characteristics that make boxplots useful for summarizing and inspecting a single data set make them
even more useful for comparing multiple data groups representing multiple sites or time periods. The
essential characteristics of numerous groups of data can be shown in a compact form. Boxplots of
multiple data groups can help answer several important questions, such as:
• Is a factor (location, period) significant?
• Does the median appear to differ between groups?
• Does apparent variability differ between groups?
• Are there outliers? Where?
Boxplots are helpful in determining whether central values, spread, symmetry and outliers differ among
groups. If the main boxes of two groups, for example, do not substantially overlap on the vertical scale,
there may be a reason to suspect that the two groups differ significantly (note that such difference should
be tested using quantitative statistical techniques). Interpretation of boxplots can help formulate
hypotheses about differences between groups. Figure 7-12 shows a boxplot of total suspended solids
concentrations in three Vermont streams. The plot suggests that TSS concentrations may tend to be
slightly lower at Station 3 compared to the other two stations; however, because the boxes overlap, it is
unlikely that any comparison of medians would result in statistically significant differences.
Inferences about differences between locations or time periods resulting from graphical evaluation of the
data must be confirmed by more rigorous hypothesis testing analyses (see sections 7.7 and 7.8).
[Figure 7-12 plot: boxplots for Sta.1, Sta.2, and Sta.3]
Figure 7-12. Boxplots of TSS concentration for three stream stations, 1998 (Meals 2001)
7.3.8 Examine Relationships between Variables
Looking at how variables relate to each other is a way to begin to consider causality, i.e., is the behavior
of one variable the result of action by another. Such ideas can suggest sets of variables to evaluate
together. For example, if variable B (e.g., suspended sediment concentration) goes down as variable A
(e.g., acres of reduced tillage) goes up, has the BMP program improved water quality? Examination of
correlations between different variables observed simultaneously (e.g., SSC and total P or turbidity and
SSC) can suggest relationships that might change with BMP programs or indicate where one variable
could serve as a surrogate for another. Graphical analysis (e.g., scatterplots of variable A vs. variable B)
can suggest meaningful correlations that would need to be confirmed with more rigorous statistical tests.
The two-dimensional scatterplot is one of the most familiar graphical methods for data exploration. It
consists of a scatter of points representing the value of one variable plotted against the value of another
variable from the same point in time. Scatterplots illustrate the relationship between two variables. They
can help reveal if there appears to be any association at all between two variables, whether the
relationship is linear, whether different groups of data lie in separate regions of the scatterplot, and
whether variability is constant over the full range of data.
Figure 7-13 shows a scatterplot of phosphorus export in a control and a treatment watershed in Vermont.
Note that the data are plotted on a log scale to obtain a linear relationship. There is a strong positive
association between P in the two streams. This simple scatterplot indicates that it is probably worth
proceeding with more rigorous statistical analysis to evaluate calibration between the two watersheds in a
paired-watershed design. As with this example, it is common that the relationship between variables is
exponential. In such cases, the log transformation allows the relationship to be expressed linearly and
evaluated using linear regression.
[Figure 7-13 plot: log-log scatterplot; x-axis: Control watershed TP export (kg/wk)]
Figure 7-13. Scatterplot of weekly TP export from control
and treatment watersheds, calibration period (Meals 2001)
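The log transformation and linear fit described for the paired-watershed scatterplot might look like the following sketch. The export values are invented for illustration (they are not the Meals 2001 data), and loglog_fit is a hypothetical helper name.

```python
import math

def loglog_fit(control, treatment):
    """Fit log10(treatment) = a + b * log10(control) by least squares.
    Returns (a, b, r), where r is the correlation of the logged values."""
    x = [math.log10(v) for v in control]
    y = [math.log10(v) for v in treatment]
    xb, yb = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    syy = sum((yi - yb) ** 2 for yi in y)
    b = sxy / sxx
    return yb - b * xb, b, sxy / math.sqrt(sxx * syy)

# Hypothetical weekly TP export (kg/wk) during a calibration period.
control = [0.5, 2.0, 8.0, 40.0, 150.0, 900.0]
treatment = [0.9, 2.6, 9.5, 31.0, 160.0, 610.0]
a, b, r = loglog_fit(control, treatment)
print(f"log10(treatment) = {a:.2f} + {b:.2f} * log10(control), r = {r:.3f}")
```

All values must be positive for the log transformation; zeros or censored values need separate handling.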
Figure 7-14 shows another scatterplot examining the relationship between streamflow and E. coli counts
in another Vermont stream. In a nonpoint source situation, a positive association between streamflow and
bacteria counts may be expected, as runoff during high flow events might wash bacteria from the land to
the stream. In this case, however, it does not require application of advanced statistics to conclude from
Figure 7-14 that there is no such association (in fact the correlation coefficient r is close to zero).
However, recall that EDA involves an open-minded exploration of many possibilities. In Figure 7-15, the
data points have been distinguished by season. The open circles represent data collected in the summer
period and there still appears to be no association between streamflow and E. coli counts. The solid
circles, representing winter data, now appear to show some positive correlation (r = 0.45) between
streamflow and bacteria counts, with high bacteria counts associated with high flows. This picture
suggests that something different is happening in winter compared to summer with respect to streamflow
and E. coli in this watershed, a subject for further investigation.
[Figure 7-14 plot: log-log scatterplot; x-axis: Streamflow (ft3/sec)]
Figure 7-14. Scatterplot of E. coli vs.
streamflow, Godin Brook, 1995-1998, all data
combined (Meals 2001)
[Figure 7-15 plot: log-log scatterplot; x-axis: Streamflow (ft3/sec)]
Figure 7-15. Scatterplot of E. coli vs.
streamflow, Godin Brook, 1995-1998, where
solid circles = winter, open circles = summer
(Meals 2001)
In looking for correlations in scatterplots, choose the variables carefully. A common mistake is the
comparison of variables that are already related by measurement or calculation. An example of such
spurious correlation is the comparison of streamflow with load. Because load is calculated as
concentration multiplied by flow, a scatterplot of flow vs. load has a built-in correlation that means very
little, even though it looks good in a scatterplot. Also remember that correlation does not guarantee
causation - just because two variables are correlated does not mean that the variation in one is caused by
variation in the other.
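The built-in correlation between flow and load can be demonstrated with a small constructed example. The numbers below are invented and deliberately arranged so that concentration has zero linear correlation with flow. The load computed from them still correlates with flow.

```python
import math

def pearson_r(x, y):
    """Pearson linear correlation coefficient."""
    xb, yb = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    sxx = sum((a - xb) ** 2 for a in x)
    syy = sum((b - yb) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

flow = [1, 2, 3, 4, 5, 6, 7, 8]          # hypothetical flows
conc = [3, 1, 4, 2, 2, 4, 1, 3]          # constructed to be uncorrelated with flow
load = [q * c for q, c in zip(flow, conc)]  # load = concentration x flow

print(round(pearson_r(flow, conc), 2))   # 0.0
print(round(pearson_r(flow, load), 2))   # about 0.72
```

The second coefficient reflects only the fact that flow appears on both sides of the comparison. It says nothing about how concentration responds to flow.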
There are many numerical techniques available to examine and test the relationship between two or more
variables. In EDA, the simplest technique is correlation, which measures the strength of an association
between two variables. The most common measure of correlation is Pearson's r, also called the linear
correlation coefficient. If the data lie exactly on a straight line with positive slope, r will equal 1; if the
data are perfectly random, r will equal 0. For Pearson's r, both variables should be normally distributed
and continuous (Statistics Solutions 2016). The test also assumes a straight-line relationship between the
variables and constant variance (homoscedasticity). Pearson's r is sensitive to outliers.
Other measures of correlation that are less sensitive to outliers include the nonparametric Kendall's tau
and Spearman's rho (Spearman's rank correlation coefficient). Spearman's rho makes no assumptions
about the distribution of the data and is an appropriate test when the variables are at least ordinal and the
variables are monotonically related (Statistics Solutions 2016). With ordinal variables, the ordering of
values is known but the differences between them are not quantified (e.g., Excellent, Good, Fair, Poor).
Measures of correlation are easily calculated by most statistical software packages and are described in
chapter 4 of the 1997 guidance (USEPA 1997b). It must be cautioned that whenever a numerical
correlation is calculated, the data should also be plotted in a scatterplot and examined visually as
described above. Many different patterns can result in the same correlation coefficient. Never compute a
correlation coefficient and assume that the data follow a simple linear pattern.
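The sensitivity of Pearson's r to outliers, and the relative robustness of Spearman's rho, can be seen in a short sketch. The data are invented, with a single extreme value in an otherwise nearly monotonic series. Spearman's rho is computed here as Pearson's r applied to average ranks, which is one standard way to handle ties. The function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson linear correlation coefficient."""
    xb, yb = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    sxx = sum((a - xb) ** 2 for a in x)
    syy = sum((b - yb) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank from 1 to n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 3, 5, 4, 6, 8, 7, 100]            # hypothetical data with one outlier
print(round(pearson_r(x, y), 2))          # about 0.62
print(round(spearman_rho(x, y), 2))       # about 0.95
```

A scatterplot of these data would show immediately that the single extreme value, not a weak relationship, is what pulls Pearson's r down.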
There are several methods of simultaneously evaluating variables that are likely related to each other.
Cluster analyses group variables and/or observations into similar categories, usually with an
agglomerative hierarchical algorithm, the most common clustering approach used in water quality
analyses. In this clustering procedure, each observation begins as an individual "cluster." The similarities
or distances between these clusters are measured using one of several options, including Euclidean
distance and correlation coefficients. The closest two clusters are then merged into a new cluster.
Distances are calculated again using the updated set of clusters, and the process repeated until only one
cluster remains. The result of this analysis is a sequence of groupings that can be represented in a cluster
tree or dendrogram. The analyst can then perform a visual analysis to infer potential groupings and
relationships among variables. It is important to note that cluster analysis does not consider
multicollinearity between the variables. Cluster analysis conducted as part of EDA might be used to
explore and define site or time groupings that would be useful to explore in later analysis.
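The merge-and-repeat procedure described above can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and single linkage (the distance between two clusters is the distance between their closest members). Other linkage rules are equally valid choices. The points are invented, and agglomerate is a hypothetical function name.

```python
import math
from itertools import combinations

def agglomerate(points):
    """Single-linkage agglomerative clustering.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where clusters are tuples of observation indices."""
    clusters = [(i,) for i in range(len(points))]  # each observation starts alone
    history = []

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        history.append((a, b, linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return history

# Hypothetical observations described by two standardized variables.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (20.0, 20.0)]
for a, b, d in agglomerate(points):
    print(a, "+", b, "at distance", round(d, 2))
```

The merge history, read from first to last, is the information a dendrogram displays: early merges at small distances indicate tight groupings, and late merges at large distances separate dissimilar groups.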
Other multivariate techniques that can be applied in subsequent analysis include principal components
analysis, canonical correlation, and discriminant analysis (SAS Institute 1985). These methods are
discussed further in section 7.5.2.5.
7.3.9 Next Steps
Data exploration results (knowledge of how data are distributed, their characteristics, and their
relationships) will help illustrate any needs to adjust the data to enable the appropriate subsequent
statistical tests. In addition, hypotheses can be refined to facilitate more advanced statistical techniques.
Section 7.4 describes methods for accounting for censored data. Sections 7.5 through 7.9 present various
advanced procedures for analyzing data for a range of purposes. Section 7.10 presents a list of tools and
other resources for data analysis.
7.4 Dealing with Censored Data
7.4.1 Types of Censoring
Monitoring programs such as those analyzing for pesticides, metals, or other constituents often present at
very low concentrations may report lab results where concentration is below the detection limit of the
analysis. Bacteriological tests may report very high results as "too numerous to count" (TNTC). Such data
- typically reported as "<" or ">" (left- and right-censored, respectively) some value - are referred to as
"censored" data.
Censored values are usually associated with limitations of measurement or sample analysis, and are
commonly reported as results below or above measurement capacity of the available analytical
equipment. Results that are indistinguishable from a blank sample are normally reported as less than the
detection limit (DL). The true values of these left-censored observations are considered to lie between
zero and the DL. Depending on the laboratory, some results greater than the DL may be identified as less
than the quantitation limit (QL) or reported as a single value and given a data qualifier to indicate the
value is less than the QL. Typically, results reported as less than the QL indicate that the analyte was
detected (i.e., greater than the detection limit), but at a low enough concentration where the precision was
deemed too low to reliably report a single value. These interval-censored observations are considered to
lie between the DL and QL.
Left- and interval-censored observations are less commonly encountered when working with sediment
and nutrients because they are usually present at levels above their QLs. However, left censoring is
common when toxics and pesticides are being analyzed.
Examples of right-censoring include microbiological analyses with misestimated dilution resulting in
TNTC (too numerous to count) and exceedance of flow gage limits during floods. Right-censoring may
also be encountered when lakes and estuaries are monitored for light penetration via Secchi depth and the
result is reported as visible on bottom, i.e., the Secchi disk is observable on the bottom.
Helsel (2012) provides a seminal discussion of varying reporting limits and concerns with some data
censoring practices. This guidance recommends that detection limits and quantitation limits be stored with
the measurements and each result be clearly qualified to indicate its relation to the DL or QL as
appropriate.
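One possible way to keep the detection and quantitation limits together with each measurement is a small record type, sketched below. The field names, the qualifier vocabulary ("none", "left", "right"), and the parse_result helper are illustrative assumptions, not a required data format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Result:
    value: float                 # reported number (the limit itself when censored)
    censoring: str               # "none", "left" (<), or "right" (>)
    dl: Optional[float] = None   # detection limit in effect for this sample
    ql: Optional[float] = None   # quantitation limit in effect for this sample

def parse_result(text, dl=None, ql=None):
    """Parse a lab-reported string such as '<0.5', '>2000', or '3.2'."""
    text = text.strip()
    if text.startswith("<"):
        return Result(float(text[1:]), "left", dl, ql)
    if text.startswith(">"):
        return Result(float(text[1:]), "right", dl, ql)
    return Result(float(text), "none", dl, ql)

print(parse_result("<0.5", dl=0.5, ql=1.5))
print(parse_result("3.2", dl=0.5, ql=1.5))
```

Storing the qualifier separately from the number prevents a "<0.5" from being silently read as 0.5 in later analyses.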
7.4.2 Methods for Handling Censored Data
There is no single ideal method for managing censored data in statistical analyses. When comparing
various methods, this guidance recommends that analysts use methods that minimize bias and error.
Extensive research in water resources as well as other fields of science such as survival analysis
(e.g., how long does a cancer patient live after treatment) has considered numerous techniques. One
deficiency over the last 20 years has been the lack of readily available tools for widespread use, making
many of these tools out of reach for general use. Efforts continue to improve upon the availability of these
tools. The most notable is a compilation of methods and recommendations developed by Helsel (2012)
with additional information provided at Practical Stats. Much of the remaining discussion in this section
is derived from Helsel's book (Helsel 2012) and the reader is encouraged to review his book for a more
in-depth discussion.
7.4.2.1 Past Methods
With improved tool access, past methods for accommodating censored observations can be avoided. The
most notable past method is simple substitution. This involves the replacement of censored observations
with zero, 1/2 DL, or DL. Although simple substitution is commonly used (and even recommended) in
some state and federal government reports as well as some refereed journal articles, there is no real
theoretical justification for this procedure. Substitution may perform poorly compared to other more
statistically robust procedures, especially where censored data represent a high proportion of the entire
dataset. More egregiously, some reports have simply deleted observations less than the detection limit.
Some past researchers have recommended simply reporting the actual measured concentrations even if
the concentrations are below the DL (Gilliom et al. 1984). This approach has not gained traction as
laboratories are reluctant to implement such a practice, although Porter et al. (1988) suggested that an
estimate of the observation error could be reported to better qualify the measurement. While simple
substitution might be convenient for initial exploratory analyses using spreadsheet tools, more robust
procedures are available and are recommended.
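To see why the choice of substitution value matters, the following sketch computes the mean of one invented data set under each of the three common substitutions. It is shown only to illustrate the sensitivity of the result, not as a recommended procedure. The data (three detected values and five results reported as <1.0) are hypothetical.

```python
def substituted_mean(detects, n_censored, dl, fraction):
    """Mean after replacing each censored value with fraction * DL."""
    values = list(detects) + [fraction * dl] * n_censored
    return sum(values) / len(values)

detects = [1.2, 2.5, 4.0]   # hypothetical detected concentrations
for label, fraction in [("zero", 0.0), ("1/2 DL", 0.5), ("DL", 1.0)]:
    print(label, round(substituted_mean(detects, 5, 1.0, fraction), 4))
```

With more than half of the observations censored, the estimated mean changes by more than 60 percent depending solely on an arbitrary choice of substitution value.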
7.4.2.2 Using Probability Distribution Theory to Estimate the Summary Statistics
In environmental sciences, two common methods considered for estimating summary statistics from
censored data sets include maximum likelihood estimation (MLE) and robust regression on order
statistics (ROS). Both methods ultimately rely on a distributional assumption and both methods allow for
multiple detection limits and estimation of confidence intervals. The reader is referred to Helsel (2012)
for a more detailed discussion.
MLE uses the uncensored observations, the proportion of censored observations, and a distributional
assumption to compute estimates of summary statistics. A lognormal distribution is commonly assumed
with water quality data; however, commercial software will usually allow a variety of assumptions to be
considered.
The robust ROS procedure (Helsel and Cohn 1988) relies on fitting a regression line to a normal
probability plot of the uncensored observations and is applicable for multiple censoring levels. If the
uncensored data do not fit a normal distribution, the analyst can transform the uncensored data with
lognormal or other appropriate transformation. The process of selecting the best transformation is similar
to that if all data were uncensored and diagnostics are typically available in current statistical software.
The regression is then used to impute values for the censored data. The imputed and uncensored data are
then, if necessary, transformed back to their original data scale, allowing summary statistics to be
estimated using standard techniques. Confidence intervals for the mean and standard error estimates can
be computed using bootstrapping (e.g., Helsel 2012). In summary for the mean, a random sample (with
replacement) is selected from the site data. These data are passed through the robust ROS procedure
described above, and a resulting mean is computed. The process of selecting a random sample,
implementing the robust ROS procedure and computing a resulting mean is repeated, say, 1,000 times.
Confidence limits are then empirically selected from this set of 1,000 means (e.g., the 5th and 95th
percentile of these 1,000 means would be the 90 percent confidence interval on the mean).
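The ROS and bootstrap steps can be sketched as shown below. This is a deliberately simplified version and not the full Helsel and Cohn (1988) procedure: it assumes a single detection limit, a lognormal distribution, and simple i/(n+1) plotting positions with all censored values occupying the lowest ranks. The data are invented, and ros_mean and bootstrap_ci are hypothetical names. Production work should rely on tested software.

```python
import math
import random
from statistics import NormalDist, mean

def ros_mean(detects, n_censored):
    """Mean from a simplified lognormal ROS with ONE detection limit.
    Needs at least two detected values to fit the regression line."""
    n = len(detects) + n_censored
    # Normal quantiles of plotting positions i/(n+1); censored values take
    # the lowest ranks, detected values the remaining ranks in sorted order.
    z = [NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    z_cens, z_det = z[:n_censored], z[n_censored:]
    logs = [math.log(v) for v in sorted(detects)]
    # Least-squares line: log(concentration) = a + b * normal quantile.
    zb, lb = mean(z_det), mean(logs)
    b = (sum((zi - zb) * (li - lb) for zi, li in zip(z_det, logs))
         / sum((zi - zb) ** 2 for zi in z_det))
    a = lb - b * zb
    imputed = [math.exp(a + b * zi) for zi in z_cens]  # back-transformed
    return mean(imputed + list(detects))

def bootstrap_ci(detects, n_censored, n_boot=1000, seed=1):
    """90 percent percentile interval on the ROS mean."""
    rng = random.Random(seed)
    pool = list(detects) + [None] * n_censored  # None marks a censored value
    means = []
    while len(means) < n_boot:
        sample = rng.choices(pool, k=len(pool))  # sample with replacement
        det = [v for v in sample if v is not None]
        if len(det) < 2:   # too few detects to fit a line; draw again
            continue
        means.append(ros_mean(det, len(sample) - len(det)))
    means.sort()
    return means[int(0.05 * n_boot)], means[int(0.95 * n_boot) - 1]

# Hypothetical data: ten detected values and four results reported as <1.
detects = [1.5, 2.0, 2.2, 3.1, 4.0, 5.5, 8.0, 9.0, 10.0, 12.0]
print(round(ros_mean(detects, 4), 2), bootstrap_ci(detects, 4))
```

Note that the imputed values are used only collectively to estimate summary statistics. They are not estimates of the individual censored observations.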
The MLE tool can be applied to less-thans and TNTC in the same data set. Helsel (2012) provides
recommendations for which method to use based on the number of observations and degree of censoring.
Notably, no method works well when the degree of censoring exceeds 80 percent. In the situations where
the censoring level exceeds 80 percent, Helsel (2012) recommends reporting information on the percent
of observations above a meaningful threshold and no further summary statistics. For all summary
statistics with censored data, this guidance recommends also reporting the maximum detection limit, number
of observations, and number of censored observations.
7.4.2.3 Hypothesis Testing with Censored Data
There are a variety of nonparametric hypothesis tests that can be directly used with raw data sets that have
censored observations and generally rely on the rank (or order) of the data. These tests include the Mann-
Whitney test (two random samples), Wilcoxon (paired samples), and Kruskal-Wallis (several random
samples), and Kendall and Seasonal Kendall tau (monotonic trends). In these tests, censored observations
are treated as tied values, no different from cases where ties might occur between uncensored
observations. Consider the ordered data set of <1, <1, 1.5, 4, 8, 9, 10, and 10. The two censored
observations (of <1) are less than all the other observations, but are treated as tied to each other. The
handling of the two "<1's" is no different than the two 10's, which are both greater than all the other
values, but tied with each other. One deficiency of these tests is that they are limited to a single detection
limit (e.g., the tests do not have a method to compare "<1" and "<2"). To apply the above nonparametric
tests with data sets that have multiple detection limits, the analyst will need to re-censor the data to the
highest detection limit. Note: do not use the previously described ROS procedure to impute values for
censored data and then apply one of the nonparametric tests described in this paragraph (or parametric
tests), as erroneous results might be computed because the ranks of the imputed values were calculated
based upon the order of data set entry, which is not related to any true ranking of the actual water quality
values.
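The treatment of censored observations as ties, and the re-censoring step for multiple detection limits, can be illustrated with a short sketch. The first data set is the example from the text. The second is invented. Coding everything below the highest detection limit as 0.0 is an implementation convenience that assumes all concentrations are positive. The function names are hypothetical.

```python
def recensor(observations):
    """observations: list of (value, is_censored) pairs.
    Re-censor to the highest detection limit and return numeric codes in
    which every value below that limit is tied at the lowest level."""
    limits = [v for v, censored in observations if censored]
    if not limits:
        return [v for v, _ in observations]
    top = max(limits)
    # Censored values, and detected values below the highest DL, all become
    # "<top" and are coded 0.0 so that they share the lowest rank.
    return [0.0 if censored or v < top else v for v, censored in observations]

def tied_ranks(codes):
    """Ranks from 1 to n with tied values sharing their average rank."""
    return [sum(c < v for c in codes) + (sum(c == v for c in codes) + 1) / 2
            for v in codes]

# Example from the text: <1, <1, 1.5, 4, 8, 9, 10, 10
single_dl = [(1, True), (1, True), (1.5, False), (4, False),
             (8, False), (9, False), (10, False), (10, False)]
print(tied_ranks(recensor(single_dl)))   # [1.5, 1.5, 3.0, 4.0, 5.0, 6.0, 7.5, 7.5]

# Hypothetical data with two detection limits: <1, <2, 1.5, 4
multi_dl = [(1, True), (2, True), (1.5, False), (4, False)]
print(tied_ranks(recensor(multi_dl)))    # [2.0, 2.0, 2.0, 4.0]
```

In the second case the detected value of 1.5 is treated as "<2" because it cannot be ordered relative to the "<2" result. This loss of information is the cost of re-censoring.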
An alternative approach is to apply MLE regression tools that are designed for multiply censored
dependent variables. Similar to simple regression or multiple regression, relationships between singly- or
multiply-censored dependent variables can be established with independent variables. Indicator variables
can be used to set up groupings to expand the MLE regression tool for comparing two or more groups or
seasonal/explanatory adjustments as well.
7.5 Data Analysis for Problem Assessment
7.5.1 Problem Assessment- Important Considerations
One of the most critical steps in controlling NPS pollution is to correctly identify and document the
existence of a water quality problem. The water quality problem may be defined either as a threat to or
impairment of the designated use of a water resource. Impairments are generally defined and identified as
violations of water quality standards (WQS). Water quality standards define the goals for a waterbody by
designating its uses, setting criteria to protect those uses, and establishing provisions such as
antidegradation policies to protect waterbodies from pollutants. A WQS consists of four basic elements:
1. A designated use of the water body. States and Tribes specify appropriate water uses to be
achieved and protected, taking into consideration the use and value of the waterbody for public
water supply, for protection of fish, shellfish, and wildlife, and for recreational, agricultural,
industrial, and navigational purposes. In designating uses for a water body, States and Tribes
consider the suitability of a water body for the uses based on the physical, chemical, and
biological characteristics of the water body, its geographical setting and scenic qualities, and
economic considerations.
2. Water quality criteria. Water quality criteria are science-based numeric pollutant concentrations
or narrative requirements that, if met, will protect the designated use(s) of the water body. Criteria
may be based on physical, chemical, or biological characteristics. Numeric criteria may, for
example, establish limits for concentrations of toxic pollutants to protect human health or aquatic
life. Narrative criteria stating that a water body must be "free from" toxic contaminants can serve
as a basis for limiting the toxicity of waste discharges to aquatic life.
3. An antidegradation policy. Water quality standards include an antidegradation policy that
maintains and protects existing uses and water quality conditions necessary to support such uses,
maintains and protects high quality waters where existing conditions are better than necessary to
protect designated uses, and maintains and protects water quality in outstanding national resource
waters. Except for certain temporary changes, water quality cannot be lowered in such waters.
4. General policies. States and Tribes may adopt policies and provisions regarding implementation
of water quality standards, such as mixing zones, variances, and low-flow policies. Such policies
are subject to EPA review and approval.
Water quality monitoring to support problem assessment is usually focused on documenting violations of
WQS in time (e.g., frequency of exceedance) and space (e.g., geographic extent of exceedance). Water
quality data for such purposes may be collected by an ongoing monitoring program (e.g., a state ambient
monitoring program) or by a reconnaissance study designed to provide a preliminary, low-cost overview
of water quality conditions in the area of interest (see section 2.4.2.1). The EPA ATTAINS database is the
repository for information from state integrated reporting (IR) on water quality conditions under sections
305(b), 303(d), and 314 of the Clean Water Act, and the Reach Address Database contains state IR
geospatial data. ATTAINS includes state-reported information on support of designated uses in assessed
waters, identified causes and sources of impairment, identified impaired waters, and TMDL status.
A detailed discussion of monitoring designs has been presented in chapter 2 of the 1997 guidance
(USEPA 1997b). Some designs appropriate for problem assessment have been discussed in section 2.4 of
this guidance. In general, monitoring designs appropriate for collecting data to support NPS problem
assessment include:
• Synoptic surveys designed to determine the magnitude and geographic extent of WQS violations,
often used to identify pollutant source areas within a watershed;
• Above/below monitoring, wherein a potential pollutant source area is bracketed between upstream
and downstream sampling points to assess the impact of the source area on pollutant levels; and
• Trend monitoring designed to collect long-term time-series data at one or more watershed
sampling points that are useful in determining the frequency and magnitude of exceedance of WQS.
Both above/below (if pre- and post-BMP data are collected) and trend monitoring designs can also be
applied to other monitoring objectives such as project effectiveness evaluation using permanent
monitoring stations equipped with automatic sampling equipment and continuous flow measurement
devices.
Grab samples with instantaneous flow measurements for a few sampling events may be sufficient for
initial problem assessment and source identification, but monitoring data for problem assessment should
include both baseflow and stormwater monitoring necessary to fully characterize the system. Storm
sampling is useful for documenting the delivery of pollutants by runoff and overland flow, critical
considerations for waters impacted by NPS. Combined with hydrologic data, basic climatic information
can be used to evaluate the seasons or times of the year when pollutant levels are highest or lowest and
when high flow events, drought, or other factors affect water quality. Note that concentration data alone
without concurrent flow or stage data are often of limited utility.
Biological monitoring is used widely in water quality assessments and EPA provides information and
links to resources addressing various aspects of the application of aquatic life criteria in water quality
assessments. Chapter 4 of this guidance is devoted to biological monitoring. The discussion below,
however, emphasizes the use and application of statistical analysis to chemical and physical monitoring
data for which there is a greater body of literature. See chapter 7 of Handbook for Developing Watershed
Plans to Restore and Protect Our Waters (USEPA 2008) for a broad discussion of approaches to
assessing water quality problems and identifying causes and sources of those problems using a wide range
of information sources.
7.5.2 Data Analysis Approaches
7.5.2.1 Summarize Existing Conditions
In a single stream or subwatershed, one monitoring location may be sufficient for problem assessment.
More often, sampling at two or more locations is necessary to evaluate existing conditions of the
watershed. Concurrently, sampling at two or more locations can aid in identification of subwatersheds
that merit further evaluation for pollution reductions or water resource protections.
When data from different locations in a watershed or different sampling time periods are consistent and
comparable (e.g., from a synoptic survey or from multiple watershed stations in the same monitoring
regime), a first step is to summarize existing conditions using univariate statistics - mean, median, range,
variance, interquartile range - for different sampling locations. If differences over time or flow conditions
are evident, it may be useful to group the data into separate baseflow and wet-weather strata or by season.
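A summary of this kind, grouped by site and flow stratum, might be produced along the following lines. The conductivity values, site names, and stratum labels are invented. Quartile definitions differ among software packages, and the sketch simply uses the default method of Python's statistics.quantiles.

```python
import statistics

def summarize(values):
    """Univariate summary statistics for one group of observations."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "variance": statistics.variance(values),  # sample variance
        "iqr": q3 - q1,
    }

# Hypothetical conductivity data keyed by (site, flow stratum).
data = {
    ("WS1", "baseflow"): [62, 70, 68, 75, 71],
    ("WS1", "wet"):      [48, 55, 51, 60],
    ("WS2", "baseflow"): [150, 162, 171, 158, 166],
    ("WS2", "wet"):      [120, 131, 140, 118],
}
for (site, stratum), values in data.items():
    s = summarize(values)
    print(site, stratum, s["n"], round(s["mean"], 1), s["median"], round(s["iqr"], 1))
```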
If enough samples have been collected (i.e., at least three), existing water quality can be compared across
multiple sites. Visual comparisons between sites can be depicted graphically using boxplots. Figure 7-16
shows a set of boxplots for one year of weekly conductivity data from three small watershed trend
stations in Vermont (Meals 2001). Conductivity at site WS1 appears to be substantially lower than that
observed at the other two stations; conductivity at WS2 tended to be somewhat higher than that observed
at WS3, with more frequent high extreme values. Mean or median values can be compared between two
sites using the unpaired Student's t-Test or a nonparametric equivalent such as the Wilcoxon Rank Sum
Test (also known as the Mann-Whitney Rank Sum Test). More than two sites can be compared using
Analysis of Variance or the Kruskal-Wallis k Sample Test. Adjustments for seasons or hydrologic
explanatory variables should be considered by employing appropriate statistical tests such as Analysis of
Covariance or the Seasonal Wilcoxon Rank Sum Test (also known as the Seasonal Mann-Whitney Rank Sum Test).
If the data between two sites are paired, differences can be tested using the paired Student's t-Test or the
Wilcoxon Signed Rank Sum Test. Paired tests are generally more powerful and should be used when
enabled by collecting samples at the same time period at two sites.
[Figure 7-16 plot: boxplots for WS1, WS2, and WS3]
Figure 7-16. Boxplots of conductivity at three Vermont monitoring stations, October 1999 -
September 2000 (Meals 2001)
Time series plots can visually reveal relationships over time and between locations. Figure 7-7 from section
7.3, for example, shows very clearly the seasonal cycle in E. coli counts in a Vermont stream, and
Figure 7-4 reveals a different behavior at Station 2 compared to other stations regarding P concentrations.
Time series statistical analyses can reveal autocorrelation and seasonality (see section 7.3.6).
Regression analysis between variables of primary interest (e.g., pollutant concentration/loads) and
explanatory variables such as stream discharge can assist in documenting hydraulic relationships at a
single monitoring location or between subwatersheds. Establishing relationships among variables can be
very helpful in project planning as well. Scientists involved in the Upper Grande Ronde (OR) NNMP
project, for example, explored relationships between fish and environmental factors via multivariate
analysis and found that management and restoration activities that focus on reducing the maximum annual
stream temperature would be the most effective in creating stream conditions that support salmonids
(Drake 1999).
7.5.2.2 Assess Compliance with Water Quality Standards
Water quality data can be evaluated for violation of water quality standards (WQS). Note that specific
requirements for documenting impairment in a regulatory sense may vary by circumstance. For some
states and for some pollutants, a single observation exceeding a WQS may be sufficient to designate
impairment. In other cases, determination of impairment must be based on violation of a WQS over a
defined period of time or number of observations. A WQS for bacteria to support shellfishing may, for
example, be based on a geometric mean of a number of different samples collected over a 30-day period,
rather than on a single sample. Sanitary surveys in North Carolina, for example, include a shoreline
survey to identify potential pollutant sources, a hydrographic and meteorological survey, and a
bacteriological survey (NCDENR 2016). Both the monitoring program and data analysis must be tailored
to the regulatory requirements that apply to the watershed under study.
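For standards expressed as a geometric mean over a 30-day period, the calculation might be sketched as follows. The window definition (the 30 days ending on each sample date), the sample dates, and the counts are all illustrative assumptions. The actual averaging period, minimum number of samples, and criterion value are set by the applicable standard.

```python
from datetime import date, timedelta
from statistics import geometric_mean

def rolling_geomean(samples, window_days=30):
    """samples: list of (date, count) pairs.
    For each sample date, return the geometric mean of all counts collected
    in the window_days-long period ending on that date."""
    out = []
    for end, _ in samples:
        start = end - timedelta(days=window_days - 1)
        counts = [c for d, c in samples if start <= d <= end]
        out.append((end, len(counts), geometric_mean(counts)))
    return out

# Hypothetical bacteria counts (cfu/100 ml).
samples = [
    (date(2000, 6, 1), 10),
    (date(2000, 6, 15), 1000),
    (date(2000, 8, 1), 100),
]
for end, n, gm in rolling_geomean(samples):
    print(end, n, round(gm, 1))
```

The geometric mean requires positive values, so zero counts and censored results (e.g., TNTC) need to be handled before this step.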
A data series should be plotted and the pattern evaluated for exceedance of WQS; plots of a time series at
a single station or boxplot of multiple stations can be examined. Figure 7-17 shows how a time series plot
can illustrate both the frequency and magnitude of violations of WQS. The dashed line represents the
water quality criterion for chronic exposure; all of the observations exceed that level. The red line marks
the acute criterion and shows that several observations exceeded that concentration. Moreover, most of
the excursions above the acute criterion occurred around April, suggesting a seasonal aspect to the
impairment. This kind of pattern may support inferences about pollutant source activity.
One way to evaluate the frequency or probability of violating WQS is to use probability plots or duration
curves. Figure 7-18 shows a cumulative frequency plot of three years of E. coli data from a Vermont
agricultural watershed (Meals 2001). In this case, it can be seen that compliance with the Vermont WQS of
77 cfu/100 ml E. coli occurred about 36 percent of the time and the stream was therefore considered
impaired for E. coli about 64 percent of the time. If the USEPA criterion of 235 cfu/100 ml were applied,
the stream would be in compliance with that criterion about 48 percent of the time and impaired about
52 percent of the time.
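The compliance frequencies read from a cumulative frequency plot can also be computed directly from the observations. The sketch below is one simple way to do so; the sample values are hypothetical and are not the Vermont data, and only the two criterion values (77 and 235 cfu/100 ml) come from the text.

```python
def fraction_at_or_below(samples, criterion):
    """Share of observations that are at or below a criterion."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(1 for s in samples if s <= criterion) / len(samples)

# Hypothetical single-sample E. coli counts (cfu/100 ml)
ecoli = [12, 40, 77, 150, 235, 400, 900, 2500, 60, 5000]

vt_compliance = fraction_at_or_below(ecoli, 77)    # Vermont WQS
epa_compliance = fraction_at_or_below(ecoli, 235)  # USEPA criterion

print(f"At or below  77 cfu/100 ml: {vt_compliance:.0%}")
print(f"At or below 235 cfu/100 ml: {epa_compliance:.0%}")
```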
[Figure: Observed Aluminum vs. Water Quality Standards. Time series of aluminum observations with lines marking the chronic criterion and the acute criterion.]
Figure 7-17. Example time series plot of observed aluminum concentrations compared to water
quality criteria
[Figure: WS 2 Mean Weekly EC Counts 1995 - 1998. X-axis: single-sample E. coli count (cfu/100 ml), logarithmic scale from 10 to 1,000,000.]
Figure 7-18. Cumulative frequency plot of three years of E. coli data from a Vermont stream
(adapted from Meals 2001). Red lines represent frequency of observations at or below the VT WQS
of 77 cfu/100 ml and the frequency of observations at or below the EPA criterion of 235 cfu/100 ml.
7.5.2.3 Identify Major Pollutant Sources
Cost-effective treatment of watersheds to address the pollutants and other causes of water quality
problems requires knowledge of the sources contributing to the problems. Commonly used approaches to
identifying and characterizing sources use both water quality and land-based information at varying levels
of detail and quality (USEPA 2008). This section describes methods for analyzing water quality and
associated monitoring data to characterize and aid in the prioritization of pollutant sources as part of the
watershed planning process. See section 4.4.5 for an example of using biological monitoring in the Lake
Allatoona/Upper Etowah River (GA) watershed.
Data from a synoptic survey or from regular monitoring of several subwatersheds combined with data on
land use, management, or other land-based characteristics can inform understanding of major pollutant
sources in a watershed. Correlation or regression analysis can be applied to explore relationships between
pollutant concentrations and subwatershed characteristics, e.g., total P (TP) concentrations vs. manured
cropland or suspended sediment concentration vs. cropland in cover crops. Annual mean or median
pollutant concentrations could be compared to annual data on land use/management activities because
concentrations vary widely between individual events, whereas land characteristics are relatively
constant within a single year or crop season. However, this simplification will not reveal seasonal and
hydrologic variability in water quality or responses to short term land use changes such as animal
numbers or fertilization. Where suitable knowledge of land use or land management is available, it may
be more useful to provide water quality summary data for different periods that reflect distinctly different
land use/management conditions (e.g., after spring manure applications vs. remainder of the year) during
the monitoring period.
Boxplots or bivariate scatterplots can be compared between monitoring sites that reflect distinctive land
use or management, thereby suggesting important pollutant source activities. If sufficient data from
different subwatersheds or sampling stations exist, analysis of variance (ANOVA), or the nonparametric
Kruskal-Wallis k Sample test can be used to test for significant differences in pollutant concentrations
between sites and then compare these findings to differences in land use between the drainage areas
sampled (graphical or tabular summaries). Analysis of covariance (ANCOVA) should be considered in
cases where data are sufficient to test for differences among sites or seasons with adjustment for
covariates such as precipitation or flow. See sections 4.6 and 4.8 of the 1997 guidance (USEPA 1997b)
for a discussion of ANOVA and ANCOVA.
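A minimal sketch of the Kruskal-Wallis comparison among sites is shown below, written with the Python standard library only. The site data are hypothetical, the function name is illustrative, and the chi-square critical value (5.991 for 2 degrees of freedom at the 0.05 level) is a large-sample approximation that can be rough for groups as small as these. A statistical package would normally be used in practice.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic using average ranks and a tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                                # number of tied values
        rank_of[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3.0 * (n + 1)
    correction = 1.0 - tie_term / float(n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction

# Hypothetical TP concentrations (mg/L) at three monitoring sites
site_a = [0.05, 0.07, 0.06]
site_b = [0.10, 0.12, 0.11]
site_c = [0.20, 0.25, 0.22]

h_stat = kruskal_wallis_h([site_a, site_b, site_c])
CHI2_CRIT_DF2 = 5.991  # chi-square critical value, 2 d.f., alpha = 0.05
sites_differ = h_stat > CHI2_CRIT_DF2
print(f"H = {h_stat:.2f}, significant at 0.05: {sites_differ}")
```

A significant result indicates only that at least one site differs; the differences in land use between the drainage areas still need to be examined separately.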
If flow data are available with concentration data, load estimates can be calculated to compare the
magnitudes of pollutant sources (see section 7.9 for load estimation methods). The spatial and temporal
resolution possible for load estimates will be determined by the number and location of sampling sites
and the time frame and frequency of sampling events, respectively. Source-specific or subwatershed loads
will generally be more helpful than loads at the watershed outlet, and in many cases seasonal loads or a
classification of event vs. baseflow loads will be very helpful in the watershed project planning phase (see
section 7.6).
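To illustrate how paired flow and concentration data translate into loads for comparing sources, the sketch below sums daily loads for two subwatersheds and expresses them on both an absolute and an areal basis. All values are hypothetical, and the simple product of daily mean flow and concentration is only one of the load estimation approaches discussed in section 7.9.

```python
SECONDS_PER_DAY = 86400

def daily_load_kg(flow_m3s, conc_mg_l):
    """Daily load in kg from mean daily flow (m3/s) and concentration (mg/L).

    mg/L x m3 = g, so the product is divided by 1000 to obtain kg.
    """
    return flow_m3s * conc_mg_l * SECONDS_PER_DAY / 1000.0

def total_load_kg(flows, concs):
    """Sum of daily loads over paired daily flow and concentration records."""
    return sum(daily_load_kg(q, c) for q, c in zip(flows, concs))

# Hypothetical two-day records for two subwatersheds
load_a = total_load_kg([2.0, 4.0], [0.1, 0.2])   # 1,000 ha drainage area
load_b = total_load_kg([1.0, 1.0], [0.5, 0.5])   # 200 ha drainage area

areal_a = load_a / 1000.0   # kg/ha
areal_b = load_b / 200.0    # kg/ha
print(f"A: {load_a:.1f} kg, {areal_a:.4f} kg/ha")
print(f"B: {load_b:.1f} kg, {areal_b:.4f} kg/ha")
```

In this constructed example the two subwatersheds deliver the same absolute load, but the smaller one has an areal load five times higher, which is the kind of contrast that helps prioritize source areas.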
Correlation does not establish causation. Specifics of pollutant source activity and
transport/delivery mechanisms must be considered to identify causation. Time of travel studies for
various points in the watershed, for example, can be helpful in better characterizing the relationship
between various sources or subwatersheds and downstream water quality. USGS describes methods for
measuring time of travel (Kilpatrick and Wilson 1989).
7.5.2.4 Define Critical Areas
Data collected in the problem assessment phase can be used to help define critical source areas for
pollutants, knowledge that is key to understanding the watershed, prioritizing land treatment, and evaluating
project effectiveness. With concurrent data from monitored subwatersheds or tributaries (e.g., from a
synoptic survey), statistical tests such as the Student's t Test or ANOVA can be used to identify significant
differences in pollutant concentration or load among multiple sampling points. Such data can be displayed
graphically in a map to show watershed regions that may be major contributors of pollutants. Figure 7-19,
for example, shows a map of NO2+NO3-N concentrations from an April 2003 synoptic survey in the
Corsica River (MD) watershed (Primrose 2003). Nitrate/nitrite concentrations were found to be excessive in
four subwatersheds, high in sixteen, and moderately elevated in seventeen others. Benchmarks for
determining excessive/high/moderate or similar categories can be based on numeric water quality criteria or
reference watershed values. If flow data were also available, it would be possible to estimate loads and
compare subwatersheds on the basis of absolute (e.g., kg TP) or areal (e.g., kg TP/ha) loads. Figure 4-3 of
section 4.4.5 illustrates how biological monitoring data from the Lake Allatoona/Upper Etowah River (GA)
watershed were used for site-specific assessments of biological condition.
[Map: Corsica WRAS nutrient synoptic survey, April. NO2+NO3 concentration (mg/L) by subwatershed.]
Figure 7-19. Map of synoptic sampling results from 41 stations in the Corsica River Watershed
(Maryland) for NO2+NO3-N concentration (Primrose 2003). Pink and red shaded subwatersheds
represent drainage areas contributing high (3-5 mg/L) and excessive (>5 mg/L) NO2+NO3-N
concentrations, respectively.
Assessment of critical areas using a small set of water quality data has some limitations. Conditions
determining pollutant generation (e.g., storm event, season, management schedules) must be considered in
drawing conclusions about critical areas. Data collected during the active crop growth season may show a
very different situation from data collected in winter, although for source identification purposes, it may be
preferable to sample during the most critical times of year. The data mapped in Figure 7-19, for example,
were collected in April, during or immediately following the spring planting and fertilizer application season
when N losses from recently applied fertilizers might be expected to be high. Secondly, the spatial
resolution of source area identification is limited by the resolution of the sampling network. Detailed site
evaluation and/or modeling may be required to identify critical source areas on a finer scale.
Another problem with using only a small set of water quality samples to determine critical areas is that
some sources are by default removed from consideration. For example, the role of streambanks and
stream channels in delivering sediment and sediment-bound pollutants such as P is often only partially
understood at the beginning of watershed projects. The Sycamore Creek (MI) NNMP project, for
example, focused on no-till and continuous cover to reduce sediment loads, but later concluded that the
stream channel stabilization implemented in one subwatershed must have been at least as important as
no-till in reducing suspended solids loads (Suppnick 1999). Solutions to sedimentation problems in Lake
Pittsfield (IL) progressed from an initial emphasis on no-till, terraces, and waterways (1979-1985), to
numerous water and sediment control basins and a single large sedimentation basin (1992-1996), and then
to stream restoration using stone weirs and streambank vegetation (1998) when it was learned that
massive bank erosion was increasing sediment yield (Roseboom et al. 1999). See section 4.4.5 for a
detailed example of using biological monitoring in the Lake Allatoona/Upper Etowah River (GA)
watershed.
7.5.2.5 Additional Approaches
In most cases, projects in the planning phase have limited information with which to perform statistical
analyses, particularly advanced procedures. Where such data exist, however, multivariate statistical
procedures such as factor analysis, principal component analysis, canonical correlation analysis, and
cluster and discriminant analysis can be used to define (and perhaps subsequently adjust for) complex
relationships among variables such as precipitation, flow, season, land use, or agricultural activities that
influence NPS problems. Spatial and temporal patterns can be revealed with these techniques. Scatterplots
of ordination scores can be a useful method to summarize multivariate datasets and visualize spatial and
temporal patterns.
Ordination techniques can also be powerful during the EDA phase when looking for patterns and
structure in the data. The upper Grande Ronde basin project, for example, used correlation and canonical
correspondence analysis to determine which environmental variables are largely responsible for
differences in fish assemblages between reference and impaired sites (Drake 1999). Figure 7-20 shows a
correspondence analysis plot showing intermediate/impaired sites and reference sites ordinating on the
left and right side of the origin (Drake 1999). Scatterplots such as Figure 7-20 can be a useful way to
summarize multivariate datasets and visualize these spatial and temporal patterns. With such variables
identified, the next step was applying principal component analysis to determine if these variables could
be used to track stream improvements over time. These statistical procedures are discussed briefly below.
The reader is referred to statistics textbooks and other resources for additional information. Further, it is
recommended that these procedures are performed by or in consultation with a trained statistician.
[Figure: scatterplot of correspondence analysis site scores. X-axis: CA Axis I. Each site is plotted with its own symbol.]
Site key:
DARKL: Dark Canyon Creek - Lower site
DARKU: Dark Canyon Creek - Upper site
LIML: Limber Jim Creek - Lower site
LIMU: Limber Jim Creek - Upper site
LOOK: Lookout Creek
MCCL1 & 2: McCoy Creek - Lower site 1 & 2
MCCM: McCoy Creek - Middle site
MCCR: McCoy Creek - Restored reach
MEADL: Meadow Creek - Lower site
MEADS: Meadow Creek at Starkey
Figure 7-20. Correspondence analysis biplot of Grande Ronde fish data (Drake, 1999)
Principal component analysis (PCA) is a multivariate technique for examining linear relationships among
several quantitative variables, particularly when the variables are correlated to each other. PCA can be
used to determine the relative importance of each independent variable and determine the relationship
among several variables. Given a data set with p numeric variables, p principal components or factors can
be computed. Each principal component (or factor) is a synthesized variable that is a linear combination
of the original variables (SAS Institute 1985). The first principal component explains the most variance in
the original data, while the second principal component is uncorrelated with (i.e., orthogonal to or
statistically independent from) the first principal component and explains the next greatest proportion of
the remaining variance. This process is continued until there are p statistically independent principal
components that explain as much of the variance as possible. The results of PCA can often be enhanced
through factor analysis, which is a procedure that can be used to identify a small number of factors that
explain the relationships among the original variables. One important aspect of factor analysis is the
ability to transform the factors (i.e., reconfigure the linear combinations of original variables) from PCA
so that they make more sense scientifically. The SAS procedures PROC PRINCOMP and PROC
FACTOR can be used for these analyses (SAS Institute 2010).
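For readers without access to SAS, the idea of the first principal component can be sketched with the Python standard library. The example below standardizes three hypothetical water quality variables, forms their correlation matrix, and extracts the first principal component by power iteration. Power iteration is used here only to keep the sketch self-contained; it is not the algorithm used by PROC PRINCOMP, and the data and function names are illustrative assumptions.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def covariance(rows):
    """Sample covariance matrix; on standardized data this is the correlation matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]

def first_component(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    p = len(matrix)
    v = [float(j + 1) for j in range(p)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("power iteration collapsed to the zero vector")
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(matrix[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return eigenvalue, v

# Hypothetical observations: (TP mg/L, TSS mg/L, flow m3/s)
rows = [(0.10, 20, 1.0), (0.15, 35, 1.8), (0.30, 80, 4.0),
        (0.12, 22, 1.1), (0.25, 60, 3.2)]

corr = covariance(standardize(rows))
eigenvalue, loadings = first_component(corr)
share = eigenvalue / len(corr)  # total variance of p standardized variables is p
print(f"PC1 explains {share:.1%} of the variance; loadings = {loadings}")
```

Because the three hypothetical variables rise and fall together, nearly all of the variance is captured by a single component, which is the situation in which replacing correlated variables with a few components is most useful.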
Principal component analysis and factor analysis can be used in regression analysis to reduce the number
of variables or degrees of freedom (d.f.) by using a subset of the principal components (factors) that
explain the majority of the variance of the data set instead of using all of the original variables. This
essentially reduces the degrees of freedom used, but incorporates most of the information from each of
the explanatory variables, hence increasing the validity and power of the regression analysis. Using PCA
to incorporate many explanatory variables into a regression model is superior to other techniques that
arbitrarily drop explanatory (X) variables; those may incorrectly drop the more important variables due to
multicollinearity between the X's. In principle, PCA and factor analysis could be beneficial to projects in
a number of other ways, including helping investigators focus problem assessments on the most important
indicators and stressors, aiding in the selection of water quality and land use/treatment variables to be
used in the monitoring program, and guiding BMPs toward the most important pollutant sources.
Canonical correlation analysis (CCA8) is a technique for analyzing the relationship between two sets of
multiple variables (e.g., a set of nutrient variables and a set of biomass-related variables). This
multivariate approach examines said relationship "by finding a small number of linear combinations from
each set of variables that have the highest possible between-set correlations" (SAS Institute 1985). These
linear combinations of variables from each set are synthetic variables called 'canonical variables' and the
coefficients of the linear combinations (which are similar to Pearson r) are referred to as the 'canonical
weights' (SAS Institute 1985). The first canonical correlation is the correlation between the canonical
variables from each set that maximizes the correlation value in accounting for as much as possible of the
variance in the variable sets. The second canonical correlation is between a second set of canonical
variables, is uncorrelated with the first canonical variables, and produces the second highest correlation
coefficient. Additional correlations are established until all variance is explained or the maximum number
of canonical correlations has been used (i.e., the number of variables in the smaller set). As such, the
canonical variables are similar to principal components in summarizing total variation (SAS Institute
1985).
In simple terms, CCA can be used in problem assessment to look for relationships between sets of
grouped variables to help better understand existing water quality problems or the relationships between
land use/management variables (e.g., imperviousness, acreage receiving manure) and pollution variables
(e.g., discharge, pollutant concentrations) to help guide decisions on BMP selection and placement. There
are several output statistics (e.g., significance, correlations, coefficients) in CCA, and the reader is
referred to statistical textbooks and other sources for additional details. It should be noted, however, that
while many correlations may be output from a specific analysis, only the strongest correlations should be
considered for interpretation.
Discriminant analysis is used to assess relationships between a categorical (grouping) variable
(e.g., presence or absence of a fish species) and multiple quantitative (predictor) variables (e.g., pH,
temperature, D.O.). The category options (e.g., present or absent) are assigned a priori; verification
of the a priori grouping is normally performed during discriminant function analysis. Discriminant
analysis can be used to verify the observational groupings defined by each cluster (see section 7.3.8) or
other defined grouping based on the values of the quantitative variables. This type of analysis is referred
to as 'classificatory discriminant analysis' and is probably the most common application of discriminant
analysis in water quality research. The SAS procedures DISCRIM (parametric) and NEIGHBOR
(nonparametric) can be used to perform classificatory discriminant analyses (SAS Institute 1985).
Discriminant analyses can also be used to define a subset of quantitative variables that best describes the
differences among the groups; see, for example, the SAS procedure STEPDISC (SAS Institute 1985).
Canonical discriminant analysis is equivalent to canonical analysis described above except that a set of
quantitative variables is related to a set of classification variables (SAS Institute 1985). Principal
component analysis is used as an intermediate step in the calculation of the canonical variables. The SAS
procedure CANDISC can be used to perform canonical discriminant analyses (SAS Institute 1985).
8 Canonical correspondence analysis is also often abbreviated as CCA.
Cluster and discriminant analyses can be used to understand and adjust for relationships among water
variables. For example, spatial heterogeneity and homogeneity can be revealed. This may be necessary to
study the transport of a pollutant in a system or to remove the spatial component in order to detect
changes over time.
In many cases, watershed projects use simulation models to help with problem assessment and planning.
Water quality models that include land use/land treatment and are calibrated using water quality data from
the watershed or similar watershed(s) can also assist with identification of critical pollutant sources. The
reader is referred to USEPA's watershed project planning guide (USEPA 2008) and TMDL modeling
website for additional information on water quality models.
7.6 Data Analysis for Project Planning
Existing data or data collected specifically in support of a developing watershed project may play
important roles in project planning, including determination of land treatment needs and design of a water
quality monitoring program. These and other aspects of watershed planning are addressed in detail in
Handbook for Developing Watershed Plans to Restore and Protect Our Waters (USEPA 2008).
7.6.1 Estimation and Hypothesis Testing
Project planning - including setting clear project goals - should result in the articulation of hypotheses
that can be tested using appropriate statistical tests. The hypothesis must be stated in quantitative terms
that can be adequately addressed by statistical analyses and must be directly related to the stated water
quality monitoring goals.
The null hypothesis (H0) is a specific hypothesis about a population that is being tested by analyzing the
collected sample data. In water quality studies, the null hypothesis is generally a statement of no change,
no trend over time or space, or no relationship(s). In contrast, the alternative hypothesis (Ha or H1) is
generally the opposite of the null, e.g., a statistically significant change, a trend over time or space, a
relationship between 2 or more variables.
The general approach to hypothesis testing is to:
1. State the null and alternative hypotheses. For example:
• H0 - There is no statistically significant trend over 10 years in TP at the subwatershed stream
outlet
• Ha - There is a statistically significant trend over 10 years in TP at the subwatershed stream
outlet
2. Determine a parameter (e.g., mean, median, slope/trend over time) that would provide a point
estimate to test if the sample data follow a distribution that would be expected if the null
hypothesis was true, or more importantly, to test if there is evidence that the data come from an
alternative population.
3. Design a sampling plan that would collect data to test if there is statistical evidence to reject the
null hypothesis and accept the alternative hypothesis.
4. Analyze the sample data to calculate the sample point estimate and its confidence interval based
upon the collected data variability.
5. Compare the confidence interval to the point estimate under the null hypothesis to determine if
there is statistical evidence to reject the null and accept the alternative hypothesis (e.g., statistical
evidence that a trend has occurred over time).
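The steps above can be sketched for the TP trend example as follows. The sketch fits an ordinary least-squares line to ten hypothetical annual mean TP values, computes a 95 percent confidence interval for the slope, and checks whether the interval contains zero (the slope under H0). It assumes independent, approximately normal residuals, which may not hold for autocorrelated water quality data; the critical value 2.306 is the tabulated two-sided 95 percent t value for 8 degrees of freedom.

```python
import math

def slope_confidence_interval(x, y, t_crit):
    """Least-squares slope with a confidence interval of half-width t_crit * SE."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope - t_crit * std_err, slope + t_crit * std_err

# Hypothetical annual mean TP (mg/L) at a subwatershed outlet, years 1-10
years = list(range(1, 11))
tp = [0.50, 0.48, 0.47, 0.45, 0.46, 0.42, 0.41, 0.40, 0.38, 0.37]

T_CRIT = 2.306  # two-sided 95 percent, n - 2 = 8 degrees of freedom
slope, lower, upper = slope_confidence_interval(years, tp, T_CRIT)
reject_null = not (lower <= 0.0 <= upper)
print(f"slope = {slope:.4f} mg/L/yr, 95% CI = ({lower:.4f}, {upper:.4f})")
print("Reject H0 of no trend" if reject_null else "Fail to reject H0")
```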
It should be noted that if the null hypothesis is not rejected, it is inappropriate to state that the null
hypothesis is accepted. Instead, failure to reject the null or failure to detect significant differences or
trends is the proper way to state such results. Failure to reject the null could be due to high sample
variability, low sample size, or no real differences or trends. The chance of documenting a true difference
or trend with statistical significance is improved by increasing sample frequency and longevity, and by
using a monitoring design that will isolate the change/trend, while accounting for some of the high
variability in data values observed in natural water quality systems. Effective monitoring designs are
described in chapters 2-4.
There are two types of errors in hypothesis testing:
1. Type I: The null hypothesis (H0) is rejected when H0 is really true.
2. Type II: The null hypothesis (H0) is not rejected when H0 is really false.
The probability of making a Type I error is equal to the significance level (α). The probability of a Type II
error is β. The power of a test (1 - β) is the probability of correctly rejecting H0 when H0 is false. While
the significance level is often taken for granted to be 0.05, a different value might be more appropriate for
some NPS studies.
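The relationship among significance level, sample size, and power can be illustrated with a simplified case: a one-sided z-test of a mean with known standard deviation. This is an assumption made to keep the sketch short; real monitoring designs usually require t-based or nonparametric power calculations. The effect size and standard deviation below are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Power (1 - beta) of a one-sided z-test to detect a true shift of delta."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)
    return 1 - z.cdf(z_crit - delta / (sigma / math.sqrt(n)))

# Hypothetical: detect a 0.05 mg/L change in mean TP when sigma = 0.15 mg/L
for n in (12, 52, 104):
    print(f"n = {n:3d}: power = {z_test_power(0.05, 0.15, n):.2f}")

# A less stringent significance level raises power at the cost of more Type I errors
power_05 = z_test_power(0.05, 0.15, 52, alpha=0.05)
power_10 = z_test_power(0.05, 0.15, 52, alpha=0.10)
```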
7.6.2 Determine Pollutant Reductions Needed
To set goals for a watershed project, it is important to estimate the pollutant reduction required to meet
water quality objectives, usually to meet WQS. There are several approaches to developing such
estimates:
• Mass balance/TMDL. In a TMDL setting, a load reduction goal is established based on a mass
balance approach. Monitoring data are used to estimate the pollutant load a waterbody can receive
while complying with WQS. The pollutant load reduction goal for a watershed project becomes the
difference between the current load and the TMDL which is defined by:
TMDL = WLA + LA + MOS
Where WLA is the Waste Load Allocation (the allowable point source load);
LA is the Load Allocation (the allowable nonpoint source load); and
MOS is the Margin of Safety to account for uncertainty in the other estimates.
Note that the LA term (NPS load) is often estimated by difference and is not subdivided
by source type. The pollutant load reduction goal for a watershed project focused on
agricultural sources, for example, will not necessarily address the full difference between
current load and LA because there may be other significant nonpoint sources in the
watershed such as urban and residential nonpoint sources. TMDLs are frequently based
on modeling analysis, but also use available water quality data to the extent possible.
Detailed information on TMDL analysis is available through USEPA (2013). See Case Study 5 for
an illustration of how water quality data can be used in the development of a watershed-scale mass
balance. The accuracy of this approach, however, depends on the quality and representativeness of
the data used in the analysis. In Case Study 5, for example, because internal P loading is being
computed based on estimates of the other terms, underestimation of external P loading will lead to
an equal overestimate of internal P loading, thus confounding interpretation of the effects of alum
application. For this and other reasons, the adaptive management approach is a cornerstone of
TMDL implementation. As additional data are collected, mass balances should be revisited.
• Receiving waterbody relationships. Numerous tools exist to evaluate the impacts of pollutant
loads on waterbodies that may be helpful in estimating pollutant load reduction goals. In lakes, for
example, there are many analytical procedures and modeling tools to relate phosphorus load to lake
eutrophication, including the "Vollenweider models" (Vollenweider 1976, Vollenweider and
Kerekes 1982) and BATHTUB (Walker 1999). Such tools may be used to "back-calculate"
permissible phosphorus loads to lakes. Other receiving water models may be used for similar
purposes in other types of waterbodies, e.g., QUAL2K, CONCEPTS, and WASP. All of these
models can employ available monitoring data to both establish model parameter values and to
conduct calibration and validation. Additional information on models useful in this kind of analysis
can be found in the USEPA TMDL Modeling Toolbox. Many of these models need to be calibrated
with water quality data collected from the study watershed or similar watershed(s).
• Load duration curves. A flow or load duration curve is a cumulative frequency plot of mean daily
flows or daily loads at a monitoring station (e.g., a watershed trend station or tributary outlet) over a
period of record, with values plotted from their highest value to lowest without regard to
chronological order (see section 7.9.3). For each flow or load value, the curve displays the
corresponding percent of time (0 to 100) that the value was met or exceeded over the specified
period - the flow or load duration interval. Extremely high values are rarely exceeded and have low
flow duration interval values; very low values are often exceeded and have high flow duration
interval values. An estimate of the pollutant reductions needed is obtained by comparing a load
duration curve developed from monitored loading data against a similar curve with loads estimated
as the product of monitored flows and the pollutant concentration established in a WQS. Detailed
information on the application of load duration curves to pollutant load reduction estimates can be
found in An Approach for Using Load Duration Curves in the Development of TMDLs (USEPA
2007).
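The load duration comparison described in the last item above can be sketched as follows. Observations are ranked from highest to lowest flow, each is assigned a flow duration interval, and the observed load is compared with the allowable load computed from the same flow and the WQS concentration. The data and the criterion are hypothetical, and the plotting position rank/(n + 1) is one common convention assumed here, not one prescribed by the text.

```python
SECONDS_PER_DAY = 86400

def daily_load_kg(flow_m3s, conc_mg_l):
    """Daily load in kg from mean daily flow (m3/s) and concentration (mg/L)."""
    return flow_m3s * conc_mg_l * SECONDS_PER_DAY / 1000.0

def load_duration_table(flows, concs, criterion):
    """Rows ordered from highest to lowest flow with observed and allowable loads."""
    order = sorted(range(len(flows)), key=lambda i: flows[i], reverse=True)
    n = len(order)
    table = []
    for rank, i in enumerate(order, start=1):
        observed = daily_load_kg(flows[i], concs[i])
        allowable = daily_load_kg(flows[i], criterion)
        reduction = max(0.0, 1.0 - allowable / observed) if observed > 0 else 0.0
        table.append({
            "interval_pct": 100.0 * rank / (n + 1),  # percent of time flow is met or exceeded
            "flow": flows[i],
            "observed_kg": observed,
            "allowable_kg": allowable,
            "reduction": reduction,                  # fraction of observed load to remove
        })
    return table

# Hypothetical daily mean flows (m3/s) and TP concentrations (mg/L)
flows = [12.0, 0.8, 3.5, 6.0, 1.5]
tp = [0.30, 0.05, 0.12, 0.20, 0.08]
table = load_duration_table(flows, tp, criterion=0.10)

for row in table:
    print(f"{row['interval_pct']:5.1f}%  flow={row['flow']:5.1f}  "
          f"observed={row['observed_kg']:7.1f} kg  allowable={row['allowable_kg']:6.1f} kg  "
          f"reduction={row['reduction']:.0%}")
```

In this constructed example the largest reductions are needed at the high-flow end of the curve, a pattern that would point toward runoff-driven sources.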
CASE STUDY 5: MASS-BALANCE APPROACH USED FOR ESTIMATING
PHOSPHORUS LOADS
Grand Lake St. Marys (GLSM) is located in the Grand Lake
St. Marys watershed in western Ohio (Figure CS5-1). GLSM
is a large (5,000 ha), man-made, shallow (mean depth: 1.6
m) lake originally constructed as a "feeder reservoir" for
the Miami-Erie Canal (Moorman et al. 2008; ODNR 2013;
Tetra Tech, Inc. 2013). Over 90 percent of the watershed is
in cropland with associated livestock operations.
Cyanobacteria blooms in GLSM result both from external
and internal phosphorus loading (Tetra Tech, Inc. 2013).
Project highlights:
• Western Ohio
• Treated a large, shallow lake with aluminum sulfate to reduce internal phosphorus loads
• Used the mass-balance approach to estimate internal phosphorus loads pre- and post-treatment
The lake was treated with aluminum sulfate (alum) in June
2011 (23.6 mg Al/L, 49.6 g/m2) and in April 2012 (21.5 mg
Al/L, 45.2 g/m2) to reduce internal phosphorus loads. The
combined treatments totaled approximately 70 percent of
the recommended treatment for the lake (recommended treatment was 86 mg Al/L, 120 g/m2).
Monitoring data from 2012 were compared against monitoring data collected between 2010 and
2011 to analyze the results of the treatments (Tetra Tech, Inc. 2013). While the assessment also
included analysis of algal biomass and aluminum in the water column and sediments, this
summary focuses on total phosphorus (TP).
Figure CS5-1. Grand Lake St. Marys watershed
Monitoring and Sampling
Data from eleven water column
monitoring sites were used in the
assessment, with the five lake sites
(shown in Figure CS5-2) sampled every
two weeks after alum treatment.
Samples at these five sites were always
collected at 0.5 m from the surface,
while some sampling events also
included samples at the bottom of the
water column. Samples were analyzed
for TP, soluble reactive phosphorus,
alkalinity, and chlorophyll. The Ohio
Environmental Protection Agency
(OEPA) also conducted routine sampling
of tributaries, with sample analysis
including TP (Tetra Tech, Inc. 2013).
[Map legend: USGS flow station, sampling stations, WWTP, places]
Figure CS5-2. Tributary and lake sampling stations
Mass-Balance Approach
The mass-balance approach helped
estimate internal TP loading before and
after alum treatment. This approach consisted of five basic steps: (1) Estimating the water budget
for GLSM; (2) Developing a basic P budget for the same time period as the water budget (May
2010 through May 2011 prior to any alum addition); (3) Predicting GLSM mean TP concentrations
using a P mass balance model for which input values are based on available monitoring data for
inflows and outflows; (4) Comparing estimated GLSM mean TP concentrations with measured TP
concentrations; and (5) Adjusting the rates of P sedimentation and release of P into the water
column (internal loading) to match predicted with measured TP concentrations in GLSM (Tetra
Tech, Inc. 2013).
Water budget
A water budget for GLSM was determined at a two-week time step. Change in lake storage was
determined using the following equation:
Change in GLSM lake storage = Inflow (creek and WWTP inputs) + Precipitation -
Outflow (water treatment plant withdrawal, groundwater loss, outlets) - Evaporation
+ Groundwater
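The storage equation above can be sketched in code as follows. This is a minimal illustration, not the Tetra Tech implementation: the class name, function name, and all volumes are hypothetical, and every term is assumed to be a volume in the same unit for one two-week step.

```python
from dataclasses import dataclass


@dataclass
class WaterBudgetStep:
    """Water volumes for one two-week step, all in the same unit (e.g., m^3)."""
    inflow: float          # creek and WWTP inputs
    precipitation: float   # rain falling directly on the lake surface
    outflow: float         # water treatment plant withdrawal, groundwater loss, outlets
    evaporation: float
    groundwater: float     # groundwater inflow


def storage_change(step: WaterBudgetStep) -> float:
    """Change in lake storage, term by term as in the equation above."""
    return (step.inflow + step.precipitation - step.outflow
            - step.evaporation + step.groundwater)


# Hypothetical volumes (m^3) for a single two-week step; not GLSM data.
example_step = WaterBudgetStep(
    inflow=5.0e6, precipitation=1.2e6, outflow=4.0e6,
    evaporation=1.5e6, groundwater=0.0,
)
delta_storage = storage_change(example_step)
print(f"Change in storage: {delta_storage:,.0f} m^3")
```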
The only tributary for which flow data were collected continuously was Chickasaw Creek where
USGS has a gaging station (see Figure CS5-2). Wastewater treatment plant (WWTP) flow volumes
were obtained from WWTP records and removed from the creek flow volumes so that loads from
the four WWTPs in the watershed to GLSM could be calculated separately. Flow volumes from
ungaged tributaries and areas draining directly to the lake were estimated by multiplying the
adjusted Chickasaw Creek flow (minus WWTP) by the ratio between the other contributing
drainage and Chickasaw Creek drainage areas. If creeks were observed to be dry, the flow was
assumed to be zero for that period (Tetra Tech, Inc. 2013).
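The drainage-area ratio method described above can be sketched as a short function. The function name, argument names, and numbers are hypothetical; the only logic taken from the text is the WWTP adjustment, the area ratio, and the zero-flow rule for dry creeks.

```python
def ungaged_flow(gaged_flow, wwtp_flow, ungaged_area, gaged_area, observed_dry=False):
    """Estimate flow volume from an ungaged drainage for one time step.

    gaged_flow   : flow volume at the gaged creek (includes WWTP discharge)
    wwtp_flow    : WWTP discharge volume to remove from the gaged flow
    ungaged_area : drainage area of the ungaged tributary or direct-drainage area
    gaged_area   : drainage area of the gaged creek (same area unit)
    observed_dry : True if the creek was observed to be dry in this period
    """
    if observed_dry:
        return 0.0
    adjusted_flow = gaged_flow - wwtp_flow
    return adjusted_flow * (ungaged_area / gaged_area)


# Hypothetical example: the ungaged drainage is twice the size of the gaged one.
estimate = ungaged_flow(gaged_flow=1000.0, wwtp_flow=100.0,
                        ungaged_area=30.0, gaged_area=15.0)
print(estimate)
```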
Precipitation records were obtained from a nearby weather station and multiplied by the surface
area of the lake to get a volume of direct inflow from precipitation. Monthly mean pan
evaporation rates were taken from the Hydrologic Atlas for Ohio (Harstine 1991; after Farnsworth
and Thompson 1982).
Groundwater inflow was negligible and the rate for groundwater loss was assumed based on
productivity of the underlying aquifer. This rate was adjusted such that there was more loss or
recharge during the drier months when there was no outflow. Daily WWTP withdrawals were
obtained from plant records. GLSM has two spillways, neither of which is continuously gaged.
Lake level data were used to determine when losses would occur over the spillways and two
instantaneous flow measurements were used to check estimated flows over the west spillway
which is the major outflow. Outflow over the east spillway was assumed to be 10 percent of the
west spillway outflow based on communication with local experts (Tetra Tech, Inc. 2013).
Total Phosphorus mass-balance model
A TP mass balance model was developed using the same two-week time step as used for the
water budget (Perkins et al. 1997; Tetra Tech, Inc. 2013). Mass was estimated for two-week
periods by multiplying the estimated flow volume and mean TP concentration. The principal use
of the mass-balance model was to estimate changes in internal P loading for GLSM based on input
of measured and estimated values for other terms in the model. Model calibration was based on
matching predicted with measured lake TP concentration (Tetra Tech, Inc. 2013).
The following model was used to predict whole lake TP concentrations:
dTP/dt = Wext + Wint - Ws - Wout,
where Wext is external loading, Wint is internal loading, Ws is loss to sediments, and Wout is loss
through the lake outlet. Predicted whole-lake TP concentrations were compared to observed
whole lake mean TP concentrations determined from monitoring at the five lake sites (Figure
CS5-2).
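A minimal sketch of how the mass-balance equation can be stepped forward in time is shown below. It assumes that each term is already expressed as a TP mass (kg) per two-week step and that lake volume is constant; the function names, loads, and volume are hypothetical. In the assessment itself the sedimentation and internal loading terms were adjusted during calibration, which this sketch does not attempt.

```python
def step_tp_mass(tp_mass, w_ext, w_int, w_sed, w_out):
    """Advance whole-lake TP mass (kg) by one time step.

    Implements dTP/dt = Wext + Wint - Ws - Wout with a step of one period,
    where each W term is a mass (kg) gained or lost during that period.
    """
    return tp_mass + w_ext + w_int - w_sed - w_out


def simulate_tp_mass(initial_mass, steps):
    """Return the TP mass after each step; `steps` holds (Wext, Wint, Ws, Wout) tuples."""
    masses = [initial_mass]
    for w_ext, w_int, w_sed, w_out in steps:
        masses.append(step_tp_mass(masses[-1], w_ext, w_int, w_sed, w_out))
    return masses


def concentration_ug_per_l(tp_mass_kg, lake_volume_m3):
    """Convert whole-lake TP mass to mean concentration (1 kg/m^3 = 1e6 ug/L)."""
    return tp_mass_kg / lake_volume_m3 * 1.0e6


# Hypothetical loads (kg per two-week step) and lake volume; not GLSM data.
masses = simulate_tp_mass(10000, [(500, 800, 600, 200), (300, 1000, 700, 100)])
concentrations = [concentration_ug_per_l(m, 5.0e7) for m in masses]
print(masses)
```

Predicted concentrations from a loop like this are what would be compared with the observed whole-lake means when adjusting the sedimentation and release rates.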
Tributary TP concentrations were based on samples collected by OEPA during its routine
monitoring. An average of all tributary TP concentrations was used for the ungaged portion of the
basin. The TP concentration in direct precipitation was assumed to be 20 µg/L based on an
average areal loading rate at Lake Erie from 1996 to 2002 (Dolan and McGunagle 2005).
Concentration data for WWTPs were obtained from OEPA where available, and a concentration of
2 mg/L based on an OEPA analysis was assumed otherwise.
Assuming complete mixing, all but one outflow TP concentration was set equal to the whole lake
average TP concentration predicted by the model. The actual measured TP concentration of the
outflow, 210 µg/L, was used in the model for a single, very large storm event. Sedimentation rates
(loss of TP to sediments) and sediment release rates (internal loading) of TP were adjusted in the
model to reflect alum applications and to improve the relationship between predicted and
measured lake TP concentrations (Tetra Tech, Inc. 2013).
Results
The phosphorus mass balance model was used to determine whole-lake mean TP concentrations
based on external loading, internal loading, TP sedimentation, and TP loss through outflows.
Whole-lake mean TP concentrations predicted by the 2012 model were compared to observed
concentrations as collected and analyzed by OEPA. Sedimentation rates were adjusted to fit the
predicted to measured TP concentrations in the lake (Figure CS5-3). With the 2012 model thus
calibrated, results were compared with those from 2010 and 2011 to determine if changes in
internal TP loading had occurred as a result of alum treatments (Tetra Tech, Inc. 2013).
Figure CS5-3. GLSM predicted vs. observed TP concentrations from May 2010 through October
2012 (Adjustments made to internal loading estimates to match predicted November 2011 -
October 2012 values to observed TP concentrations). Plotted series: Predicted (pink, calibrated
2010; green, summer 2011), Predicted (2012), and Observed.
Table CS5-1 shows that gross summer internal TP loading to GLSM declined steadily from 2010 to
2012. The mass-balance modeling showed that average summer internal loading rate decreased
from 4.0 mg/m2 per day before alum treatment to 1.8 mg/m2 per day after the two alum
treatments, even though the combined 2011 and 2012 treatments totaled only 70 percent of the
recommended treatment for the lake (Tetra Tech, Inc. 2013).
Table CS5-1. Comparison of internal TP loading in GLSM (2010-2012)
                                                            2010      2011      2012
Total Gross Summer Internal TP Load (kg)                    26,470    16,487    11,374
Average Summer Internal Loading Rate (SRR) (mg/m2-day)      4.0       2.4       1.8
(Tetra Tech, Inc. 2013)
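As a quick arithmetic check on Table CS5-1, the relative declines between 2010 and 2012 can be computed directly from the tabulated values. The helper name `percent_decrease` is introduced here only for illustration.

```python
# Values copied from Table CS5-1.
internal_load_kg = {2010: 26470, 2011: 16487, 2012: 11374}
loading_rate_mg_m2_day = {2010: 4.0, 2011: 2.4, 2012: 1.8}


def percent_decrease(before, after):
    """Percent decline from `before` to `after`."""
    return 100.0 * (before - after) / before


load_drop = percent_decrease(internal_load_kg[2010], internal_load_kg[2012])
rate_drop = percent_decrease(loading_rate_mg_m2_day[2010], loading_rate_mg_m2_day[2012])
print(f"Load decline 2010-2012: {load_drop:.1f} percent")
print(f"Rate decline 2010-2012: {rate_drop:.1f} percent")
```

Both measures indicate a decline of a little over half between 2010 and 2012.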
References
Dolan, D.M. and K.P. McGunagle. 2005. Lake Erie total phosphorus loading analysis and update:
1996-2002. Journal of Great Lakes Research 31 (Suppl. 2):11-22. Accessed November 2013.
Farnsworth, R.K. and E.S. Thompson. 1982. Mean Monthly, Seasonal, and Annual Pan Evaporation
for the United States. NOAA Technical Report NWS 34. National Oceanic and Atmospheric
Administration, National Weather Service, 82 pp. Accessed November 2013.
Harstine, L.J. 1991. Hydrologic Atlas for Ohio: Average Annual Precipitation, Temperature,
Streamflow, and Water Loss for a 50-Year Period, 1931-1980. Ohio Department of Natural
Resources, Division of Water, Ground Water Resources Section. Water Inventory Report No. 28.
Moorman, J., T. Hone, T. Sudman Jr., T. Dirksen, J. Iles, and K.R. Islam. 2008. Agricultural impacts
on lake and stream water quality in Grand Lake St. Marys, Western Ohio. Water, Air, and Soil
Pollution 193:309-322.
ODNR. 2013. Grand Lake St. Marys State Park. Ohio Department of Natural Resources.
Accessed November 2013.
Perkins, W.W., E.B. Welch, J. Frodge, and T. Hubbard. 1997. A zero degree of freedom total
phosphorus model: Application to Lake Sammamish, Washington. Lake and Reserv. Manage.
13:131-141.
Tetra Tech, Inc. 2013. Preliminary Assessment of Effectiveness of the 2012 Alum Application-
Grand Lake St. Marys. Accessed November 2013.
7.6.3 Estimate Land Treatment Needs
A watershed project must set land treatment goals based on estimates of pollutant reductions needed and
the BMPs available to accomplish those reductions. Where aquatic habitat improvement is needed, the
project's plan must also be based on an assessment of the change in habitat parameters (e.g., water
temperature, cobble embeddedness, flow characteristics as well as pollutant loadings) needed to support
aquatic life. Various approaches to determining land and in-stream treatment needs to restore and protect
aquatic habitat have been documented (e.g., OWEB 1999, Rosgen 1997). Obviously, the BMPs selected
must be those capable of addressing the pollutants and sources identified in the planning process. Setting
goals for the level and extent of BMP implementation is necessary, but is an inexact science, partially
because of the largely voluntary (and hence poorly predictable) nature of land treatment programs, and
partially because it is difficult to predict water quality response to BMP implementation at the watershed
level. See USEPA (2008) for a comprehensive discussion of watershed project planning.
Where local data on BMP performance exist (e.g., a documented 45 percent reduction in suspended
sediment load through a water and sediment control structure or a 25 percent reduction in runoff
phosphorus concentration from fields in conservation tillage), they can be applied to estimate pollutant
reductions anticipated from different levels of implementation. Where locally-validated data do not exist,
there is ample information in published literature (e.g., Simpson and Weammert 2009, USDA-NRCS
2012). Planners should use caution when applying performance data from other studies due to potential
local site differences.
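A simple way to apply such performance data is sketched below. It assumes a single practice, a reduction that scales linearly with the fraction of the source area treated, and no interaction among practices or losses during delivery; the function name and the baseline load are hypothetical, while the 45 percent efficiency is the illustrative sediment figure quoted above.

```python
def estimated_load_reduction(baseline_load, bmp_efficiency, fraction_treated):
    """Estimate the load reduction from one BMP applied to part of a source area.

    baseline_load    : pre-treatment pollutant load from the source area (e.g., t/yr)
    bmp_efficiency   : fractional reduction achieved where the BMP is applied (0 to 1)
    fraction_treated : fraction of the source area (or load) receiving the BMP (0 to 1)
    """
    return baseline_load * bmp_efficiency * fraction_treated


# Hypothetical baseline of 1,000 t/yr of suspended sediment, with the 45 percent
# efficiency applied to 25, 50, and 100 percent of the source area.
for fraction in (0.25, 0.5, 1.0):
    reduction = estimated_load_reduction(1000.0, 0.45, fraction)
    print(f"{fraction:.0%} treated: {reduction:.1f} t/yr reduced")
```

Estimates of this kind are edge-of-field figures and do not account for routing through the watershed.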
It should be noted that published BMP efficiencies do not generally account for interactions in multiple
practice systems or address pollutant transport or delivery processes beyond the edge of field or BMP site
scale. Modeling, e.g., with the Soil and Water Assessment Tool (SWAT), may be a better method for estimating
treatment needs because some models account for routing of BMP effects through a watershed. Simple
pollutant load estimation tools such as USEPA's STEPL (Spreadsheet Tool for Estimating Pollutant
Load) can be used to provide general estimates of load reductions achievable via various BMP
implementation options, but STEPL, for example, addresses a limited set of pollutants and simulates a
limited set of BMPs.
7.6.4 Estimate Minimum Detectable Change
One critical step in watershed project planning is to use the data that have already been collected to
evaluate the Minimum Detectable Change (MDC), the smallest monitored change in a pollutant
concentration or load over a given period of time required to be considered statistically significant.
Understanding of the MDC will assist in planning both land treatment and water quality monitoring
design and will support predictions of project success. See section 3.4.2 for details.
The basic concept in the calculation of MDC is simple: variability in water quality measurements is
examined to estimate the magnitude of changes in water quality needed to detect significant differences
over time. The MDC is a function of pollutant variability, sampling frequency, length of monitoring time,
explanatory variables or covariates (e.g., season, meteorological, and hydrologic variables) used in the
analyses which 'adjust' or 'explain' some of the variability in the measured data, magnitude and structure
of the autocorrelation, and statistical techniques and the significance level used to analyze the data. In
general, MDC decreases with an increase in the number of samples and/or duration of sampling in a
monitoring program.
The MDC for a system can be estimated from data collected within the same system during the planning
or the pre-BMP project phase or from data collected in a similar system, such as an adjacent watershed.
As noted above, MDC is influenced by the statistical trend test selected. For the MDC estimate to be
valid, the required assumptions must be met. Independent and identically distributed residuals are
requirements for both parametric and nonparametric trend tests. Normality is an additional assumption
placed on most parametric trend tests. However, parametric tests for step or linear trends are fairly robust
and therefore do not require 'ideally' normal data to provide valid results.
The standard error on the trend estimate, and therefore, the MDC estimate, will be minimized if the form
of the expected water quality trend is correctly represented in the statistical trend model. For example, if
BMP implementation occurs in a short period of time after a pre-BMP period, a trend model using a step
change would be appropriate. MDC in this case is an extension of the Least Significant Difference (LSD)
concept (Snedecor and Cochran 1989). If the BMPs are implemented over a longer period of time, a
linear or ramp trend would be more appropriate. Calculation of the MDC is discussed in detail in Spooner
et al. (2011a) and illustrated in section 3.4.2. The reader is advised to consult that publication to calculate
and apply the MDC analysis.
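For orientation only, the step-change case can be sketched as an LSD-style calculation. The sketch assumes independent residuals (no autocorrelation adjustment), a mean square error (MSE) estimated from log10-transformed pre-BMP data, and a one-sided Student's t value taken from a table by the user; the function names and input values are hypothetical, and the full procedure in the cited publication should be followed for real analyses.

```python
import math


def mdc_step_change(mse, n_pre, n_post, t_crit):
    """MDC for a step trend in the units of the analyzed (e.g., log10) data.

    mse    : mean square error of the pre-BMP data (or of the trend model residuals)
    n_pre  : number of samples in the pre-BMP period
    n_post : number of samples planned for the post-BMP period
    t_crit : one-sided Student's t value for n_pre + n_post - 2 degrees of freedom
    """
    return t_crit * math.sqrt(mse / n_pre + mse / n_post)


def mdc_percent_from_log10(mdc_log10):
    """Express a log10-scale MDC as a percent decrease in the original units."""
    return (1.0 - 10.0 ** (-mdc_log10)) * 100.0


# Hypothetical inputs: weekly sampling for one year before and after BMPs,
# MSE of 0.04 (log10 units), and t of about 1.66 (one-sided, alpha = 0.05, 102 df).
mdc_log = mdc_step_change(mse=0.04, n_pre=52, n_post=52, t_crit=1.66)
mdc_pct = mdc_percent_from_log10(mdc_log)
print(f"MDC = {mdc_log:.3f} log10 units, or about {mdc_pct:.0f} percent")
```

Rerunning the calculation with larger sample sizes shows how the MDC shrinks as sampling frequency or duration increases.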
MDC provides excellent feedback on whether the planned BMPs (type and location, acres served) will
result in an amount of change in pollutant concentration or loads that can be statistically documented.
Results of the MDC analysis can also be applied to the design of a long-term monitoring program
(e.g., sampling frequency, monitoring duration). Decisions about data analysis such as the use of
covariates to reduce effective variability and thereby reduce MDC can be made, or MDC calculations can
be used to better understand the potential and limitations of an ongoing monitoring effort. Note that the
MDC technique is applicable to water quality monitoring data collected under a range of monitoring
designs including single fixed stations and paired watersheds. MDC analysis can be performed on
datasets that include either pre- and post-implementation data or just limited pre-implementation data that
watershed projects have in the planning phase.
7.6.5 Locate Monitoring Stations
The general location of monitoring stations is described for each monitoring design in section 2.4.
Analysis of pre-project data, in conjunction with monitoring objectives, can provide insight into optimum
location of monitoring stations to be used in watershed project effectiveness evaluation. Section 3.3
provides a discussion on how site characteristics, access, and logistics influence decisions on locating
monitoring stations. Spatial analysis of land use and management data, including understanding of
relationships between land use and management patterns and water quality (see section 7.5.2.3) can be
used to inform monitoring site selection. Inferences on critical source areas (section 7.5.2.4) should also
be used to guide station location. Subwatersheds showing very high and very low NO2+NO3-N
concentrations in Figure 7-19, for example, might be selected for monitoring as treatment and control
watersheds, respectively.
7.7 Data Analysis for Assessing Individual BMP Effectiveness
The availability of BMPs that perform a known water quality function is fundamental to NPS watershed
projects. Many practices have a long history (e.g., buffers, conservation tillage for erosion control,
grassed waterways) and their efficiency in reducing NPS pollutants is well-documented by research,
although highly variable depending on site, management, and other factors. The performance of other
BMPs, such as novel practices or practices not common locally, may not be fully understood. In such
cases, and in cases where specific assurance that BMPs will perform adequately in local circumstances is
required, the effectiveness of individual BMPs may be assessed through monitoring.
Common monitoring designs for assessing BMP effectiveness include:
• Plot studies
• Input/output at the BMP practice scale
• Above/below at the site scale
• Paired watershed at the edge-of-field scale
Data analysis for above/below and paired-watershed BMP monitoring is essentially the same as for these
designs at the watershed project level (see section 7.8). This section will focus on discussion of data
analysis for plot studies and for BMP input/output studies.
7.7.1 Analysis of Plot Study Data
Controlled, replicated plot or field studies are effective for testing specific practices of undocumented
effectiveness or evaluating the effectiveness of a BMP program or system at a farm or watershed scale
(USEPA 1997b). To some extent, plots represent microcosms of an area where a full-scale BMP might be
applied, where inputs, management, and outputs can be controlled and measured to a degree that would
be extremely challenging at full scale. Most importantly, because plots are small (often less than 100 m2),
it is possible to test different levels of treatment and replicate treatments in the same experiment, thus
potentially capturing enough variability to have some statistical confidence in the outcome.
As discussed in section 2.4.2.2, there are a variety of plot study designs, including factorial experiments,
Latin Squares, and complete and incomplete block designs. Approaches to analyzing data from these
various options differ to some degree, but most follow three basic steps:
• Test to see if there are significant differences among the treatments
• Test to find which treatments are significantly different
• Determine the magnitude of differences
Statistical approaches discussed in this section focus on one- and two-factor designs (generally
Randomized Complete Block, RCB). Readers should consult statistics textbooks and other resources for
information on procedures to analyze data from the more complicated designs such as Latin Squares and
incomplete block designs.
Data from simple plot studies are usually analyzed using ANOVA (parametric) or the Kruskal-Wallis test
(nonparametric). These procedures allow the determination of significant differences in group means for
pollutant concentration or load coming from plots. When a plot study is conducted for a single
precipitation/runoff event (either natural or simulated rainfall), the groups tested would be the replicate
plots for each type or level of treatment, plus control plots. For a plot study conducted over a series of
events, the groups tested could be data from replicate groups within individual events or mean
concentration or total load over the entire series of events, depending on the study design. Note that the
ANOVA and Kruskal-Wallis procedures only document that one or more group means differ significantly
from the other groups. To determine which of the group means are significantly different, use a multiple
comparison test such as Tukey's or the Least Significant Difference tests (Snedecor and Cochran 1989,
USEPA 1997b). Applications of the Least Significant Difference and Tukey's tests are illustrated in
section 4.6.1 (pages 4-55 to 4-56) and 4.6.4 (pages 4-63 to 4-64), respectively, of the 1997 guidance
(USEPA 1997b).
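The first of these steps can be sketched with standard formulas for the one-way ANOVA F ratio and the Kruskal-Wallis H statistic. The data are hypothetical log10 concentrations from three treatments with four replicate plots each; the Kruskal-Wallis function assumes no tied values, and the p-value shortcut shown applies only to three groups (chi-square approximation with 2 degrees of freedom), which is rough for samples this small. A multiple comparison test would still be needed to identify which groups differ.

```python
import math
from statistics import mean


def one_way_anova_f(groups):
    """Return (F ratio, df between, df within) for a one-factor ANOVA."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


def kruskal_wallis_h(groups):
    """Return the Kruskal-Wallis H statistic, assuming no tied values."""
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n = len(pooled)
    rank_term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)


# Hypothetical log10 concentrations from replicate plots; not study data.
control = [5.1, 5.4, 5.0, 5.3]
treatment_a = [4.2, 4.5, 4.1, 4.4]
treatment_b = [3.0, 3.3, 2.9, 3.2]
groups = [control, treatment_a, treatment_b]

f_ratio, df_between, df_within = one_way_anova_f(groups)
h_stat = kruskal_wallis_h(groups)
p_kw = math.exp(-h_stat / 2.0)  # chi-square upper tail for 2 df (three groups only)
print(f"F({df_between}, {df_within}) = {f_ratio:.1f}")
print(f"H = {h_stat:.3f}, approximate p = {p_kw:.4f}")
```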
The ANOVA procedure can also be used where there is more than one factor or explanatory variable
(e.g., plot, slope), whereas the Kruskal-Wallis test handles only one factor. The Friedman nonparametric
test is recommended for more than one factor. Application of these tests is described and illustrated in
section 4.6 (pages 4-52 to 4-64) of the 1997 guidance (USEPA 1997b).
One-factor comparisons using ANOVA assume random samples, independent observations, and normal
distributions for each group, as well as the same variance across groups. Group sample sizes can differ,
however. An illustrative example application of the Kruskal-Wallis test for one-factor comparisons is
included in the 1997 guidance (USEPA 1997b), pages 4-56 to 4-58.
Two-factor comparisons using ANOVA depend on whether the factors interact. An example of an
interaction is the relationship between crop yield and precipitation, both of which can independently
influence soil nitrate levels; greater yields remove more nitrate from the soil profile and greater
precipitation moves more nitrate through the soil profile. Yield, however, is also influenced by
precipitation (e.g., drought or excessively wet soil conditions), so there is an interaction between the two
factors. The plot study analysis from Vermont (see Example 7.7-1) illustrates consideration of
interactions.
Both the scope of inferences that can be made and the F statistic calculation differ for fixed effect models
(e.g., rainfall simulation studies in which rainfall rates are not randomly selected) versus models using
randomly selected or combinations of randomly selected and fixed factors. Readers are referred to
section 4.6.2 (pages 4-58 to 4-61) of the 1997 guidance (USEPA 1997b) for an illustrative example and a
discussion of these and other important considerations when applying ANOVA to two-factor
comparisons. If the data are log-transformed prior to ANOVA, the treatment effects are then interpreted
as multiplicative (rather than additive) in the original units. An alternative approach is to rank-transform
the data prior to ANOVA, resulting in a comparison of the medians of the data in the original units (see
page 4-61 of the 1997 guidance for details).
Once a statistically significant difference has been demonstrated and the different group means have been
identified, it is possible to explore the magnitude of such differences. Methods for two random samples,
two paired samples, or a single sample versus a reference (e.g., criterion for a WQS) are described in
section 4.5.3 (pages 4-51 to 4-52) of the 1997 guidance. It is important to take the extra step of
determining confidence intervals for difference estimates.
In addition to using statistical tests to document differences among treatment groups, plot data can be
evaluated by direct comparison of event mean concentration (EMC) or event load (or areal load) among
treatments. For plot studies evaluating practice performance over a series of events, a cumulative export
plot (where the sum of cumulative mean export from each group is plotted sequentially over the study)
will illustrate the behavior of treatment groups in different events. It must be cautioned that data and
quantitative inferences about practice performance from plots are usually very difficult to extrapolate to
field or watershed scale because physical processes like runoff velocity are not well-represented in very
small areas.
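The values for a cumulative export plot are running sums of the mean export from each treatment group, event by event. A minimal sketch follows; the event exports and variable names are hypothetical.

```python
from itertools import accumulate

# Hypothetical mean export (kg/ha) per event for each treatment group.
event_export = {
    "control": [1.2, 0.8, 2.0, 0.5],
    "treated": [0.6, 0.5, 1.1, 0.4],
}

# Running total of export after each successive event.
cumulative_export = {group: list(accumulate(exports))
                     for group, exports in event_export.items()}

for group, totals in cumulative_export.items():
    print(group, [round(total, 2) for total in totals])
```

Plotting each list against event number gives the cumulative export curves; events in which the curves diverge or converge show where treatment performance changed.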
Example 7.7-1. Plot Study Analysis: Bacteria Runoff from Manure Application in Vermont
Objective
Evaluate several practical methods for controlling E. coli in runoff from manure application sites.
Specific objectives included: (1) determine the effect of manure storage time on E. coli losses in
runoff from hay and corn land receiving liquid dairy manure; (2) determine the effect of manure
incorporation on E. coli losses from corn land receiving manure; (3) determine the effect of
vegetation height on E. coli losses in runoff from hay land; and (4) determine the effect of delay
between manure application and rainfall on E. coli losses in runoff from hay land and corn land.
Monitoring Design
Two runoff experiments were conducted at separate hay land and corn land sites. For each
experiment, 40 plots (1.5 by 3 m each) were created, representing a factorial design of 3 replicates
for each treatment combination of 3 manure ages, 2 vegetation heights (for hay) or
incorporated/unincorporated (for corn), and 2 delay-to-rain durations, resulting in 3 x 3 x 2 x 2 = 36
treated plots, plus three control plots (no manure applied) and one extra plot reserved as a backup.
Specific treatments were assigned to plots randomly. A rainfall simulator was used to generate
runoff from the test plots by continuously and uniformly applying water at an intensity resembling
natural rainfall. For each experiment, the first hour or first 19 L of runoff was collected from each
plot.
ANOVA table for hay land runoff experiment. Results show significant manure age, delay to rain,
and two interactions.

Analysis of Variance
Source    df    Sum of Squares    Mean of Squares    F Ratio    P
Model      7    34.8385           4.9769             37.135     <0.001
Error     26    3.4846            0.1340
Total     33    38.32311

Effects Tests
Source                               df    Sum of Squares    F Ratio    P
Manure Age                            2    31.1188           116.096    <0.001
Vegetation Height                     1    0.0673            0.502      0.485
Delay to Rain                         1    0.602             4.494      0.044
Manure Age x Vegetation Height        2    0.7427            2.771      0.081
Vegetation Height x Delay to Rain     1    1.2076            9.011      0.006
Levels of E. coli in hay land plot runoff by two treatment factors: (A) Manure Age (0-day, 30-day,
90-day) and (B) Delay (1-day, 3-day). Error bars represent +1 standard deviation; bars labeled with
different letter(s) differ significantly (P < 0.1).
Data Analysis
Statistical analysis of E. coli data was conducted on log10 transformed data to satisfy the
assumptions of normality and equal variances. All statistical tests were performed using JMP
software at an α of 0.1. The effect of treatment on levels of E. coli in runoff was evaluated by
multi-factor analysis of variance (ANOVA). After an initial pass that included all treatment factors
and all possible interactions, nonsignificant (P > 0.1) interactions were removed from the model
and a final reduced-model ANOVA was conducted. Interpretations of treatment effects were based
on the reduced model.
Source: Meals and Braun 2006
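The F ratios in the Effects Tests portion of the ANOVA table for this example can be reproduced from the tabulated sums of squares and degrees of freedom, since each F ratio is the effect mean square divided by the error mean square. The sketch below uses only values from the table; small differences from the published ratios reflect rounding of the tabulated sums of squares.

```python
# Error term from the Analysis of Variance table: SS = 3.4846 with 26 df.
ms_error = 3.4846 / 26

# (sum of squares, df, published F ratio) for each effect in the table.
effects = {
    "Manure Age": (31.1188, 2, 116.096),
    "Vegetation Height": (0.0673, 1, 0.502),
    "Delay to Rain": (0.602, 1, 4.494),
    "Manure Age x Vegetation Height": (0.7427, 2, 2.771),
    "Vegetation Height x Delay to Rain": (1.2076, 1, 9.011),
}

# F ratio = (effect SS / effect df) / error mean square
f_ratios = {name: (ss / df) / ms_error for name, (ss, df, _) in effects.items()}

for name, (_, _, published) in effects.items():
    print(f"{name}: computed F = {f_ratios[name]:.3f}, published F = {published}")
```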
7.7.2 Analysis of BMP Input/Output Data
For some BMPs, such as agricultural water and sediment control basins or stormwater treatment devices,
it is possible to assess practice effectiveness by directly monitoring input and output pollutant
concentration and load. In either an agricultural or an urban setting, inflow and outflow variables such as
flow volume, peak flow, EMC, or pollutant loads, are measured and the effectiveness of the BMP is
calculated by comparing input vs. output.
Paired input and output data can be compared by testing for significant differences in group means using
the parametric paired Student's t or the nonparametric Wilcoxon Signed Rank test. Comparison of random
observations from two samples (e.g., input and output from a large constructed wetland for which it is not
possible to collect paired samples due to uncertain or variable flow pathways or time of travel) can also be
made with a t-test if equal variance is confirmed (e.g., F test); the Mann-Whitney test is the nonparametric
alternative in this case. These tests are described and illustrated in detail in chapter 4 (pages 4-34 to 4-52)
of the 1997 guidance (USEPA 1997b).
Once a statistically significant difference is confirmed, BMP efficiency can be reported in a number of
ways, including:
• Efficiency ratio (percent reduction in flow, EMC, or load)
• Summation of loads (percent reduction in sum of all monitored loads)
• Regression of loads (reduction efficiency is expressed as the slope of a regression line for input
load vs. output load)
• Efficiency of individual storm load reductions across all monitored events
• Percent removal relative to a water quality criterion
All of these methods are described and illustrated by Geosyntec and WWE (2009). It is recommended
that more than one method is used wherever possible because the results may differ. For example, results
from the summation of loads and efficiency ratio (e.g., EMC) methods may not agree because of
differences in how the water budgets are represented (Erickson et al. 2010b).
The EMC is the total event load divided by the total runoff volume. It should be noted that, for large
practices such as some constructed wetlands, the influent EMC (EMCi) must be adjusted to account for
rain that falls directly onto the practice (Erickson et al. 2010a). Long-term performance can be determined
by calculating the average EMCs (AvgEMC) for both influent (input or AvgEMCi) and effluent (output
or AvgEMCo) and using these values to calculate the percent reduction in concentration (Erickson et al.
2010b). The simple equation becomes:
$$\text{Long-Term Efficiency} = 100 \times \left( \frac{\text{AvgEMC}_i - \text{AvgEMC}_o}{\text{AvgEMC}_i} \right)$$
An alternative approach that can add statistical power is to pair the input and output EMCs for each storm
and calculate the average of the differences as an estimate of pollutant reduction efficiency. A paired
t-test can then be used to determine both the statistical significance of and confidence interval for the
reduction. See section 4.2.1 (pages 4-11 to 4-14) of the 1997 guidance (USEPA 1997b) for additional
information and an illustrative example of EMC calculations.
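Both calculations can be sketched in a few lines. The EMC values and function names below are hypothetical, and the critical t value (2.776, two-sided 95 percent, 4 degrees of freedom) is taken from a standard t table rather than computed.

```python
import math
from statistics import mean, stdev


def long_term_efficiency(emc_in, emc_out):
    """Percent reduction based on average influent and effluent EMCs."""
    avg_in, avg_out = mean(emc_in), mean(emc_out)
    return 100.0 * (avg_in - avg_out) / avg_in


def paired_t(emc_in, emc_out):
    """Return (mean difference, standard error, t statistic, df) for paired EMCs."""
    diffs = [i - o for i, o in zip(emc_in, emc_out)]
    n = len(diffs)
    d_bar = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(n)
    return d_bar, std_err, d_bar / std_err, n - 1


# Hypothetical influent and effluent EMCs (mg/L) for five paired storms.
emc_in = [0.40, 0.55, 0.30, 0.62, 0.48]
emc_out = [0.22, 0.30, 0.21, 0.35, 0.27]

efficiency = long_term_efficiency(emc_in, emc_out)
d_bar, std_err, t_stat, df = paired_t(emc_in, emc_out)

t_crit = 2.776  # two-sided 95 percent, 4 df, from a t table
ci_low, ci_high = d_bar - t_crit * std_err, d_bar + t_crit * std_err
print(f"Long-term efficiency: {efficiency:.1f} percent")
print(f"Mean paired reduction: {d_bar:.2f} mg/L (t = {t_stat:.2f}, df = {df})")
print(f"95 percent CI: {ci_low:.3f} to {ci_high:.3f} mg/L")
```

A confidence interval that excludes zero indicates a statistically significant reduction at the chosen level.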
The percent reduction in the sum of all monitored loads is calculated using the summed loads for both the
input (Li) and output (Lo):
$$\text{Percent Reduction} = 100 \times \left( \frac{L_i - L_o}{L_i} \right)$$
Similar to the alternative proposed for EMCs, the average differences between paired input and output
loads can also be used as an estimate of pollutant reduction efficiency.
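The summation-of-loads calculation is sketched below alongside the average of individual storm efficiencies, to show how the two measures can differ for the same data. The event loads and function names are hypothetical.

```python
def percent_reduction_sum_of_loads(loads_in, loads_out):
    """Percent reduction based on summed input and output loads."""
    total_in, total_out = sum(loads_in), sum(loads_out)
    return 100.0 * (total_in - total_out) / total_in


def mean_storm_efficiency(loads_in, loads_out):
    """Average of the percent reductions calculated storm by storm."""
    per_storm = [100.0 * (li - lo) / li for li, lo in zip(loads_in, loads_out)]
    return sum(per_storm) / len(per_storm)


# Hypothetical event loads (kg) for four monitored storms.
loads_in = [0.10, 0.25, 0.40, 0.05]
loads_out = [0.06, 0.12, 0.18, 0.04]

by_sum = percent_reduction_sum_of_loads(loads_in, loads_out)
by_storm = mean_storm_efficiency(loads_in, loads_out)
print(f"Summation of loads: {by_sum:.1f} percent")
print(f"Mean of individual storm efficiencies: {by_storm:.2f} percent")
```

The summation method weights large storms more heavily, whereas the storm-by-storm average gives every event equal weight.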
Erickson et al. (2010b) illustrate a method for determining the uncertainty of long-term performance
estimates that are based on either the EMC or summation of load method they describe. Required input is
the number of storm events, the standard deviation of the performance data, and a Student's t value.
Using data from Erickson et al. (2010b), Figure 7-21 illustrates regression of effluent against influent
event loads. It should be noted that in this example the y-intercept was not constrained to the origin as
recommended (see footnote 9) by Geosyntec and WWE (2009). The slope of the line indicates that effluent
load is 37 percent of influent load above the baseline level (y intercept) of 0.01 kg TP. In other
words, the BMP reduces the load by 63 percent (100-37), a number that agrees well with the 57.5 percent
removal rate calculated by summation of loads (Erickson et al. 2010b). Regression analysis is illustrated
and described at CADDIS Volume 4: Data Analysis.
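The regression-of-loads calculation can be sketched with ordinary least squares, both with a free intercept and with the line constrained through the origin. The paired loads below are hypothetical (they are not the Erickson et al. data) and the function names are illustrative; the final line simply converts the published slope of 0.3731 to a percent reduction.

```python
def ols_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slope_through_origin(x, y):
    """Least-squares slope for y = slope * x (intercept constrained to zero)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def efficiency_from_slope(slope):
    """Percent load reduction implied by a regression slope of output on input."""
    return 100.0 * (1.0 - slope)


# Hypothetical influent and effluent event loads (kg).
influent = [0.1, 0.2, 0.3, 0.5]
effluent = [0.05, 0.09, 0.13, 0.21]

slope, intercept = ols_line(influent, effluent)
slope_zero = slope_through_origin(influent, effluent)
print(f"Free intercept: slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"Through origin: slope = {slope_zero:.3f}")
print(f"Reduction implied by published slope: {efficiency_from_slope(0.3731):.0f} percent")
```

With these data the constrained fit gives a steeper slope, and therefore a lower apparent efficiency, than the unconstrained fit, which illustrates why the choice of constraint matters.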
Figure 7-21. Regression of output versus input load (data from Erickson et al. 2010b). Plot title:
Regression of Effluent vs. Influent Load; x-axis: Influent Total P Load (kg); fitted line:
y = 0.3731x + 0.0109, R2 = 0.6012.
9 While specified in the definition of the regression of loads method, Geosyntec and WWE (2009) includes a
comment suggesting that such a constraint "is questionable and in some cases could significantly misrepresent the
data."
BMP efficiency evaluated by input/output
monitoring is frequently reported as simply
percent removal of a pollutant. In most cases,
this is an inadequate basis for assessing BMP
performance. Percent removal is primarily a
function of input quality, and BMPs with a
high apparent removal percentage may still
have unacceptably high concentrations or
loads in their output. Some BMPs with long
retention times (e.g., constructed wetlands)
show long-term performance that is not
evident in comparing paired input-output
samples because material from one event is
not discharged until a subsequent event (i.e.,
the samples are not paired or matched).
Finally, a simple percent removal calculation
can be dominated by outliers that distort an
average performance indicator.
For these and other reasons, USEPA and
ASCE have recommended the Effluent
Probability Method for evaluating
input/output data from a BMP (Geosyntec and
WWE 2009). In this procedure, a statistically
significant difference between input and
output EMC or load is verified (e.g., by
Student's t Test). Then, a normal probability plot is constructed of input and output data that allows
comparison of BMP performance over the full range of monitored conditions. For example, Figure 7-22
shows an effluent probability plot for chemical oxygen demand (COD) from an urban wet detention pond
evaluation. The plot shows that COD was poorly removed at low concentrations (<20 mg/L), but that
removal increased substantially for higher concentrations.
The Effluent Probability Method is essentially a cumulative distribution function for the EMCs of the
inflows and outflows. The cumulative distribution function depicts the probability of values being below
a given EMC value or the EMC values that a percentage (e.g., 50 percent) of the data falls above.
The magnitude of the difference in EMC (or loads) from the inflow and outflows can be examined across
the range of EMC values. The Kolmogorov-Smirnov test is based on cumulative distribution functions
and can be used to determine if the two empirical distributions are significantly different (Snedecor and
Cochran 1989).
Constructing an Effluent Probability Plot
The cumulative distribution function for the EMCs
for the outflows and inflows can be created from
the following steps:
• Calculate the EMC for each storm's outflows.
• Rank all EMCs for all storms from smallest to
largest.
• Assign a 0 to 1 'probability' to the data based
upon their ranked order. For example, if 10
storms were monitored, the ranked values
would receive a 'probability ranking' value of
0.1, 0.2, ... 1.0 for the lowest to highest EMC
values.
• Plot the 'probability ranking' values on the
Y-scale and the EMCs on the X-scale. The
Y-scale should be plotted on a probability
scale. Alternatively, the Y-axis could be
expressed as the number of standard
deviations (e.g., +/- 3). Because the EMCs
are likely to follow a log-normal distribution,
the X-axis should be a log scale.
• Repeat the procedure for the inflows and plot
on the same graph.
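The ranking steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `effluent_probability_points` and the EMC values are hypothetical, and the simple rank/n rule is taken directly from the 10-storm example in the steps (other plotting-position formulas exist and would be needed before converting probabilities to standard deviations, since rank/n assigns 1.0 to the largest value).

```python
import math


def effluent_probability_points(emcs):
    """Rank EMCs from smallest to largest and pair each with rank/n.

    Returns a list of (emc, log10_emc, probability) tuples. The log10 value
    is included because the X-axis is expected to be a log scale.
    """
    ranked = sorted(emcs)
    n = len(ranked)
    return [(emc, math.log10(emc), (rank + 1) / n)
            for rank, emc in enumerate(ranked)]


# Hypothetical EMCs (mg/L) for 10 storms; illustration only, not source data.
inflow_emcs = [120.0, 45.0, 300.0, 80.0, 15.0, 60.0, 210.0, 33.0, 95.0, 150.0]
outflow_emcs = [40.0, 20.0, 55.0, 30.0, 12.0, 25.0, 48.0, 18.0, 35.0, 44.0]

inflow_points = effluent_probability_points(inflow_emcs)
outflow_points = effluent_probability_points(outflow_emcs)

for emc, log_emc, prob in outflow_points:
    print(f"outflow EMC={emc:6.1f} mg/L  log10={log_emc:5.2f}  probability={prob:.1f}")
```

Plotting both lists on the same axes (probability on Y, EMC on a log X-axis) gives the paired inflow and outflow curves that the method compares.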
[Figure 7-22 image: "Effluent Probability Plot"; Y-axis: probability (20% to 100%); X-axis: TSS (mg/L), log scale]
Figure 7-22. Effluent probability plot for input/output monitoring of a wet detention pond
Percent removal relative to a water quality criterion provides an indication of how well a BMP is
performing compared to limits or expectations established for the local waterbody. Use of this method is
recommended for specific event analysis, but not for a series of events (Geosyntec and WWE 2009).
Calculation requires values for the criterion (Cc), input (Ci), and output (Co), all expressed in the same
units (concentration in this case):
$$\text{Percent Removal vs Criterion} = 100 \times \frac{C_i - C_o}{C_i - C_c}$$
For example, in a watershed with a target total N concentration of 0.75 mg/L, storm inlet and outlet
concentrations of 3.6 mg/L N and 1.6 mg/L N, respectively, would yield a relative percent removal of
70 percent.
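The arithmetic in this example can be checked with a short Python sketch. The function name `percent_removal_vs_criterion` is hypothetical, and the guard against an input concentration at or below the criterion is an added assumption (the ratio is undefined or misleading in that case); neither comes from the source.

```python
def percent_removal_vs_criterion(c_in, c_out, c_criterion):
    """Percent removal relative to a criterion: 100 * (Ci - Co) / (Ci - Cc).

    All three concentrations must be in the same units.
    """
    if c_in <= c_criterion:
        # Assumption: treat this case as not meaningful rather than
        # returning a negative or infinite percentage.
        raise ValueError("input concentration must exceed the criterion")
    return 100.0 * (c_in - c_out) / (c_in - c_criterion)


# Values from the total N example: target 0.75, inlet 3.6, outlet 1.6 (mg/L)
result = percent_removal_vs_criterion(c_in=3.6, c_out=1.6, c_criterion=0.75)
print(f"{result:.0f} percent")
```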
The reader is referred to Urban Stormwater BMP Performance Monitoring (Geosyntec and WWE 2009)
for additional information on evaluating urban storm water BMP performance through monitoring.
7.7.3 Analysis of BMP Above/Below Data
As noted earlier, BMP performance can be assessed using an above/below-before/after monitoring design,
as long as the added area monitored by the downstream station is either entirely or predominantly
influenced by the BMP. In such cases, analysis of monitoring data is done by the same approach as
described in section 7.8.2.2. An example of this kind of above/below-before/after analysis of a single
BMP can be found in the Otter Creek (WI) NNPSMP project, which assessed the effects of barnyard runoff
control (see Example 7.7-2). This example illustrates application of the Hodges-Lehmann estimator
described in section 4.5.3 of the 1997 guidance (USEPA 1997b).
Example 7.7-2. Above/Below-Before/After Analysis: Barnyard Runoff BMPs in Wisconsin
Monitoring Design
Sampling stations upstream and downstream of two investigated dairy barnyards were established in
1994/1995. At the upstream sampling stations, stream stage and precipitation were continuously
monitored, and discrete water samples were collected automatically; at the downstream stations, only
water quality samples were collected. Over the course of the study, 11-15 storm runoff periods were
sampled at each of the sites. Continuous streamflow and instantaneous concentration data were used to
estimate pollutant loads for individual storm-runoff periods.
Pre-BMP Analysis
A critical aspect of obtaining useful conclusions for this study was the ability to document that
downstream loads were significantly greater than upstream loads before the BMP systems were
implemented. Results of t-Tests showed that, for the pre-BMP period at both creeks, downstream loads
of total P, ammonia, BOD, and fecal coliform bacteria were significantly greater than upstream loads. At
Otter Creek, pre-BMP downstream loads of total suspended solids also were significantly greater than
those upstream. These significant differences indicated that each barnyard was an important contributor
to the instream pollutant loads for the storm-runoff periods monitored.
Effects of Treatment
The difference between upstream and downstream constituent loads was computed for each pre- and
post-BMP storm-runoff period. These differences were considered to be the load contributed by each
barnyard. The bar graphs indicate that both barnyard BMP systems have reduced loads in the stream for
each constituent. Each bar represents the median of all the differences between upstream and
downstream constituent loads for both pre- and post-BMP storm-runoff periods. Although these medians
could have been used to determine the percentage reduction achieved by each barnyard BMP system, it
was decided that use of the Hodges-Lehmann estimator would be a more accurate approach (Helsel and
Hirsch 2002). The Hodges-Lehmann
estimator is the median of all possible
pairwise differences between pre- and
post-BMP barnyard loads. This median
difference was then divided by the pre-
BMP median barnyard load for each
constituent. The result was a percentage
load reduction for each constituent.
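The calculation described in this paragraph can be sketched as follows. This is a minimal illustration, not the study's own code: the function names and the load values are hypothetical, and the barnyard loads are assumed to have already been computed as downstream minus upstream for each storm-runoff period.

```python
from statistics import median


def hodges_lehmann_difference(pre_loads, post_loads):
    """Median of all possible pairwise (pre - post) differences."""
    return median(pre - post for pre in pre_loads for post in post_loads)


def percent_load_reduction(pre_loads, post_loads):
    """Hodges-Lehmann difference divided by the pre-BMP median load, in percent."""
    return 100.0 * hodges_lehmann_difference(pre_loads, post_loads) / median(pre_loads)


# Hypothetical barnyard loads (downstream minus upstream) per storm period.
pre_bmp = [10.0, 12.0, 14.0]
post_bmp = [2.0, 3.0, 4.0]

print(hodges_lehmann_difference(pre_bmp, post_bmp))  # median of 9 pairwise differences
print(percent_load_reduction(pre_bmp, post_bmp))
```

Because every pre-BMP period is compared with every post-BMP period, the estimator does not require equal numbers of storms in the two periods.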
The barnyard BMP system at Otter
Creek reduced loads of total suspended
solids by 85 percent, total P by 85
percent, ammonia by 94 percent, BOD
by 83 percent, and microbial loads of
fecal coliform bacteria by 81 percent; the
respective loads at Halfway Prairie
Creek have been reduced by 47, 87, 95,
92, and 9 percent.
[Figure: bar graphs of median pre- and post-BMP barnyard loads by constituent, OTTER CREEK]
* Percentage reduction is computed by dividing the Hodges-Lehmann estimator for pre- and post-
BMP barnyard loads by the pre-BMP median barnyard load.
** Fecal coliform microbial load in 10^11 colonies.
Source: Stuntebeck and Bannerman 1998
7.7.4 Analysis of BMP Paired-Watershed Data
Some BMPs - especially agricultural BMPs that involve treatment of an entire field such as conservation
tillage, cover crops, or nutrient management - can be evaluated using a paired-watershed design. In this
case, monitoring takes place at the edge of field-sized watersheds, wherein one entire monitored field is
designated to receive the BMP treatment. Automated samplers are required to collect storm event runoff.
In the paired-watershed design, monitoring occurs during a calibration period in which both fields or
subwatersheds have identical management. Then, after their pollutant responses to the same rainfall
events are correlated, a treatment period occurs in which one of the subwatersheds receives the BMP
treatment and the other remains under 'control' management. Analysis of covariance (ANCOVA) is
used to analyze the monitoring data from this type of study. See section 7.8.2.1 for details.
7.8 Data Analysis for Assessing Project Effectiveness
7.8.1 Recommended Watershed Monitoring Designs
Assessing the effectiveness of a watershed project where multiple BMPs are implemented in a land
treatment program across a broad watershed area is a complex task with many sources of variability and
uncertainty. Attributing changes in water quality documented through monitoring to land treatment, rather
than to other causes such as drought or extreme weather, is another significant challenge. Monitoring
designs (see chapter 2) recommended for assessing watershed project effectiveness are:
• Paired-watershed (link to section 2.4.2.3)
• Above/below-before/after (link to section 2.4.2.6)
• Nested-watershed (link to 2.4.2.3)
• Single watershed trend (link to section 2.4.2.5)
While not generally recommended because of cost and logistical constraints (see section 2.4.2.8), data
analysis for multiple-watershed studies is also discussed here. These designs vary in their ability to
evaluate watershed project effectiveness while controlling for sources of change other than land
treatment; the designs also vary in the appropriate approach to data analysis. The paired-watershed design
is generally considered to be the best design for this purpose because it strives for a controlled experiment
to evaluate BMP effectiveness at a watershed scale, accounting for year-to-year variability in weather and
streamflow through the use of a control watershed. Several common watershed project designs are
excluded from the above list because they are not generally capable of reliably documenting water quality
change and attributing the change to land treatment. Single watershed before/after and side-by-side
watersheds, for example, cannot be recommended for watershed project effectiveness monitoring because
they cannot be used directly to separate the effects of the BMPs from those of climate or watershed
differences (e.g., soils, slope, land management) which may be the actual causes of the observed
differences (see section 3.4). The single watershed before/after design can, however, be useful in
comparing pollutant loads over time to determine if TMDL goals have been achieved (see section 7.9).
None of these designs will perform effectively, however, if all the requirements of the design are not met. In
some cases, failure to meet a single criterion (e.g., unexpected treatment in the control watershed of a paired
design, or changing analytical procedures during a long-term single-station study) may doom the effort.
Each of these designs is discussed in chapter 2; information relevant to data analysis procedures is
provided in this section.
7.8.2 Recommended Statistical Approaches
The following sections recommend statistical approaches to analysis of data from recommended
watershed monitoring designs. Additional details on specific statistical tests can be found in chapter 4
(Data Analysis) of the 1997 guidance (USEPA 1997b).
Additional Information on ANCOVA
• USEPA. 1997b. Monitoring Guidance for
Determining the Effectiveness of
Nonpoint Source Controls Chapter 4;
• Clausen and Spooner. 1993. Paired
Watershed Study Design. 841-F-93-009;
• Grabow et al. 1999. Detecting Water
Quality Changes Before and After BMP
Implementation: Use of SAS for
Statistical Analysis; and
• Grabow et al. 1998. Detecting Water
Quality Changes Before and After BMP
Implementation: Use of a Spreadsheet
for Statistical Analysis of Paired
Watershed, Upstream/Downstream and
Before/After Monitoring Designs.
7.8.2.1 Paired Watershed
As described in chapter 2, the most effective
practical design for evaluating watershed-level
BMP effectiveness through monitoring is the
paired-watershed design due to the presence of an
experimental control for year-to-year hydrologic
variability (Clausen and Spooner 1993). The
paired-watershed design has been discussed in
section 2.4.2.3. The basic design involves two
watersheds (a control, where no BMPs are to be
implemented, and a treatment watershed where
land treatment will be applied) and two periods (a
pre-treatment or calibration period, and a treatment
period). Analysis of paired data (i.e., frequently
collected chemical or physical data) from treatment
vs. control areas should show a statistically
significant correlation and result in a strong linear
regression model (usually using log-transformed
data) that changes from the pre-treatment to post-
treatment period. In the case of biological monitoring (e.g., sampling twice per year), relationships
between treatment and control watersheds should change in a more qualitative manner from pre- to post-
treatment periods. For example, treatment and control watersheds may both be of "poor" quality in the
pre-treatment (or pre-BMP) period, whereas the treatment watershed improves to "good" quality while
the control watershed remains at "poor" quality during the post-treatment period. Additional
considerations for paired-watershed designs with more than one treatment watershed are discussed at the
end of this section.
See section 4.8 of the 1997 guidance (USEPA 1997b) for details and an example including a method for
determining if enough calibration data has been collected to warrant advancing to the BMP treatment
period. Failure to establish a statistically valid pre-treatment correlation will doom the evaluation design.
7.8.2.1.1 Analysis of Covariance (ANCOVA) Procedure - Paired-Watershed Analysis
The Analysis of Covariance (ANCOVA) procedure is used to analyze data from a paired-watershed study
(Clausen and Spooner 1993, Wilm 1949, Clifford et al. 1986, Meals 2001). ANCOVA combines the
features of ANOVA with regression (Snedecor and Cochran 1989) and is an appropriate statistical
technique to use in analysis of watershed designs that compare pre- and post-BMP periods using
treatment and control watershed measurements. When applied to the analysis of paired-watershed data,
ANCOVA is used both (a) to compare pre- and post-BMP regression equations between water quality
measurement values (e.g., sediment concentration) for the treatment and control watersheds and (b) to test
for differences in the average value (e.g., of sediment concentration) for the treatment watershed between
the two time periods after adjusting measured values for covariates such as flow. Covariates are added to
the analysis to decrease the residual error and give a more precise comparison between covariate-adjusted
mean values.
There are three basic steps to performing ANCOVA:
1. Obtain paired observations
2. Select the proper form of linear model
3. Calculate the adjusted means (LS-means) and their confidence intervals
Paired observations could represent observations collected on the same date, the same time period for
composite samples, or from the same storm event. Weekly flow-weighted composite samples taken at the
outlet of both control and study watersheds would satisfy this requirement.
The second step is to select the proper form of the model. There are two basic statistical models here for
paired-watershed studies:
• The change in treatment watershed concentration with change in control watershed concentration
(i.e., the slope of the linear relationship between paired samples) remains constant through both the
calibration and treatment periods.
• The slope of the relationship changes from calibration to treatment period.
ANCOVA for paired-watershed studies is illustrated by Figure 7-23 where pollutant concentration (or
load) pairs are plotted with the treatment basin values on the Y-axis and the control basin values on the
X-axis. The slopes of the pollutant concentrations plotted for both periods are tested to determine if they
are significantly different (see B in Figure 7-23) or if the same slope can be assumed (see A in Figure 7-
23). A change in slope and/or mean value indicates that pollutant concentrations for the treatment
watershed exhibited different patterns, or magnitude, after BMPs were applied as compared to the
calibration period. For example, in both A and B of Figure 7-23 the same concentration in the control
watershed corresponds to a lower concentration in the treatment watershed in the post- (treatment) versus
the pre-BMP (calibration) period, indicating beneficial effects from the BMPs. In the case of B, both the
mean and the slope are reduced in the treatment period. The adjusted mean concentrations (LS-means) for
the calibration and treatment periods are also compared for differences as described above under
"ANCOVA Procedure."
The best statistical model for a particular dataset is determined with a test for homogeneity of slopes
(i.e., same or different slopes) using the 'full analysis of covariance model' that allows for separate
regression lines (i.e., different slopes and intercepts, Figure 7-23B) for the calibration and treatment
periods (i.e., the groups) for the regression of the treatment watershed variable (Y) on the control
watershed variable (X):
$$Y_{ij} = b_{0i} + b_{1i} X_{ij} + e_{ij}, \quad i = 1, \ldots, k$$ ("Full statistical model" for different slopes)
Where:
$Y_{ij}$ = the jth observation for Y in period i (e.g., pollutant concentration or load from treatment
watershed)
$b_{0i}$ = the intercept ($b_0$) for period i
$b_{1i}$ = the regression coefficient ($b_1$) of Y on X for period i
$X_{ij}$ = the jth observation for X in period i (e.g., pollutant concentration or load from control
watershed paired with same sample time as $Y_{ij}$)
k = number of time periods (with 'calibration' and 'treatment' periods, k=2)
$e_{ij}$ = the residuals or experimental error for the jth observation for Y in period i. Note: if the data
are weekly, biweekly, or monthly, this error series is likely autocorrelated with an
autoregressive, lag 1 (AR(1)) structure and is depicted as $V_{ij}$ or $V_t$. A statistical model that allows
for this autocorrelated error structure should be used (e.g., PROC AUTOREG in SAS
software (SAS Institute 2016d)), or a correction should be applied to the standard error on the test of
LS-means (see section 7.3.6).
The F-Test for the homogeneity of slopes is used to see if the best model requires separate slopes for each
period or the same (pooled) slope (Clausen and Spooner 1993). The best model will have the lowest
residual sum of squares (SSE). The F-statistic for testing the homogeneity of slopes is:
$$F = \frac{(SSE_R - SSE_F)/(k-1)}{MSE_F}$$
Where:
SSER = Residual sum of squares for the reduced model with a common (pooled) slope (see
below)
SSEF = Residual sum of squares for the full model which allows for separate slopes for the
calibration and treatment periods
k = number of groups (calibration + treatment periods = 2 in this case)
MSEF = Mean square error from the full model
This F-statistic is compared to an F distribution with (k-1) and (N-2k) degrees of freedom (d.f.), where k
is the number of groups and N is the total sample size (i.e., the total number of paired samples used in the
analysis). See Example 7.8-1 below for examples of how to test if the slopes are different using an
'interaction' term in the statistical software programs.
If there is no evidence for separate slopes, then a "reduced model" with the same slopes assumed for each
group (based on pooled data) should be used (see Figure 7-23 A). If the interaction term is significant,
then the "full model" is the correct model and the significance of the difference between all possible pairs
can be obtained (see Figure 7-23B).
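The homogeneity-of-slopes F-statistic can also be computed directly from the two residual sums of squares. The sketch below uses only the Python standard library; the function names and the paired values are hypothetical, and no p-value is computed (the returned statistic would be compared to an F distribution with the returned degrees of freedom, k-1 and N-2k).

```python
from statistics import mean


def _sums(x, y):
    """Return (mean_x, mean_y, Sxx, Sxy) for one period's paired data."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx, my, sxx, sxy


def _sse(x, y, intercept, slope):
    """Residual sum of squares for the line intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def slope_homogeneity_f(periods):
    """periods maps a period label to (control_values, treatment_values).

    Returns (F statistic, numerator d.f., denominator d.f.).
    """
    k = len(periods)
    n_total = sum(len(x) for x, _ in periods.values())
    stats = {label: _sums(x, y) for label, (x, y) in periods.items()}

    # Full model: separate intercept and slope for each period.
    sse_full = 0.0
    for label, (x, y) in periods.items():
        mx, my, sxx, sxy = stats[label]
        slope = sxy / sxx
        sse_full += _sse(x, y, my - slope * mx, slope)

    # Reduced model: one pooled slope, separate intercept for each period.
    pooled_slope = (sum(s[3] for s in stats.values())
                    / sum(s[2] for s in stats.values()))
    sse_reduced = 0.0
    for label, (x, y) in periods.items():
        mx, my, _, _ = stats[label]
        sse_reduced += _sse(x, y, my - pooled_slope * mx, pooled_slope)

    df_num, df_den = k - 1, n_total - 2 * k
    mse_full = sse_full / df_den
    f_stat = ((sse_reduced - sse_full) / df_num) / mse_full
    return f_stat, df_num, df_den


# Hypothetical paired observations (X = control, Y = treatment watershed).
paired = {
    "calibration": ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]),
    "treatment": ([1.0, 2.0, 3.0, 4.0], [1.0, 2.1, 2.9, 4.0]),
}
f_stat, df_num, df_den = slope_homogeneity_f(paired)
print(f"F = {f_stat:.2f} with ({df_num}, {df_den}) d.f.")
```

A large F relative to the critical value points to the full model with separate slopes; a small F supports the reduced model with a pooled slope.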
[Figure 7-23 image. Panel A: "Same Slope - Different Mean". Panel B: "Different Slope and Mean". Each panel plots treatment watershed values (Y-axis) against control watershed values (X-axis), with separate Calibration and Treatment regression lines and the Difference between them marked.]
Figure 7-23. Conceptualized regression plots for paired-watershed data. The red line indicates the
comparison of the treatment watershed from the calibration vs. treatment periods evaluated at the
LSMEANS value of 2.5 (the mean of all sampled values in the control watershed over the entire
sampling duration, i.e., both the treatment and calibration periods).
Example 7.8-1. Software Examples for the Statistical Analyses using Analysis of
Covariance (ANCOVA) for the Paired-Watershed Study
Statistical software packages may vary in how they address ANCOVA. A few examples are given below.
NOTE: We will provide a sample dataset (e.g., Walnut Creek, IA) and results for this example so users
can test their own techniques and software.
A. SAS Software, assuming no autocorrelation
The SAS (SAS Institute 2010) program statements that generate a covariance model with unique slopes
for each group ("full model", different slopes) are:
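PROC GLM;
  CLASS PERIOD;
  MODEL Y = X PERIOD PERIOD*X / SOLUTION;  /* PERIOD*X allows a separate slope for each period */
  LSMEANS PERIOD / PDIFF;                  /* compares the LS-means between periods */
RUN;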
Where the user inputs the variable names used for their project data for:
Y = Name of variable which contains the treatment watershed values (e.g., concentration/load)
X = Name of variable which contains the control watershed values (e.g., concentration/load)
PERIOD = calibration or treatment period
PERIOD*X = the "interaction" term that allows for different slopes for each PERIOD
The other terms are part of the SAS program software syntax. SOLUTION is optional but generates the
regression equation for each PERIOD. The LSMEANS SAS statement generates the LS-means for each
PERIOD. The PDIFF option produces significance tests to compare the LS-means for each PERIOD for
statistically significant differences.
If there is no evidence for separate slopes (i.e., the PERIOD*X interaction term in the SAS output is not
significant), then a "reduced model" with the same slopes assumed for each group (based on pooled
data) should be used. If the interaction term is significant, then the "full model" is the correct model and
the significance of the difference between all possible pairs can be obtained from the PDIFF option in the
LSMEANS statement above.
The SAS program statements that generate a covariance model with common slope but unique
intercepts for each period ("reduced model") are:
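PROC GLM;
  CLASS PERIOD;
  MODEL Y = PERIOD X / SOLUTION SS1 SS3;   /* no interaction term: common (pooled) slope */
  LSMEANS PERIOD / PDIFF;                  /* compares the LS-means between periods */
RUN;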
NOTE regarding data setup:
The input data set has columns for each of the variables: Y, X, PERIOD, and DATE. Although DATE is
not used in this software example, it is useful to match the values in each row for Y, X, and PERIOD to
the correct sample collection date so that the Y and X values are correctly paired up. For the PROC
GLM software procedure, PERIOD can be "0" and "1" or "Pre" and "Post" or any other numeric or
character value desired. But, be aware that internal to SAS, "0" and "1" values will be generated based
upon the alphabetical order, which is something to consider when interpreting the solutions for the regression
line equations for each time period.
B. SAS Software, data set with autoregressive, lag 1, AR(1) autocorrelation
The SAS (SAS Institute 2010) program statements that generate a covariance model with unique slopes
for each group ("full model", different slopes) and accommodate an AR(1) error structure are:
PROC AUTOREG;
  MODEL Y = X PER PER_INTER / NLAG=1 DWPROB;  /* NLAG=1 requests a lag 1 error structure */
RUN;
Where the user inputs the variable names used for their project data for:
Y = Name of variable which contains the treatment watershed values (e.g.,
concentration/load)
X = Name of variable which contains the control watershed values (e.g., concentration/load)
PER = calibration or treatment period ("0" for pre-BMP period values; "1" for post-BMP values)
PER_INTER = the "interaction" term that allows for different slopes for each period. This is a
numeric variable whose values are created by multiplying the values of X and PER for each
observation
The other terms are part of the SAS program software syntax. NLAG=1 indicates a lag 1 error structure
(PROC AUTOREG assumes an autoregressive error structure).
If there is no evidence for separate slopes (i.e., the PER_INTER interaction term in the SAS output is not
significant), then a "reduced model" with the same slopes assumed for each group (based on pooled
data) should be used. If the interaction term is significant, then the "full model" is the correct model.
The SAS program statements that generate a covariance model with common slope but unique
intercepts for each period ("reduced model") are:
PROC AUTOREG;
  MODEL Y = X PER / NLAG=1 DWPROB;  /* no interaction term: common (pooled) slope */
RUN;
NOTE regarding data setup:
The data setup is similar to the PROC GLM software example in A above, except there is no CLASS
option in PROC AUTOREG. Numeric input variables need to be created for all input variables (e.g., 0
and 1 for pre- and post-BMP periods). Since this model includes a time series error structure, the data
must be sorted in date order and have equally spaced time intervals. PROC AUTOREG can correctly
handle missing values. In such cases, a data record for the date should be included, but with missing
values (indicated by a "." for the missing data input values).
When the reduced model with common slopes is used, the following equation (Snedecor and Cochran
1989) should be used to describe the linear regression for each time period, i, which would have the same
slope, but be allowed to have different intercepts:
$$Y_{ij} = b_{0i} + b_{1} X_{ij} + e_{ij}$$ ("Reduced model" for same slopes)
Where:
$Y_{ij}$ = the jth observation for Y in period i (e.g., treatment watershed concentration or load)
$b_{0i}$ = the intercept for period i
$b_{1}$ = the regression coefficient of Y on X pooled over all periods
$X_{ij}$ = the jth observation for X in period i (e.g., control watershed concentration or load)
$e_{ij}$ = the residual or experimental error for the jth observation for Y in period i ($V_t$ for autocorrelated
error series)
Note that this version of the covariance model forces the slope of the regression of Y on X to be the same
for each group, but allows the intercept to be unique (i.e., the regression lines representing each group are
parallel).
C. JMP Software, data set with no autocorrelation
Steps: Analyze => Fit Model => Select "Y" Variable, Add variables to the Model Effects ("X" and
"PERIOD"), highlight PERIOD and X variables in Select Columns and then select 'Cross' in Model Effects to
include interaction term => Run
NOTE regarding data setup:
The data setup is the same as for the PROC GLM example in A above: the input data set has columns for
each of the variables Y, X, PERIOD, and DATE. Although DATE is not used in this software example, it is
useful to match the values in each row for Y, X, and PERIOD to the correct sample collection date so that
the Y and X values are correctly paired up.
Note: if the data have an autocorrelated error series (autoregressive, order 1, or AR(1)), the standard error
on the differences between the LS-means can be adjusted, and the corrected significance of the differences
can then be determined, by:
$$\text{std. dev.}_{corrected} = \text{std. dev.}_{uncorrected} \times \sqrt{\frac{1+\rho}{1-\rho}}$$
Where ρ = autocorrelation coefficient at lag 1
Std. dev. = standard error on the differences of the LS-means
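A minimal Python sketch of this kind of adjustment follows. It assumes the widely used large-sample AR(1) inflation factor, the square root of (1 + ρ)/(1 - ρ); the function name and the numeric inputs are hypothetical, and the lag 1 autocorrelation coefficient is assumed to have been estimated separately from the regression residuals.

```python
import math


def ar1_corrected_std_error(std_error_uncorrected, rho):
    """Inflate a standard error for lag 1 autocorrelation.

    Assumes the adjustment factor sqrt((1 + rho) / (1 - rho)), where rho is
    the lag 1 autocorrelation coefficient of the residuals.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return std_error_uncorrected * math.sqrt((1.0 + rho) / (1.0 - rho))


# Hypothetical values: uncorrected standard error 0.15, rho = 0.6
corrected = ar1_corrected_std_error(0.15, 0.6)
print(f"corrected standard error = {corrected:.2f}")
```

With positive autocorrelation the corrected standard error is larger than the uncorrected one, so differences between LS-means that looked significant may no longer be.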
D. MiniTab Software, data set with no autocorrelation
Steps: Stat > ANOVA > General Linear Model. In the responses, model, and random factors dialogue
boxes, enter "Y", "X PERIOD X*PERIOD", and "PERIOD", respectively. The user can choose whether to
use adjusted or sequential sum of squares under the options button and pairwise comparisons can be
chosen from the comparisons button. Pressing OK button runs the general linear model.
Reference: Minitab (2016)
Lastly, calculation of the adjusted means and their confidence intervals can be performed. After the
correct model is determined ("Full" or "Reduced" model), then the adjusted LS-means10 which correct for
the bias in X between periods can be calculated. The LS-mean of each period (i.e., calibration and
treatment periods in this case) is the period mean for Y adjusted to an overall common value of X. In
other words, the LS-means are the calibration and treatment period regression values for the treated
watershed evaluated at the mean of all the control watershed values over both time periods (e.g., mean of
all the X values). Operationally, inserting the mean of all X values into the regression equations for the
calibration and treatment periods will yield the LS-mean values for each period, respectively. An F-test of
the adjusted LS-means then determines if there is sufficient evidence to conclude that the adjusted LS-
mean for the treatment period is different from the adjusted LS-mean for the calibration period. The SAS
program performs this F-test on the "Period" variable in Example 7.8-1.
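The operational description above (insert the mean of all X values into each period's regression equation) can be sketched in Python as follows. This illustration assumes the full model with a separate regression line for each period; the function names and the paired values are hypothetical, and confidence intervals and the F-test on the LS-means are not computed here.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx, slope


def ls_means(periods):
    """periods maps a period label to (control_values, treatment_values).

    Returns the mean of all X values and each period's regression value
    evaluated at that common X.
    """
    grand_mean_x = mean(xi for x, _ in periods.values() for xi in x)
    adjusted = {}
    for label, (x, y) in periods.items():
        intercept, slope = fit_line(x, y)
        adjusted[label] = intercept + slope * grand_mean_x
    return grand_mean_x, adjusted


# Hypothetical paired observations (X = control, Y = treatment watershed).
paired = {
    "calibration": ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]),
    "treatment": ([2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0]),
}
grand_mean_x, adjusted = ls_means(paired)
print(grand_mean_x, adjusted)
```

Note that the control watershed values differ between the two periods in this made-up data; evaluating both lines at the same X is what removes that difference from the comparison.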
10 LS-means (least square means) are used in ANCOVA as a better comparison of average values between periods
as compared to arithmetic means. LS-means are estimated values that are evaluated at the average value of the
specified covariate(s) such as the control watershed values in the paired-watershed study design.
Caution must be used when interpreting the results for the comparisons of adjusted means when
individual slopes are used. When the slopes are not parallel, the comparisons of adjusted means may not
be the most meaningful question. One may be more interested in the behavior over the entire range of X.
In this case a graphical presentation may be most appropriate.
For samples collected daily, weekly, biweekly, or monthly, autocorrelation may be significant. In these
cases, autocorrelation can be addressed by using a software regression program that incorporates the
autocorrelation in the error term, for example PROC AUTOREG by SAS (SAS Institute 2016d); see
Example 7.8-1.
7.8.2.1.2 Multivariate ANCOVA-Paired Watershed with Explanatory Variables
Note that the above analysis employed a basic univariate ANCOVA model that included only data on the
pollutant variable of interest (e.g., concentration or loads) from the control and treatment watersheds. The
New York NNPSMP project demonstrated the successful use of a multivariate ANCOVA technique that
included hydrologic variables (e.g., instantaneous peak flow rate, event flow volume, and average event
flow rate) in the model (Bishop et al. 2005). The project found that including the flow covariates
explained 80 to 90 percent of observed variability in annual and seasonal event P loads, an improvement
of 16 to 50 percent versus a simpler univariate model. In addition, inclusion of covariates reduced the
minimum detectable treatment effect by 11 to 53 percent versus the univariate model, a result that
indicates potential cost savings through reduced sample size requirements. It is important to note that the
inclusion of additional covariates (i.e., those in addition to the variable of interest in the control
watershed) is predicated on the assumption that they are not affected by BMP implementation. In this
example, testing indicated no influence of BMPs on farm runoff volume, event peak flow, or average
event flow.
In the case of a paired-watershed study, explanatory variables (covariates) would be added to the
statistical model. The full model which allows for different slopes for each time period and covariate is:
$$Y_{ij} = b_{0i} + b_{1i} X_{1ij} + \sum_{c=2}^{d+1} b_{ci} X_{cij} + e_{ij}, \quad i = 1, \ldots, k$$
Where:
$Y_{ij}$ = the jth observation for Y in period i (e.g., pollutant concentration or load from treatment
watershed)
$b_{0i}$ = the intercept ($b_0$) for period i
$b_{1i}$ = the regression coefficient ($b_1$) of Y on $X_1$ for period i
$b_{ci}$ = the regression coefficient ($b_c$) for covariate $X_c$ for period i
k = number of time periods (with 'calibration' and 'treatment' periods, k=2)
$X_{1ij}$ = the jth observation for $X_1$ in period i ($X_1$ is the pollutant concentration or load from control
watershed paired with same sample time as $Y_{ij}$)
d = number of explanatory variables in addition to the control watershed variable. For example, if
only flow was used as a covariate, d=1 and the explanatory variable for flow would be
$X_2$.
$X_{cij}$ = the jth observation for the $X_c$ covariate in period i
$e_{ij}$ = the residuals or experimental error for the jth observation for Y in period i ($V_{ij}$ for
autocorrelated error structure)
As discussed above, a test for the homogeneity of slopes (by including interaction terms) would be
performed to see if a full or reduced model is the best choice, followed by calculation of adjusted means
and their confidence intervals to see if a significant difference exists between the two periods.
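The homogeneity-of-slopes comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used by any project cited here: the data are simulated, the function names (`ols`, `slope_homogeneity_f`) and the single flow covariate are hypothetical, and the regression is solved with a small hand-written normal-equations routine so that the block needs only the Python standard library. The resulting F statistic would still have to be compared with a tabulated critical value (or evaluated in a statistics package) to obtain a significance level.

```python
import random

def ols(X, y):
    """Least-squares fit of y = X b via the normal equations.
    Returns (coefficients, residual sum of squares)."""
    p = len(X[0])
    # Augmented normal-equation matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    coef = [A[i][p] / A[i][i] for i in range(p)]
    sse = sum((yi - sum(b * x for b, x in zip(coef, r))) ** 2
              for r, yi in zip(X, y))
    return coef, sse

def slope_homogeneity_f(y, x1, x2, period):
    """F statistic comparing the full model (period-specific slopes)
    with the reduced model (common slopes, period-specific intercepts).
    period: 0 = calibration, 1 = treatment."""
    reduced = [[1.0, g, a, b] for g, a, b in zip(period, x1, x2)]
    # Full model adds period-by-covariate interaction columns
    full = [row + [row[1] * row[2], row[1] * row[3]] for row in reduced]
    _, sse_reduced = ols(reduced, y)
    coef_full, sse_full = ols(full, y)
    df_num = len(full[0]) - len(reduced[0])   # number of interaction terms
    df_den = len(y) - len(full[0])
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den, coef_full

# Simulated (hypothetical) log-transformed data: 30 samples per period
rng = random.Random(1)
period = [0.0] * 30 + [1.0] * 30
x1 = [rng.uniform(-1.0, 1.0) for _ in period]   # control-watershed log load
x2 = [rng.uniform(0.0, 2.0) for _ in period]    # log event flow covariate
# The treatment period is simulated with a lower slope on x1 (0.5 vs. 0.9)
y = [0.2 + 0.9 * a + 0.4 * b - g * (0.1 + 0.4 * a) + rng.gauss(0.0, 0.05)
     for g, a, b in zip(period, x1, x2)]

f_stat, df_num, df_den, coef_full = slope_homogeneity_f(y, x1, x2, period)
print(f"F = {f_stat:.1f} on ({df_num}, {df_den}) degrees of freedom")
```

A large F statistic favors the full model with separate slopes; a small one suggests the reduced model with a common slope is adequate.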
While the focus above has been on a basic paired-watershed study design consisting of two watersheds
(control and treatment) and two periods (calibration and treatment), ANCOVA is a powerful tool that can
also be applied to paired-watershed studies with multiple control and treatment watersheds and more than
two periods, as well as to above-below studies that have two or more time periods.
7.8.2.1.3 Multiple Paired Watersheds
Both the Jordan Cove (CT) and Lake Champlain Basin (VT) NNPSMP projects included three watersheds in
their paired-watershed designs. The Jordan Cove project included a previously developed drainage area as
a control, and two newly developed drainage areas, one following traditional subdivision requirements
and another using low-impact development BMPs (Clausen 2007). The Vermont project employed a
three-way paired design including one control watershed and two treatment watersheds receiving similar
BMP systems at different intensities (Meals 2001). For both studies, the two treatment watersheds were
separately compared versus the control watershed using ANCOVA.
Changes versus the control watershed for the Jordan Cove project were represented by the percent change
in flow, concentration, and export (Clausen 2007). These calculations were made by comparing mean
predicted values (P) from the calibration regression equations to observed values (O) using the equation:
$$\%\text{Change} = \frac{O - P}{P} \times 100$$
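A minimal sketch of this calculation follows. It assumes an untransformed simple linear calibration regression and uses invented numbers; the function names `fit_line` and `percent_change` are hypothetical. If a project fitted the calibration regression on log-transformed data, the predicted values would need to be back-transformed before use.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def percent_change(cal_control, cal_treatment, trt_control, trt_treatment):
    """Percent change of observed (O) versus mean predicted (P) values,
    where P comes from the calibration-period regression."""
    a, b = fit_line(cal_control, cal_treatment)
    predicted = mean([a + b * x for x in trt_control])   # P
    observed = mean(trt_treatment)                       # O
    return (observed - predicted) / predicted * 100.0

# Invented example values (not project data)
cal_control = [1.0, 2.0, 3.0, 4.0]     # control watershed, calibration period
cal_treatment = [2.0, 4.0, 6.0, 8.0]   # treatment watershed, calibration period
trt_control = [2.0, 3.0, 4.0]          # control watershed, treatment period
trt_treatment = [3.0, 4.5, 6.0]        # treatment watershed, treatment period

change = percent_change(cal_control, cal_treatment, trt_control, trt_treatment)
print(f"{change:.1f} percent")
```

A negative result indicates that observed values in the treatment period were lower than the calibration relationship would predict.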
Meals (2001) performed a series of analyses to examine the results of the Lake Champlain Basin study.
Where full ANCOVA models were used, the calibration and treatment period regression lines intersected,
suggesting, for example, that TP concentrations in one of the treatment watersheds decreased in the high
range, but not in the lower range (Figure 7-24). The importance of this observation is that the higher range
is where active runoff conditions occur, indicating that the BMPs may have been performing as expected.
Calculations similar to those performed for the Jordan Cove project were performed to estimate the
magnitude of change (i.e., %Change), but two additional analyses were carried out to estimate this change
from different perspectives:
• Breakpoint analysis for intersecting or crossed regression lines, and
• Assessment of predicted-without-treatment versus observed-with-treatment.
[Figure 7-24 plot: Total Phosphorus Concentration, WS 1 vs. WS 3, calibration vs. treatment periods; x-axis: WS 3 Mean Weekly [TP] (mg/L); logarithmic axes spanning 0.01 to 10]
Figure 7-24. Example of intersecting regression lines (Meals 2001)
For the former analysis, the point where the regression lines crossed (the "breakpoint") was used in
conjunction with the cumulative frequency of the breakpoint value in the control watershed to derive the
proportion of time or conditions at which concentration or load reductions did or did not occur in the
treatment watershed (Meals 2001). For example, the breakpoint in Figure 7-24 occurs at 0.055 mg/L in
the control watershed (WS 3), a value for which the cumulative frequency for the entire project period
was 0.32, or 32 percent. This is interpreted to mean that TP levels in the treatment watershed (WS 1) were
not reduced during the 32 percent of the time when the concentration in the control watershed was less than 0.055
mg/L. Conversely, TP levels were reduced during the 68 percent of the time when control watershed concentrations
exceeded 0.055 mg/L. By comparison, ANCOVA indicated that TP concentrations were reduced 15
percent in the treatment watershed.
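The breakpoint calculation can be sketched as follows. The regression coefficients and control-watershed concentrations below are invented for illustration (chosen only so that the lines cross near 0.055 mg/L); they are not values from Meals (2001). The sketch assumes both regressions were fitted on base-10 log-transformed concentrations, and the function names are hypothetical.

```python
def breakpoint_x(a_cal, b_cal, a_trt, b_trt):
    """x value where the calibration line (a_cal + b_cal*x) and the
    treatment line (a_trt + b_trt*x) intersect."""
    if b_cal == b_trt:
        raise ValueError("parallel regression lines do not intersect")
    return (a_trt - a_cal) / (b_cal - b_trt)

def cumulative_frequency(values, threshold):
    """Fraction of observations at or below the threshold."""
    return sum(v <= threshold for v in values) / len(values)

# Hypothetical regressions in log10 units: log10(WS1) = a + b * log10(WS3)
a_cal, b_cal = 0.2, 1.0       # calibration period
a_trt, b_trt = -0.052, 0.8    # treatment period (flatter slope)

bp_log = breakpoint_x(a_cal, b_cal, a_trt, b_trt)
bp_mg_l = 10 ** bp_log        # back-transform to mg/L

# Hypothetical control-watershed weekly TP concentrations (mg/L)
control_tp = [0.02, 0.03, 0.04, 0.05, 0.06, 0.08, 0.10, 0.15, 0.20, 0.40]
freq_below = cumulative_frequency(control_tp, bp_mg_l)

print(f"breakpoint = {bp_mg_l:.3f} mg/L")
print(f"share of control values at or below breakpoint = {freq_below:.2f}")
```

The share of control-watershed values above the breakpoint (one minus the cumulative frequency) is then read as the proportion of conditions under which reductions occurred in the treatment watershed.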
The latter analysis was intended to assess the net treatment response regarding pollutant export over the
full range of project conditions (Meals 2001). In this analysis, all weekly values for the treatment period
in the control watershed were input to the calibration period regression for each treatment watershed to
estimate what the pollutant export would have been for the hydrologic conditions of the treatment period
under pre-treatment management, a what-if scenario. In other words, it is an estimate of the difference
between measured loads for the treatment period and what those loads would have been if the BMPs had
not been implemented.
7.8.2.1.4 Multiple Time Periods within a Paired-Watershed Study
Small watershed projects will generally have a period before BMP implementation, a period during BMP
implementation, and a period after BMP implementation. The implementation and post-implementation
periods are often lumped into the same period for data analysis, but this can complicate interpretation of
results if the BMPs are not fully functional throughout the post-BMP period. Where feasible, it may be
most appropriate to separate true implementation, and in some cases maturation of living BMPs, from
post-implementation, to establish a better test of BMP or project effectiveness. There is also a very real
possibility that BMP implementation will occur in phases, creating the potential for more than two or
three periods of interest. For example, in the Waukegan River NNPSMP project, the state Water Survey
designed biotechnical and other practices to resist high velocity runoff while increasing riparian habitat
for stream fisheries within the stream channel (White et al. 2011). However, as the project progressed it
became clear that insufficient pool depth and the lack of pools and riffles were important impairments yet
to be addressed. As a result, pool-and-riffle sequences were later added to the restoration program,
creating a two-phase implementation effort. Even so, project scientists concluded that there is a
remaining need to address sewage and stormwater management problems and take steps to increase
implementation of alternative conservation practices that infiltrate and treat stormwater. Were the
monitoring program to be continued, these could be considered additional BMP implementation phases.
Taken to the extreme, each year could also be considered its own period or group and the groups tested
for differences, but this is not recommended (see footnote 11). In some cases, BMPs may have different effects
depending on the season of the year, so including one or more seasonal covariates may be appropriate. The New
York NNPSMP project identified four seasons that reflect seasonal variation in both source activities and
hydrologic runoff processes (Bishop et al. 2005). ANCOVA was performed separately on both seasonal
and full-year datasets. Despite the wide range of possibilities, time periods for the types of projects
envisioned by this guidance will largely be drawn from the following set of options:
• pre-BMP or calibration,
• BMP implementation (may be subdivided by growth stage if it involves vegetative BMPs), and
• post-BMP implementation (which may include BMP implementation as well).
Where multiple phases of BMPs are to be implemented, however, there could be a separate pre-BMP
implementation and post-BMP implementation for each phase. It is important to identify and plan for
these phases at the beginning of the monitoring project. Adjustments may be warranted later, however,
because the implementation of BMPs may be more gradual or sporadic than anticipated during the
planning phases of a study, and some BMPs, like forested buffers, may take longer than expected to reach
critical growth stages.
For example, in a 15-year project monitoring the effectiveness of a riparian forest buffer in an agricultural
watershed, it was expected that it would take several years for the planted seedlings to have a measurable
influence on water quality (Newbold et al. 2009). To account for this, the calibration period was taken to
be the first five years (1992-1996) of monitoring, a period during which the seedlings became established
but remained too small to affect stream nutrient concentrations. Regression analysis was used to detect
gradual change and one-way ANOVA was performed on the differences between paired samples, with
year treated as the main effect.
7.8.2.1.5 Other Statistical Approaches for Paired-Watershed Analyses
Paired watersheds can also be analyzed with other statistical techniques. For example, some authors have
used the differences between sample pairs taken at each watershed for each sampling date (Carpenter et
al. 1989; Bernstein and Zalinski 1983; MacKenzie et al. 1987; and Palmer and MacKenzie 1985) for input
into t-test or intervention analysis. Hornbeck et al. (1970), Hibbert (1969), and Meals (1987) calculated a
linear regression equation relating the observations from the two watersheds for the calibration period.
Observations from the treated watershed in the treatment period were compared to predicted values from
the calibration period regression. If the deviations exceeded the 95 percent confidence intervals placed
about the calibration regression, the treatment was thought to be significant (Hornbeck et al. 1970).
11 It is feasible that a 2-year study could include one year each of pre-BMP and post-BMP monitoring, but this
would be highly unusual and not, in fact, recommended. A similar situation would be a 3-year study with a
pre-BMP, BMP-implementation, and post-BMP year.
7.8.2.2 Above/Below - Before/After
An above/below-before/after watershed design monitors a water resource (e.g., a stream) above and
below the drainage area in which land treatment is applied for multiple years before and after BMP
implementation (see section 2.4.2.6). Consistency of sampling regime at both stations over time is
essential. Hydrologic explanatory variables (i.e., covariates) such as stream flow must also be monitored
to permit correction for changes in these conditions.
7.8.2.2.1 Comparing Means and Differences between Means
Two principal approaches can be taken to statistical analysis of data from this monitoring design. Both
approaches are illustrated by the projects in Examples 7.8-2 through 7.8-5. In the first approach, mean upstream
and downstream pollutant concentrations and/or loads can be compared (e.g., with the Student's t or
Wilcoxon Rank Sum tests) prior to the application of BMPs to evaluate statistically significant
differences between group means. The purpose of this analysis is to confirm and quantify the pre-
treatment ("before") pollutant contribution of the untreated downstream area. This analysis is then
repeated for the "after" data to document the changes in pollutant contribution of the treated downstream
area. Differences between upstream and downstream conditions from the before to the after condition can
be evaluated simply by examining the percent reductions in concentration or load or by conducting a
group means test of the differences between upstream and downstream concentrations or loads from the
before to the after period. A significant decrease in this upstream/downstream difference in the "after"
period, for example, would suggest a significant effect of treatment. In addition to quantitative statistical
tests, it is also possible to visualize differences between above/below and before/after using comparative
boxplots, bar graphs, or other graphical techniques (see section 7.3.2).
A more statistically powerful approach would be to use the paired Student's t-test to test the differences
between the downstream and upstream sample values in the pre-BMP period. Once post-BMP data are available, a
Student's t-test can be applied to compare the average downstream-upstream differences in the pre- vs. post-BMP
periods. Other explanatory variables (e.g., stream discharge) can be added by using an ANCOVA
statistical approach.
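The two t-tests described above can be sketched as follows, using invented concentrations. The function names are hypothetical, the second test assumes equal variances in the two periods (the classic pooled Student's t), and only the test statistics and degrees of freedom are computed; significance would be judged against a tabulated critical value or computed in a statistics package.

```python
import math
from statistics import mean, stdev, variance

def paired_t(downstream, upstream):
    """Paired t statistic testing whether the mean downstream-upstream
    difference is zero. Returns (t, degrees of freedom, differences)."""
    d = [dn - up for dn, up in zip(downstream, upstream)]
    n = len(d)
    return mean(d) / (stdev(d) / math.sqrt(n)), n - 1, d

def pooled_t(pre_diffs, post_diffs):
    """Pooled-variance Student's t comparing the mean downstream-upstream
    difference before and after BMP implementation."""
    n1, n2 = len(pre_diffs), len(post_diffs)
    sp2 = (((n1 - 1) * variance(pre_diffs) + (n2 - 1) * variance(post_diffs))
           / (n1 + n2 - 2))
    t = (mean(pre_diffs) - mean(post_diffs)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Invented event-mean concentrations (not project data)
pre_down, pre_up = [5.0, 6.0, 7.0, 8.0], [1.0, 1.0, 2.0, 2.0]
post_down, post_up = [2.0, 3.0, 4.0, 5.0], [1.0, 1.0, 2.0, 2.0]

t_pre, df_pre, pre_diffs = paired_t(pre_down, pre_up)
_, _, post_diffs = paired_t(post_down, post_up)
t_change, df_change = pooled_t(pre_diffs, post_diffs)

print(f"pre-BMP paired t = {t_pre:.2f} (df = {df_pre})")
print(f"pre vs. post t = {t_change:.2f} (df = {df_change})")
```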
Differences between above and below stations were examined as part of the analyses performed for the
Otter Creek (WI) watershed project (Stuntebeck 1995). This project also incorporated innovative
sampling procedures to maximize the potential for distinguishing between upstream and downstream
water quality, including programming water quality samplers to be activated by precipitation so that time-
integrated samples were collected initially before stage-triggered samples were collected. This allowed
sampling of barnyard runoff in the stream before stage increased, thereby isolating runoff from sources
upstream. It also allowed sampling during small storms where barnyard runoff occurred in the absence of
substantial upstream contributions. In addition, investigators collected concurrent samples from both the
above and below sites via computer linkage to aid data interpretation. Paired Student's t-tests were used
to determine that the pre-BMP average of the differences between downstream and upstream event-mean
concentrations was different from zero at the 95 percent confidence level. An MDC analysis revealed that
the average downstream post-BMP event-mean concentrations of TP would need to decrease by at least
50 percent for the change to be considered statistically significant at the 95 percent confidence level. In
the final analysis, the Hodges-Lehmann estimator was used to determine that the barnyard BMP system at
Otter Creek reduced loads of suspended solids by 85 percent, TP by 85 percent, ammonia by 94 percent,
BOD by 83 percent, and microbial loads of fecal coliform bacteria by 81 percent (Stuntebeck and
Bannerman 1998; See Example 7.7-2). The nonparametric Hodges-Lehmann estimator is the median of
all possible pairwise differences between pre- and post-BMP barnyard loads (see section 4.5.3 of the
1997 guidance (USEPA 1997b) for a discussion of the Hodges-Lehmann estimator). This median
difference was divided by the pre-BMP median load for each constituent to determine percentage load
reductions.
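A minimal sketch of this calculation is shown below with invented loads. The sign convention (pre-BMP minus post-BMP, so that a positive value is a reduction) and the function name are assumptions made for illustration.

```python
from statistics import median

def hodges_lehmann_reduction(pre_loads, post_loads):
    """Median of all pairwise (pre - post) differences, and that median
    expressed as a percentage of the pre-BMP median load."""
    diffs = [p - q for p in pre_loads for q in post_loads]
    hl = median(diffs)
    return hl, hl / median(pre_loads) * 100.0

# Invented storm loads (not project data)
pre_loads = [10.0, 12.0, 14.0]
post_loads = [2.0, 3.0, 4.0]

hl, pct_reduction = hodges_lehmann_reduction(pre_loads, post_loads)
print(f"Hodges-Lehmann difference = {hl}, reduction = {pct_reduction:.0f} percent")
```

Because the estimator is a median of differences, a few extreme storm loads have less influence on the result than they would on a difference of means.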
7.8.2.2.2 ANCOVA
A second approach for analysis of the above/below-before/after design involves the application of
ANCOVA. The statistical analysis approach is the same as with the paired-watershed study (see section
7.8.2.1). In this case, a significant linear regression relationship for a water quality variable (e.g., weekly
mean total P concentration, weekly suspended sediment load) between the upstream and downstream
stations is obtained during the "before" period. The upstream station is considered to be the "control"
watershed. This regression relationship is then compared to a similar relationship during the "after"
period, and a significant difference between the two regression models indicates the effect of treatment.
Note that the analysis can include explanatory variables (i.e., covariates) like precipitation or flow in a
multiple regression model that may explain more of the variability in the water quality variable than a
simpler model.
Example 7.8-2. Above/Below-Before/After Design - Long Creek, NC NNPSMP
A number of successful projects have used multiple approaches to analyzing their data. For example,
data from an above/below-before/after study of livestock exclusion as part of the Long Creek (NC)
NNPSMP project were first log-transformed and then analyzed using t-tests, two-way ANOVA, and
ANCOVA (Line et al. 2000). While the specific questions addressed by each method differ somewhat,
the results all supported the conclusion that livestock exclusion and establishment of riparian vegetation
reduced mean weekly loads of TSS, TKN, and TP.
Example 7.8-3. Above/Below-Before/After Design (biological data) - Waukegan River, IL
NNPSMP
The Waukegan River (IL) NNPSMP project illustrates the application of the above/below design for
biological monitoring. In this project, the South Branch was divided into an upstream untreated reference
site designated as station S2 and a severely eroding downstream treated area designated as station S1
(Spooner et al. 2011b). At each location fish, macroinvertebrates, and habitat were sampled during the
spring, summer, and fall seasons. Sampling was also conducted at stations N1 and N2 on the North
Branch for reference. Qualitative analysis of biological data collected through 2006 indicated that the
number of fish species and abundance in the South Branch had improved after the construction of
lunkers and rock grade control structures. The IBI rose sharply from a limited aquatic resource into the
moderate category after construction. Sites on both the South and North Branches where lunkers and
Newbury Weirs were applied averaged higher IBI scores and fish population with more fish species than
the untreated control at S2 or the N2 bank armored site from 1996 through 2006.
Example 7.8-4. Above/Below-Before/After Design with Flow as an Explanatory Variable -
Pequea and Mill Creek Watershed, PA NNPSMP
A Pennsylvania study of the effects of streambank fencing on surface-water quality, near-stream ground
water, and benthic macroinvertebrates employed both a paired-watershed and above/below-before/after
design (Galeone et al. 2006). Data for this Section 319 NNPSMP project were collected from 1993 to
2001, with the calibration period from October 1993 through mid-July 1997. Streambank fencing was
installed from May 1997 through July 1997. The above/below-before/after design featured two sites
above fence installation (T-3 and T-4) and two sites located to show the effects of fencing (T-1 and T-2);
T-1 and T-2 were paired with T-3 and T-4, respectively, for data analysis. Both low-flow and storm-flow
samples were collected and analyzed for nutrients, suspended sediment, and fecal streptococcus (only
low-flow samples). Explanatory data collected during the study included precipitation, inorganic and
organic nutrient applications, and the number of cows.
Figure 7-25 illustrates the major data preparation steps and statistical procedures used by the project to
analyze the chemical/physical data. Low-flow, storm-flow, pre-treatment, and post-treatment data were
separated as a preliminary step. Concentrations were flow adjusted using a LOcally WEighted
Scatterplot Smoothing (LOWESS) procedure (Helsel and Hirsch 2002). Statistical tests were performed
on both original and flow-weighted data to determine if factoring out the variability caused by flow
affected the results.
After the above steps were completed, the project applied the nonparametric rank-sum test (see
Mann-Whitney test and Wilcoxon Rank Sum test on page 4-50 of the 1997 guidance, USEPA 1997b) to
determine if data for any one site significantly changed from the pre-treatment to the post-treatment
period. In addition, the nonparametric Kruskal-Wallis test (see page 4-56 of the 1997 guidance) was
carried out to determine if there were significant differences between any of the sites, considering
pre-treatment and post-treatment data separately. Where significant differences were found, the Tukey
multiple-comparison test (see Multiple Comparisons on page 4-63 of the 1997 guidance) was used to
identify which sites were significantly different. The nonparametric signed-rank test (see Wilcoxon
Signed Ranks test on page 4-42 of the 1997 guidance) was used to determine if there were significant
differences (i.e., not zero) between paired observations (e.g., matched samples from above/below sites).
Finally, ANCOVA (see section 4.8 of the 1997 guidance and section 7.8.2.1 for detailed discussions of
the ANCOVA procedure) was applied to determine the effects of streambank fencing using a procedure
highlighted by Grabow et al. (1999). ANCOVA was performed on concentrations and loads for both low-
flow and storm-flow samples. Loads were analyzed in two ways, as actual measured loads and as
weighted loads adjusted with a factor determined by dividing the annual mean discharge for each water
year by the mean discharge for the entire period for each station.
The procedures used by Galeone et al. (2006) demonstrated improvements relative to control or
untreated sites in surface-water quality (nutrients and suspended sediment) during the post-treatment
period at T-1, but T-2 showed reductions only in suspended sediment. N species at T-1 were reduced by
18 percent (dissolved nitrate) to 36 percent (dissolved ammonia); yields of total P dropped by 14
percent. Conversely, T-2 had increases in N species of 10 percent (dissolved ammonia) to 43 percent
(total ammonia plus organic N), and a 51-percent increase in total P load. The average reduction in
suspended-sediment load for the treated sites was about 40 percent. Two factors were evident at T-2
that helped to overshadow any positive effects of fencing on nutrient yields. One was the increased
concentration of dissolved P in shallow ground water (also monitored). In addition, cattle excretions at
the low-cost, in-stream cattle crossings appeared to increase concentrations of dissolved ammonia plus
organic N and dissolved P. See chapter 3 Case Study #1 for a discussion of how the benthic
macroinvertebrate data were analyzed.
[Figure 7-25 flowchart, summarized]
Data preparation:
• Separate low-flow and storm-flow data*
• Separate pre- and post-treatment data*
• Calculate flow-weighted concentrations using LOWESS*
• Calculate weighted loads using the annual mean discharge/period mean discharge factor*
Data analysis:
• Nonparametric rank-sum test*: Were post- and pre-treatment data different at the site?
• Kruskal-Wallis test: Were there differences between the sites?
• Tukey multiple-comparison tests: If differences were found between groups, which were significant?
• Wilcoxon signed-rank test: Was the difference between paired observations for above and below sites different from zero?
• ANCOVA: What were the effects of streambank fencing? Which factors are significant?
*Performed separately for each monitoring site.
Figure 7-25. Basic data preparation and analysis procedure for above/below-before/after
study in Pennsylvania (Galeone et al. 2006)
Example 7.8-5. Above/Below-Before/After Design with Upstream Concentration and Flow
as Explanatory Variables - Walnut Creek, IA NNPSMP
In some cases, projects are forced to develop alternative plans for data analysis due to unforeseen
circumstances they cannot control. The Walnut Creek (IA) NNPSMP project, for example, began as a ten-year
paired-watershed study that also included an above/below-before/after design and three
subwatershed single-station designs within each of the paired watersheds (Schilling and Spooner 2006).
The primary purpose of the project was to evaluate the response of stream nitrate concentrations to
conversion of row crops to native prairie. The normal approach of analyzing project data (both for the
paired-watershed and above/below-before/after designs) using ANCOVA was compromised by two
facts: prairie conversion began before the calibration period was completed, and conversion to prairie
was gradual instead of rapid. Based on the guidelines and experiences of others (Spooner et al. 1987,
Grabow et al. 1998 and 1999), multiple linear regression analysis on all ten monitoring sites was
selected as an alternative approach to project evaluation (see Example 7.8-7 for the general form of
equation used). Treatment in this case was modeled as time with covariates such as upstream
concentration used to factor out hydrologic variability. For the downstream site in the treatment
watershed, a model using month (for seasonality), upstream nitrate concentration, and downstream
nitrate concentration in the control watershed provided the best fit to the data. For all other sites, month
and the log of baseflow discharge from the same or a different site were used as covariates in the best-fit
regression model. All tests resulted in detection of significant trends in nitrate concentrations, with the
downstream treatment site trend indicating nitrate reductions due to conversion to prairie (the treatment).
A negative coefficient on the time variable (-0.119 mg L⁻¹ yr⁻¹) indicated a nitrate reduction of 1.2 mg L⁻¹
over 10 years at this site. It was also found that in the control site, where land was unexpectedly
converted from grassland to row crops, nitrate concentrations increased during the project period.
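As a quick check on the arithmetic in this example, the reported reduction follows directly from the slope on the time variable. The symbols below (ΔC for the change in concentration, β_t for the time coefficient, and Δt for the elapsed period) are labels introduced here for illustration:

$$\Delta C = \beta_t \, \Delta t = (-0.119\ \text{mg L}^{-1}\,\text{yr}^{-1}) \times (10\ \text{yr}) = -1.19\ \text{mg L}^{-1} \approx -1.2\ \text{mg L}^{-1}$$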
If the errors (i.e., residuals) in the statistical models are autocorrelated, a statistical software procedure
should be used that incorporates the autocorrelation structure into the model. For example, PROC
AUTOREG of the SAS software (SAS Institute 2010) is useful with autoregressive autocorrelation
typical of weekly, biweekly, and monthly series. Alternatively, a correction of the standard deviation of
the slope estimate and revised confidence intervals can be used with the correction given in section 7.3.6.
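A minimal sketch of checking residuals for lag-1 autocorrelation is shown below. The residual series is invented, the function names are hypothetical, and the adjustment factor sqrt((1 + rho) / (1 - rho)) is a commonly used large-sample approximation for first-order autoregressive errors that is assumed here for illustration; section 7.3.6 should be consulted for the exact correction recommended by this guidance.

```python
import math
from statistics import mean

def lag1_autocorrelation(residuals):
    """Sample lag-1 autocorrelation coefficient of a residual series."""
    m = mean(residuals)
    num = sum((residuals[i] - m) * (residuals[i + 1] - m)
              for i in range(len(residuals) - 1))
    den = sum((r - m) ** 2 for r in residuals)
    return num / den

def ar1_adjusted_se(se, rho):
    """Inflate a standard error for AR(1) errors (assumed approximation)."""
    return se * math.sqrt((1 + rho) / (1 - rho))

# Invented, slowly varying residual series (not project data)
residuals = [1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0, -2.0, -1.0, 0.0]
rho = lag1_autocorrelation(residuals)

se_slope = 0.05                      # hypothetical unadjusted standard error
se_adjusted = ar1_adjusted_se(se_slope, rho)

print(f"lag-1 autocorrelation = {rho:.3f}")
print(f"adjusted standard error = {se_adjusted:.3f}")
```

Positive autocorrelation inflates the adjusted standard error, which widens confidence intervals and makes a given trend harder to declare significant.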
It should be cautioned that changes in pollutant concentrations or loads measured at a downstream station
(either before or after land treatment) versus upstream may be difficult to detect if incoming
concentrations or loads at the upstream station are high and the contribution of the additional area
draining to the downstream station is small. Conversely, if the upstream contribution is very low
compared to that of the treated area, a change or difference due to treatment may be difficult to attribute
to BMPs because of dilution. If the upstream pollutant inputs do not respond similarly to hydrologic
changes (e.g., rainfall), then the design effectively becomes a single watershed design. The Walnut Creek
(IA) NNPSMP project provides an example of the former case where annual mean nitrate concentrations
ranged from 10.0 to 12.7 mg/L at the upstream site and 6.8 to 9.5 mg/L at the site below the treatment
area (Schilling and Spooner 2006). The treatment in this case was conversion of row crops to native
prairie, and the study design (paired watersheds and above/below-before/after) was compromised by the
fact that land conversion began before pre-treatment conditions could be established. See Example 7.8-5
for a discussion of how data from this project were analyzed using multiple linear regression, a technique
typically applied to single watershed trend designs.
7.8.2.3 Nested Watershed
As described in section 2.4.2.3, it is preferred that the nested subwatershed is used as the control
watershed (see footnote 12) and is located above the remainder of the watershed where treatment occurs (Hewlett and
Pienaar 1973). However, a valid nested design can also place the treatment watershed in a small
headwater subbasin, with the much larger watershed outlet serving as the control. This design requires calibration
(before) and treatment (after) periods similar to the paired-watershed design.
Analysis of data from a nested watershed design can be done using the same ANCOVA procedure
described in section 7.8.2.1 for the paired-watershed design. In the case of nested watersheds, the paired
data represent observations collected on the same date, time period, or storm at both the nested and main
watershed stations. As noted above, data from the nested watershed should represent the control
watershed, while data from the main watershed outlet represent the treatment watershed.
7.8.2.4 Single Watershed Trend Station
As noted in section 2.4.2.5, monitoring at a single watershed outlet is not a strong design for documenting
the effectiveness of watershed land treatment on water quality. Without the ability to control for the
effects of varying weather and hydrology, it is difficult to attribute any observed changes in water quality
to the land treatment program. However, because the coupling of budget limitations and accountability
requirements often leads to single-station designs, the unfortunate fact that some paired-watershed and
other superior designs fail due to unforeseen circumstances, and the simple reality that some NPS
watershed programs must rely on watershed outlet monitoring conducted by another party (e.g., a state
long-term surveillance program or a USGS network station), it is useful to discuss how best to analyze
data from such stations to assess the effects of a watershed project. In addition, experience has shown that
projects with failed paired-watershed or above/below-before/after designs may resort to trend analysis as
the best option for analyzing project data (see Example 7.8-6).
Long-term water quality data may show a monotonic trend (a continuous change, consistent in direction,
either increasing or decreasing) or a step trend (an abrupt shift up or down). Trend analysis may be the
best — or perhaps only — approach to documenting response to treatment in situations where water
quality data are collected only at a single watershed outlet station or where land treatment was
widespread, gradual, and inadequately documented. Data from long-term, fixed-station monitoring
programs where gradual responses such as those due to incremental BMP implementation or continual
urbanization are likely to occur are more aptly evaluated with monotonic trend analyses that correlate the
response variable (i.e., pollutant concentration or load) with time or other independent variables. These
types of analyses are useful in situations where vegetative BMPs like the riparian buffers implemented in
the Stroud Preserve NNPSMP project (Newbold et al. 2008) must mature, resulting in gradual effects
expressed over time. Analysis of step trends, on the other hand, is most appropriate when the change in
response to BMP implementation is rapid and abrupt (e.g., when a municipal stormwater management
regulation is enforced) and the timing of that change is known and well-documented. Biological data can
also be evaluated with either monotonic or step-trend tests. A potential limitation is that most biological
programs will only sample once a year and the time to acquire sufficient samples to detect a meaningful
trend might be longer than what is practical.
12 A reverse situation, where the downstream subwatershed area is the control, is possible in theory, but every effort
would need to be made to ensure that upstream contributions to constituents measured at the downstream control
area are minimized.
Example 7.8-6. Single Trend Watershed with Covariates - Sycamore Creek, MI NNPSMP
This project planned a paired-watershed study with two treatment watersheds (Willow Creek and
Marshall Drain) and one control watershed (Haines Drain), but implementation of no-till and continuous
cover in the control watershed compromised the study (Suppnick 1999). Each watershed was then
analyzed independently, with regression analysis ultimately successful in linking reductions in TSS
(95 percent confidence level) and TP (90 percent confidence level) loads to the percentage of land in no-
till in the Willow Creek watershed (Grabow 1999, Suppnick 1999). Following is a summary of the steps
taken to establish the TSS relationship for Willow Creek (Grabow 1999):
1. Regression analysis on sediment yield versus storm discharge and/or peak flow to reduce the
analysis to water quality change over time independent of hydrologic variability. All variables
were log-transformed.
2. Two methods were then used to answer the question of whether there was a water quality trend
over time.
a. Regression equation incorporating elapsed time and explanatory variables. This
addresses the question of whether there has been a change in water quality over time
while simultaneously accounting for hydrologic variability.
b. Regression of residuals (see footnote 1) from regression on the water quality variable and explanatory
variables versus elapsed time. This addresses the question of whether there has been a
water quality change over time after adjusting for hydrologic variability.
3. Correlation of land use change to water quality change via multiple linear regression analysis.
Terms incorporated in the regression model were percent of land in no-till, percent of land in
continuous cover, storm discharge, and peak flow.
Step 1 yielded correlation between TSS load (kg/storm) and both storm discharge (mm) and peak flow
(liters/second). Discharge and peak flow were tested for collinearity, which was found not to be an issue
(see Box 7.8-1).
Step 2 analyses indicated statistically significant trends in TSS and TP in Willow Creek watershed.
Method "a" used the following basic equation:
$$\log[TSS] = \beta_0 + \beta_1 \log[Q] + \beta_2 \log[Q_p] + \beta_3 t$$
where TSS is the TSS storm load (kg), $Q$ is the total storm discharge, $Q_p$ is the peak stream discharge, $t$
is elapsed time in days, and the $\beta$ terms are regression parameter estimates. A significant negative
value for $\beta_3$ indicated a reduction in TSS load over time. Inserting the average log values of total storm
discharge and peak discharge, and setting the beginning and ending days (1 and 2,629 for $t_{begin}$ and $t_{end}$
in this case), would then yield the average change in loadings from the first to the last day of data collection.
1 Residuals are the differences between actual and predicted values: Actual-Predicted.
Method "b" of Step 2 used the following equation:
$$TSS_{res} = \beta_0 + \beta_1 t$$
where $TSS_{res}$ is the residual (log kg/storm) from the regression in Step 1 and $t$ is again elapsed time.
In this approach, a statistically significant value for $\beta_1$ would indicate a change in the relationship
between TSS and the explanatory variables (total and peak discharge), suggesting an impact due to
land use change. The value $\beta_1 \times t_{end}$ would then estimate the change in loading (in log units) over the data
collection period. The average change in loading is then determined by plugging the average values for
$\log[Q]$ and $\log[Q_p]$ into the regression equation used in Step 1.
In this case, method "a" indicated a 60 percent reduction in TSS load, whereas method "b" estimated a
59 percent reduction.
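The two Step 2 methods can be sketched in Python. This is an illustration only: the storm data are synthetic, the coefficient values and the helper names (`fit_ols`, `percent_change`) are assumptions rather than anything taken from Grabow (1999), base-10 logs are assumed, and ordinary least squares in NumPy stands in for whatever statistical package the project used. The change is evaluated over the span of the synthetic record rather than with $t_{end}$ alone.

```python
import numpy as np

def fit_ols(columns, y):
    """Least-squares fit of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def percent_change(log10_change):
    """Convert a change in log10 units to a percent change in load."""
    return (10.0 ** log10_change - 1.0) * 100.0

# Synthetic storms (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n = 200
t = np.sort(rng.uniform(1, 2629, n))             # elapsed time, days
log_q = rng.normal(0.5, 0.4, n)                  # log10 storm discharge
log_qp = 0.6 * log_q + rng.normal(1.5, 0.3, n)   # log10 peak flow
true_b3 = np.log10(0.4) / 2628                   # built-in 60 percent decline
log_tss = (1.0 + 0.8 * log_q + 0.5 * log_qp + true_b3 * t
           + rng.normal(0, 0.05, n))

# Method "a": time enters the regression alongside the hydrologic covariates.
b_a = fit_ols([log_q, log_qp, t], log_tss)
change_a = percent_change(b_a[3] * (t.max() - t.min()))

# Method "b": regress on hydrology only, then regress the residuals on time.
b_h = fit_ols([log_q, log_qp], log_tss)
resid = log_tss - (b_h[0] + b_h[1] * log_q + b_h[2] * log_qp)
b_b = fit_ols([t], resid)
change_b = percent_change(b_b[1] * (t.max() - t.min()))

print(f"method a: {change_a:.0f}%  method b: {change_b:.0f}%")
```

When elapsed time is only weakly correlated with the hydrologic covariates, the two methods should give similar estimates, which is consistent with the 60 and 59 percent figures reported for Willow Creek.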
With a statistically significant reduction in TSS load now documented, Step 3 explored the linkage
between that reduction and land use change by adding the percentage of land in no-till (NoTill) and the
percentage of land in continuous cover (ContCov) as additional terms in the multiple linear regression
used for method "a" in Step 2. Statistically significant regression parameters $\beta_3$ and/or $\beta_4$ in the following
equation would indicate correlation between log[TSS] and the percentage of land in no-till and/or
continuous cover.
$$\log[TSS] = \beta_0 + \beta_1 \log[Q] + \beta_2 \log[Q_p] + \beta_3 NoTill + \beta_4 ContCov + \beta_5 t$$
A statistically significant value of -0.01969 was found for $\beta_3$, but $\beta_4$ was not significant, suggesting that for
every one-point increase in the percentage of land under no-till, the TSS load (as log kg) would decrease
by 0.01969 log units. Regression estimates based on average storm discharge and peak flow were then
used in conjunction with first-year and last-year values of no-till percentages to estimate a TSS load
reduction of 52 percent, with a 95 percent confidence interval of 18-72 percent. This agreed well with the
estimates of 59 and 60 percent reduction from Step 2.
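To see how a coefficient in log units translates into a percent reduction, the conversion can be sketched as follows. The coefficient is the reported value; the 16-point increase in no-till is a hypothetical figure chosen for illustration, because the first-year and last-year percentages are not given here, and base-10 logs are assumed (as in Example 7.8-7).

```python
def percent_change_from_log10(coef, delta_x):
    """Percent change in load implied by a log10-scale coefficient and a change in x."""
    return (10.0 ** (coef * delta_x) - 1.0) * 100.0

beta_notill = -0.01969   # reported coefficient, log units per percentage point of no-till
delta_notill = 16.0      # hypothetical increase in no-till, percentage points

change = percent_change_from_log10(beta_notill, delta_notill)
print(f"{change:.1f}% change in TSS load")
```

With discharge and peak flow held at their averages, those terms cancel in the ratio of last-year to first-year predictions, so only the no-till term contributes to the estimated change.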
Combining the results from the above analyses by Grabow (1999) with additional project information, it
was concluded that it is very likely that streambank stabilization also contributed to the reduction in TSS
observed in Willow Creek (Suppnick 1999).
Box 7.8-1. Collinearity
What is Collinearity?
Collinearity in multiple regression analysis occurs when there is a linear relationship
between two explanatory (x) variables. Although this does not impact the reliability of
the overall model, it does create great uncertainty regarding the model coefficients.
There are ways to address collinearity, including recognizing the ambiguity in the
interpretation of regression coefficients (USF n.d.) or simply removing one of the
variables from the regression model (Martz 2013).
Various statistics programs have tests for collinearity (or multicollinearity), including
the Variance Inflation Factor (VIF), Tolerance (1/VIF), and the Condition Index (SAS
2016a and 2016c, USF n.d.). Guidelines vary, but VIF values greater than 5 to 10,
Tolerance values close to 0, and Condition Index values greater than 15 to 30
indicate problems with collinearity. See Belsley et al. (1980) for additional details.
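A minimal sketch of the VIF calculation, assuming the usual definition VIF = 1/(1 - R²), where R² comes from regressing one explanatory variable on all the others. The function name `vif` and the data values are hypothetical; statistical packages report the same quantity directly.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return factors

# Hypothetical log-transformed storm discharge and peak flow values.
log_q = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
log_qp = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]   # nearly a copy of log_q
print(vif(np.column_stack([log_q, log_qp])))
```

In this contrived case the two variables are almost identical, so the VIF is far above the 5 to 10 guideline; uncorrelated variables give a VIF of 1.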
Several statistical trend analysis techniques will be mentioned in this section; the topic of trend analysis is
covered in more detail in Tech Notes 6: Statistical Analysis for Monotonic Trends (Meals et al.
2011). Before proceeding, it is important to recognize some limitations of trend analysis. First, trend
analysis is most effective with long periods of record; general guidelines are >5 years of monthly data for
monotonic trends and >2 years of monthly data before and after a step trend (Hirsch 1988). Short
monitoring periods and small sample sizes make documentation of trends difficult, and it must be
recognized that - especially over the short term - some increasing or decreasing patterns in water quality
are not trends. A snapshot of water quality data over a few months may suggest a trend, but examination
of a full year may show this "trend" to be part of a regular cycle associated with temperature,
precipitation, or cultural practices. Autocorrelation may also be mistaken for a trend, especially over a
short time period. Changes in sampling schedules, field methods, or laboratory practices can cause shifts
in data that could be erroneously interpreted as step trends.
Perhaps most importantly, statistical trend analysis can help to identify trends and estimate the rate of
change, but will not provide much insight into attributing a trend to a particular cause (e.g., land
treatment). Interpreting the cause of a trend requires knowledge of the watershed and a deliberate study
design (see section 7.8.1).
Before proceeding to numerical analysis, it is useful to examine time series plots for visual evidence of a
trend. Visualization of trends in noisy data can be clarified by various data smoothing techniques. Plotting
moving averages or medians, for example, instead of raw data points, reduces apparent variation and may
reveal general tendencies. Spreadsheets can display a moving-average trend line in time-series
scatterplots with adjustable averaging periods. A more complex smoothing algorithm, such as LOWESS
(LOcally WEighted Scatterplot Smoothing), can reveal patterns in very large datasets that would be
difficult to resolve by eye (see Helsel and Hirsch 2002). Most pollutant concentrations and loads in
surface waters show strong seasonal patterns. Seasonal variations in precipitation and flow are often main
drivers of these patterns, but seasonal changes in land management and use may also play a role. See
section 4.3 of the 1997 guidance (USEPA 1997b) for additional information on seasonality.
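The moving-median smoothing described above can be sketched in a few lines of Python. The function name `moving_median` and the concentration values are hypothetical; the window is centered and only full windows are returned, which is one of several reasonable conventions.

```python
from statistics import median

def moving_median(values, window):
    """Centered moving median; returns one value per full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [median(values[i:i + window])
            for i in range(len(values) - window + 1)]

# Hypothetical monthly concentrations (mg/L) with one storm-driven spike.
tp = [0.10, 0.95, 0.12, 0.11, 0.09, 0.10, 0.08]
print(moving_median(tp, 3))
```

The spike of 0.95 mg/L does not appear in the smoothed series, which is why medians are often preferred to means when a few extreme values dominate the raw plot.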
Some techniques to address seasonality beyond controlling for the effects of flow covariates are often
necessary for water quality trend analysis. For example, the relationship between concentration and
discharge may not be consistent over time, perhaps due to seasonal variations in BMP implementation.
The relationship (or slope) can be allowed to change between time periods by the use of interaction terms
between the time periods and discharge in an analysis of covariance (ANCOVA) statistical model. An
alternative that may gain traction as experience with it accumulates is weighted regressions on
time, discharge, and season (WRTDS), proposed by Hirsch et al. (2010) (see section 7.9.2 for more
information on WRTDS).
When multiple explanatory variables are included in the trend models, it is common that these variables
will be related to each other (collinearity) and/or a few data points may have a lot of 'influence' over the
regression results (Belsley et al. 1980). Regression analysis performed with various software programs
will provide leverage plots as part of the output to help identify these data features.
7.8.2.4.1 Monotonic Trends
Table 7-8 lists some monotonic trend tests available for different circumstances, including adjustments for
a covariate and the presence of seasonality. The tests are further divided into parametric, nonparametric,
and mixed types. Regression tests require that the expected value of the dependent variable is a linear
function of each independent variable, the effects of the independent variables are additive, the errors in
the model are independent (e.g., no correlation between consecutive errors in the case of time series data),
and the errors exhibit both normality and constant variance. Nonparametric tests require only constant
variance and independence. Parametric trend tests (see Examples 7.8-7 and 7.8-8) are considered more
powerful and/or sensitive to detect significant trends than are nonparametric tests (see Example 7.8-9),
especially with a small sample number. However, unless the assumptions for parametric statistics are met,
it is generally advisable to use a nonparametric test (Lettenmaier 1976, Hirsch et al. 1991, Thas et al.
1998).
Table 7-8. Classification of tests for monotonic (nonparametric) or linear (parametric) trend
(adapted from Helsel and Hirsch 2002)

| | Type of Test | Not Adjusted for covariate (X) | Adjusted for covariate (X) |
|---|---|---|---|
| No Seasonality | Parametric | Linear regression of Y on t | Multiple linear regression of Y on X and t |
| No Seasonality | Mixed | - | Mann-Kendall on residuals from regression of Y on X |
| No Seasonality | Nonparametric | Mann-Kendall | Mann-Kendall on residuals from LOWESS of Y on X |
| Seasonality | Parametric | Linear regression of Y on t and periodic functions or indicator X's for months | Multiple linear regression of Y on X, t, and periodic functions or indicator X's for months |
| Seasonality | Mixed | Regression of deseasonalized Y on t | Seasonal Kendall on residuals from regression of Y on X |
| Seasonality | Nonparametric | Seasonal Kendall on Y | Seasonal Kendall on residuals from LOWESS of Y on X |
| Other explanatory variables or covariates (e.g., stream discharge) | Parametric | Linear regression of Y on t and covariates (X) | Multiple linear regression of Y on t, X covariates |
| Other explanatory variables or covariates (e.g., stream discharge) | Mixed | Regression of deseasonalized Y on X | Seasonal Kendall on residuals from regression of Y on X |
| Other explanatory variables or covariates (e.g., stream discharge) | Nonparametric | Seasonal Kendall on Y | Seasonal Kendall on residuals from LOWESS of Y on X |

Y = dependent variable of interest; X = covariate; t = time
Refer to Tech Notes 6: Statistical Analysis for Monotonic Trends (Meals et al. 2011) for details on
the tests listed in Table 7-8. Chapter 4 (pages 4-86 through 4-89) of the 1997 guidance (USEPA 1997b)
also discusses the computation of Mann-Kendall and Seasonal Kendall statistics.
If the trend model has autocorrelated errors, a statistical model that incorporates the autoregressive errors
should be employed. Alternatively, a correction of the standard error of the slope that is given in section
7.3.6 can be used to calculate the correct confidence interval of the slope on t (time, date) to determine if
it is significantly different from zero (e.g., evidence of a trend over time) in the pollutant concentration or
load.
Example 7.8-7. Simple Linear Regression - Samsonville Brook in Vermont
• Eight years of monthly TP concentration data from Samsonville Brook in Vermont
• Data satisfy assumptions for regression after log transformation: normal distribution, constant
variance, independence (low autocorrelation)
[Figure: two time series panels of TP concentration on a log scale (0.010 to 1.000) versus time (0 to 96 months)]
Simple linear regression (using Excel® or any basic statistical package)
$$\log[TP] = -0.8285 - 0.00414 \times Time$$
$r^2 = 0.18$, $F = 21.268$, $P < 0.001$
Rate of change:
Slope of log-transformed data = -0.00414
$(10^{-0.00414} - 1) \times 100 = -0.95$%/month, or about -11%/year
This result suggests that TP concentrations have decreased significantly over the period at a rate of
approximately 11 percent per year.
Note: Data used in this example are taken from the Vermont NNMP project, Lake Champlain Basin agricultural
watersheds section 319 national monitoring program project, 1993 - 2001 (Meals 2001).
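The rate-of-change arithmetic in this example can be checked with a short sketch. The function name `percent_change_per_step` is hypothetical; the slope is the value reported above, and the annual figure is computed by compounding over 12 months.

```python
def percent_change_per_step(log10_slope, steps=1):
    """Percent change over `steps` time units for a slope fitted to log10(y)."""
    return (10.0 ** (log10_slope * steps) - 1.0) * 100.0

slope = -0.00414                                     # log10(mg/L) per month
monthly = percent_change_per_step(slope)             # one month
annual = percent_change_per_step(slope, steps=12)    # twelve months, compounded
print(f"{monthly:.2f}%/month, {annual:.1f}%/year")
```

Multiplying the monthly rate by 12 and compounding over 12 months both round to about 11 percent per year for a slope this small.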
Example 7.8-8. Linear Regression with Monthly Seasons as a Covariate - Corsica River, MD NNPSMP
A significant trend was detected in a small watershed within the Corsica River Basin, Maryland, using
time series analysis that adjusted for autocorrelation as well as monthly (seasonal) differences for log
transformed, flow-weighted total nitrogen (TN) concentrations. In this example, monthly indicator
variables were used to adjust for seasonality in an ANOVA regression model. See section 7.3.6 for
details on adjustments for autocorrelation and seasonality.
[Figure: three panels plotted against date (2008 to 2010): monthly trend lines (left), predicted values from the seasonal regression model (top right), and raw data (bottom right)]
By addressing seasonality in the regression model with monthly indicator variables, most of the
regression degrees of freedom were preserved, a more powerful approach than if each month was
evaluated separately. Each line in the plot on the left represents the trend line (log-transformed,
flow-weighted TN concentration) for a single month (i.e., January, February ... December). The trend slopes
for each month were assumed to be the same, but the intercept was allowed to vary, enabling the
differences in concentration due to season to be removed from the test for trends and therefore making it
easier to isolate and detect trends due to other factors (e.g., BMPs).
The bottom right graph shows the raw data. The noise due to seasonal differences and other factors
makes it difficult to pick out any trends. The top right graph shows the predicted value from the seasonal
regression model with the indicator variables. A downward trend is apparent and it is also clear from this
graph that the highest TN concentration is found in February, followed by January, March, May, April,
June, Sept, August, October, November, December, and July (lowest).
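The common-slope, month-specific-intercept model described in this example can be sketched as follows. This is an assumption-laden illustration: the data are synthetic and noise-free, the function name `seasonal_trend_fit` is hypothetical, and the sketch omits the autocorrelation adjustment that the Corsica River analysis applied.

```python
import numpy as np

def seasonal_trend_fit(t_years, month, y):
    """Fit y = a[month] + b * t: one intercept per month, one shared slope."""
    t_years = np.asarray(t_years, dtype=float)
    month = np.asarray(month)
    # One 0/1 indicator column per calendar month (1..12).
    indicators = (month[:, None] == np.arange(1, 13)[None, :]).astype(float)
    X = np.column_stack([indicators, t_years])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef[:12], coef[12]          # monthly intercepts, common slope

# Synthetic three-year monthly record (hypothetical values, no noise).
n = 36
month = np.arange(n) % 12 + 1
t_years = np.arange(n) / 12.0
seasonal = 0.5 + 0.1 * np.cos(2 * np.pi * (month - 2) / 12)   # peak in February
log_tn = seasonal - 0.05 * t_years
intercepts, slope = seasonal_trend_fit(t_years, month, log_tn)
print(round(slope, 4))
```

Because the twelve intercepts absorb the seasonal differences, the single slope term is estimated from all observations at once, which is the gain in power the example describes.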
Example 7.8-9. Mann-Kendall Procedure - Single Trend Watershed - Samsonville Brook
in Vermont.
The data from Samsonville Brook in Vermont:
• Eight years of quarterly mean TP concentration data
• Data satisfy assumptions for constant variance and independence, but are not normally
distributed without transformation
[Figure: TP concentration (mg/L, 0.10 to 0.50) versus time (0 to 96 months)]

| Month (n=25) | [TP] mg/L |
|---|---|
| 1 | 0.180 |
| 5 | 0.200 |
| 9 | 0.250 |
| 13 | 0.068 |
| 17 | 0.201 |
| 21 | 0.063 |
| 25 | 0.099 |
| 29 | 0.125 |
| 33 | 0.205 |
| 37 | 0.078 |
| 41 | 0.216 |
| 45 | 0.059 |
| 49 | 0.098 |
| 53 | 0.102 |
| 57 | 0.137 |
| 61 | 0.037 |
| 65 | 0.100 |
| 69 | 0.051 |
| 73 | 0.180 |
| 77 | 0.060 |
| 81 | 0.095 |
| 85 | 0.021 |
| 89 | 0.120 |
| 93 | 0.063 |
| 97 | 0.035 |

The Mann-Kendall trend test for this example may be evaluated in two ways. First, in a
manual calculation, use the formulas below. The value of S (sum of the signs of
differences between all combinations of observations) can be determined either manually
or by using a spreadsheet to compare combinations, create dummy variables (-1, 0, and
+1), and sum for S.
Mann-Kendall $S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \mathrm{sign}(y_j - y_i) = -106$
$\tau = \frac{S}{n(n-1)/2} = \frac{-106}{300} = -0.353$ (decreasing trend)
Calculating $Z_S$ as $(S \pm 1)/\sigma_S$, where
$\sigma_S = \sqrt{\frac{n}{18} \times (n-1) \times (2n+5)} = 42.817$
$Z = \frac{-105}{42.817} = -2.454$ (USEPA 1997a)
This Z statistic is significant at P=0.014, indicating a significant trend, i.e., we are
98.6 percent confident there is a decreasing trend in TP. See USEPA (1997a) for the
calculation of $\sigma_S$ when there are ties among the data.
To estimate the rate of change, use the Sen slope estimator:
$\beta_1 = \mathrm{median}\left(\frac{y_j - y_i}{x_j - x_i}\right)$; 211 individual slopes, -0.00945 to +0.00766
Median slope = -0.0011 mg/L/month = -0.013 mg/L/yr
This result suggests that TP concentration decreased significantly over the period
at a rate of about 0.013 mg/L/yr.
Note: Data used in this example are taken from the Vermont NNMP project, Lake Champlain Basin agricultural
watersheds section 319 national monitoring program project, 1993 -2001 (Meals 2001).
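The manual calculation above can be reproduced with a short Python sketch using the tabulated data. The function names are hypothetical, and the sketch uses the formula for $\sigma_S$ without a tie correction, so the Z value it produces can differ slightly in the third decimal place from a tie-corrected calculation.

```python
import math
from statistics import median

months = list(range(1, 98, 4))    # 1, 5, ..., 97 (n = 25)
tp = [0.180, 0.200, 0.250, 0.068, 0.201, 0.063, 0.099, 0.125, 0.205,
      0.078, 0.216, 0.059, 0.098, 0.102, 0.137, 0.037, 0.100, 0.051,
      0.180, 0.060, 0.095, 0.021, 0.120, 0.063, 0.035]

def sign(x):
    return (x > 0) - (x < 0)

def mann_kendall(y):
    """Return S, tau, and Z (no tie correction) for a series in time order."""
    n = len(y)
    s = sum(sign(y[j] - y[i]) for i in range(n - 1) for j in range(i + 1, n))
    tau = s / (n * (n - 1) / 2)
    sigma = math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
    # Continuity correction: move S one unit toward zero.
    z = 0.0 if s == 0 else (s - sign(s)) / sigma
    return s, tau, z

def sen_slope(x, y):
    """Median of all pairwise slopes (y_j - y_i) / (x_j - x_i), j > i."""
    n = len(y)
    return median((y[j] - y[i]) / (x[j] - x[i])
                  for i in range(n - 1) for j in range(i + 1, n))

s, tau, z = mann_kendall(tp)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
slope = sen_slope(months, tp)
print(s, round(tau, 3), round(z, 2), round(p_two_sided, 3))
```

The double loop visits all 300 pairs of observations, which is the same comparison the spreadsheet approach performs with dummy variables.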
7.8.2.4.2 Step Trends
Monotonic trend analysis may not be appropriate for all situations. Other statistical tests for discrete
changes (step trends) should be applied where a known discrete event (like BMP implementation over a
short period) has occurred. Testing for differences between the "before" and "after" conditions is done
using two-sample procedures such as the Student's t test and ANCOVA (parametric techniques without and
with covariates, respectively) and nonparametric alternatives such as the rank-sum test, Mann-Whitney test, and the
Hodges-Lehmann estimator of step trend magnitude (Helsel and Hirsch 2002, Walker 1994). Application
of the Mann-Whitney/Wilcoxon rank-sum test and the Hodges-Lehmann estimator is illustrated in
sections 4.5.2 and 4.5.3, respectively, of the 1997 guidance (USEPA 1997b). A key principle in step trend
analysis is that the hypothesized timing of the step change must be selected in advance (i.e., define the
pre- and post- periods before conducting statistical tests). Knowledge of watershed management activities
and examination of data plots will be helpful in identifying a potential step in time.
For example, the Mann-Whitney test was used to associate changes in P management practices with a
decrease in annual median soluble reactive P concentration from a 9-ha grassland catchment in Northern
Ireland (Smith et al. 2003). Weekly samples were collected from 1989 through 2000, with the change in P
management instituted in 1998. A comparison of data from 1997 with data from 2000 indicated that the
change from whole-farm to site-specific P management reduced SRP concentrations significantly.
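A minimal sketch of the two nonparametric step-trend quantities named above, written in plain Python. The function names and the concentration values are hypothetical and are not data from Smith et al. (2003); a statistical package would also supply the p-value for the rank-sum test.

```python
from statistics import median

def rank_sum_u(before, after):
    """Mann-Whitney U for `after`: pairs where after > before (ties count 0.5)."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in after for b in before)

def hodges_lehmann(before, after):
    """Median of all pairwise differences (after - before)."""
    return median(a - b for a in after for b in before)

# Hypothetical SRP concentrations (mg/L) for the pre- and post-change periods.
before = [0.05, 0.06, 0.07]
after = [0.01, 0.02, 0.03]
print(rank_sum_u(before, after), hodges_lehmann(before, after))
```

A U of zero means every post-change value is lower than every pre-change value, and the Hodges-Lehmann estimate gives the size of the step in the original units.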
If the trend model has autocorrelated errors, a statistical model that incorporates the autoregressive errors
should be employed. Alternatively, a correction of the standard error of the slope that is given in section
7.3.6 can be used to calculate the correct confidence interval of the step change (difference) between time
periods to determine if it is significantly different from zero (e.g., evidence of a step change) in the
pollutant concentration or load.
7.8.2.5 Multiple Watersheds
In the simplest case of a multiple watershed design, where monitored watersheds fall into two groups,
treated and untreated, data may be analyzed by Student's t test or the non-parametric Wilcoxon Rank-
Sum test. Such an analysis would test the (null) hypothesis that there was no significant difference in
mean pollutant concentration or load between the treated and untreated watershed groups. Where
monitored watersheds occur in more than two groups (e.g., untreated, treatment A, treatment B, etc.),
significant differences in group means can be evaluated using ANOVA or the Kruskal-Wallis test. For
example, Clausen and Brooks (1983) assessed mining impacts on MN peat lands using a multiple
watershed design. Results - analyzed by ANOVA for normally distributed variables and otherwise by
nonparametric Kruskal-Wallis and Chi-Square tests - documented significant impacts of peat mining on
water quality. Lewis (2006) describes application of fixed-effect and mixed-effect (i.e., includes random
effects) regression models to multiple-watershed studies involving logging. A 13-watershed study
involving 3 controls, 5 clear-cuts, and 5 partial cuts was carried out over sixteen years with monitoring of
storm volumes during four years before cutting, three years of logging, and nine years¹³ of post-logging.
The best fit was obtained when the proportion harvested, antecedent wetness, regrowth, and spatial
autocorrelation were all incorporated into the model. This study design and analytic approach allows the
prediction of streamflow response to harvesting in other watersheds considered part of the same
population of watersheds included in the study.
13 Three years of post-cut monitoring at seven stations and nine years at six stations.
7.8.3 Linking Water Quality Trends to Land Treatment
A central objective of many NPS watershed projects is to determine not only if water quality changes can
be documented but also if water quality changes can be associated with changes in land treatment. Such
documentation is necessary to help build an information base to support continued improvement in
preventing and solving water quality problems. It is also needed in many cases to justify expenditure on
clean-up efforts.
For a range of reasons, including budgets and programmatic constraints, watershed project monitoring
efforts are almost never designed to satisfy the rigorous criteria for establishing true cause and effect
relationships (see Box 7.8-2). Rather, project effectiveness monitoring designs are generally intended to
measure improvements in water quality and, hopefully, relate that improvement to activities undertaken to
influence water quality. A plausible argument that
what was done on the ground improved water
quality is often the best that can be hoped for and
that is usually not a simple task at the watershed
level. The ability to control for factors other than
land treatment (e.g., weather, hydrology, land use
change) is a key ingredient in making such a
plausible argument.
Control refers to eliminating or accounting for all
factors that may affect the response to the treatment
so that the treatment effect can be isolated. In a
laboratory experiment, control is usually obtained
by subjecting the entire system to the same
conditions, varying only the treatment variable and
selecting replicates at random to assure that
unmeasured sources of variability do not affect the
interpretation. Such an approach is rarely if ever
possible for monitoring projects in watersheds
dominated by nonpoint sources. Instead, we hope to
show an association between change in water
quality and change in land use or management by
selecting a project design that includes monitoring for important explanatory variables (covariates) and
applying appropriate statistical tools to include and adjust for these covariates in the analysis. By
factoring explanatory variables into trend analyses, we remove some of the noise in the data to uncover
water quality trends that are closer to those that would have been measured had no changes in climatic or
other explanatory variables occurred over time. When performing statistical analyses with both water
quality and land treatment data, it is important to note that it is not necessary to summarize the water
quality data on the same (less frequent) time scale as the land treatment data. Rather, land treatment data
can be incorporated within a trend analysis, for example, as repeating explanatory variables. That is, the
values of land treatment and land use are treated as X variables in a statistical trend model. Because land
management data are usually taken less frequently than water quality data, the land management
information for a given X variable can be repeated for the time range of water quality samples that is
represented by the land management value.
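Repeating a less frequent land treatment value across the water quality samples it represents can be sketched as below. All names and values are hypothetical; the sketch assumes land treatment is recorded once per calendar year and that every sample year has a recorded value.

```python
from datetime import date

# Hypothetical annual land treatment record: percent of cropland in no-till.
no_till_by_year = {1991: 5.0, 1992: 12.0, 1993: 20.0}

# Hypothetical storm samples: (sample date, TSS load in kg).
samples = [
    (date(1991, 4, 3), 410.0),
    (date(1991, 9, 18), 150.0),
    (date(1992, 5, 7), 390.0),
    (date(1993, 6, 21), 120.0),
    (date(1993, 10, 2), 95.0),
]

def attach_land_treatment(samples, treatment_by_year):
    """Repeat each year's land treatment value for every sample taken that year."""
    return [(d, load, treatment_by_year[d.year]) for d, load in samples]

rows = attach_land_treatment(samples, no_till_by_year)
for d, load, no_till in rows:
    print(d.isoformat(), load, no_till)
```

The resulting rows keep the water quality data at their original sampling frequency, with the land treatment column available as an X variable in a trend model.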
Although association by itself is not sufficient to infer causal relationships, it can contribute to a plausible
argument that pollution control activities have resulted in environmental improvement. Thus, knowledge
of land management and land treatment in the watershed is essential to demonstrate an association
between changes on the land and changes in water quality. For example, section 7.8.2.2 described how
the Sycamore Creek (MI) NNMP project used multiple linear regression to link log[TSS] load to the
percentage of land under no-till cropping (Grabow 1999). Additional explanatory variables included the
logs of total storm discharge and peak stream discharge.
Box 7.8-2. Cause-effect requirements (Mosteller and Tukey 1977).
A cause-effect relationship must satisfy the following criteria:
• Consistency - the direction and degree of the relationship between the measured
variables (such as TP loads and acres treated with nutrient management) holds
in each data set.
• Responsiveness - as one variable changes in a known manner, the other
variable changes similarly. For example, as the amount of land treatment
increases, further reduction of pollutant delivery to the water resource is
documented.
• Mechanistic - the observed water quality change is that which is expected based
on the known or hypothesized physical processes involved in the installed BMPs.
Data on both the temporal progress and spatial extent of land treatment and other watershed land
use/management activities should be used to build an association between land treatment and observed
water quality. For example, on a temporal scale, land treatment and management data can be analyzed
and linked to water quality in these ways:
Define monitoring periods: Documentation of BMP implementation can be used to define critical
project periods, like pre- and post-treatment periods in before/after and paired-watershed designs or to
establish a hypothesis on the timing of a step trend.
Explain observed water quality: Knowledge of not only BMP implementation history but also dates of
tillage, manure or agrichemical applications, street sweeping, and other watershed management activities
can be extremely useful in qualitatively explaining observed water quality patterns, especially extreme or
unusual values.
Quantify the level of treatment: Quantitative expressions of land treatment can become the independent
variable in an analysis of correlation between land management and water quality. Analyze land treatment
data collected in the watershed monitoring program to form such variables as:
• Number or percent of watershed animal units under animal waste management
• Acres or percent of cropland in cover crops
• Acres or percent of cropland under conservation tillage
• Annual manure or fertilizer application rate and extent
• Extent and capacity of storm water infiltration practices
Such variables can be tested for correlation with mean total P concentration, annual suspended sediment
load, or other annual water quality variables.
Document areas receiving BMPs: Use knowledge of locations of land treatment to:
• Select appropriate watersheds for analysis in a multiple watershed design
• Confirm conditions in above/below and nested watershed designs
• Document the integrity of the control and treatment watersheds in a paired-watershed design
Relate land treatment to critical source areas: A comparison of critical pollutant sources to locations
that received treatment can assist in evaluating effectiveness of land treatment efforts and establish
expectations for how much of the NPS problem the land treatment program potentially addresses.
The Walnut Creek (IA) NNPSMP project, for example, monitored stream NO3-N concentrations and
tracked conversion of row crop land to restored prairie vegetation (Schilling and Spooner 2006). By
linking the two monitored variables, the project was able to suggest a clear association between restoring
native prairie and reducing stream nitrate levels (see Figure 7-26).
[Figure: Relating Changes in Stream Nitrate Concentrations to Changes in Row Crop Land Cover in Walnut Creek, Iowa. Scatterplot with fitted line y = 0.195x + 1.57 (r2 = 0.70). Horizontal axis: change in row crop land cover in watershed area (%), 1990 to 2005, from -40 to 40. Vertical axis: change in stream NO3-N concentration.]
Relationship between change in percentage of land cover in row crops and
change in stream NO3-N concentrations in Walnut Creek, IA
Figure 7-26. Linking stream nitrate concentration to land cover (Schilling and Spooner 2006)
7.9 Load Estimation
Determination of pollutant load is a key objective for many NPS monitoring projects. The mass of
nutrients delivered to a lake or estuary drives the productivity of the waterbody. The annual suspended
sediment load transported by a river is usually a more meaningful indicator of soil loss in the watershed
than is a suspended sediment concentration. The foundation of water resource management embodied in
the TMDL concept lies in assessment of the maximum pollutant load a waterbody can accept before
becoming impaired and in the measurement of changes in pollutant loads in response to implementation
of management measures.
Estimation of pollutant load through monitoring is a complex task that requires accurate measurement of
both pollutant concentration and water flow and careful calculation, often based on a statistical approach.
It is imperative that an NPS monitoring program be designed for good load estimation at the start. This
section addresses important considerations and procedures for developing good pollutant load estimates in
NPS monitoring projects. Much of the material is taken from an extensive monograph written by Dr. R.
Peter Richards, of Heidelberg University, Estimation of Pollutant Loads in Rivers and Streams: A
Guidance Document for NPS Programs. The reader is encouraged to consult that document and its
associated tools for additional information on load estimation. Much of this information is also
summarized in Meals et al. (2013).
7.9.1 General Considerations
7.9.1.1 Definitions
Load may be defined as the mass of a substance that passes a particular point of a river (such as a
monitoring station on a watershed outlet) in a specified amount of time (e.g., daily, annually).
Mathematically, load is essentially the product of water discharge and the concentration of a substance in
the water. Flux is a term that describes the
loading rate, i.e., the instantaneous rate at which
the load passes a point in the river. Water
discharge is defined as the volume of water that
passes a cross-section of a river in a specified
amount of time, while flow refers to the
discharge rate, the instantaneous rate at which
water passes a point. Refer to Meals and
Dressing (2008) for guidance on appropriate
ways to estimate or measure surface water flow
for purposes associated with NPS watershed
projects.
If we could directly and continuously measure the flux of a pollutant, the results might look like the plot
in Figure 7-27. The load transported over the entire period of time in the graph would simply be equal to
the shaded area under the curve.
Basic Terms
Flux - instantaneous loading rate (e.g., kg/sec)
Flow rate - instantaneous rate of water passage (e.g., L/sec)
Discharge - quantity of water passing a specified point (e.g., m³)
Load - mass of substance passing a specified point (e.g., metric tons)
Figure 7-27. Imaginary plot of pollutant flux over time at a monitoring station (Richards 1998)
However, we cannot measure flux directly, so we calculate it as the product of instantaneous concentration
and instantaneous flow:
$$\text{Load} = k \int c(t)\, q(t)\, dt$$
where c is concentration and q is flow, both functions of time (t), and k is a unit conversion factor.
Because we must take a series of discrete samples to measure concentration, the load estimate becomes
the sum of a set of n products of concentration (c), flow (q), and the time interval (Δt) over which the
concentration and flow measurements apply:
$$\text{Load} = k \sum_{i=1}^{n} c_i\, q_i\, \Delta t$$
The main monitoring challenge becomes how best to take the discrete samples to give the most accurate
estimate of load. Note that the total load is the load over the timeframe of interest (e.g., one year)
determined by summing a series of unit loads (individual calculations of load as the product of
concentration, flow, and time over smaller, more homogeneous time spans). The central problem is to
obtain good measures of concentration and flow during each time interval; calculation of total load by
summing unit loads is simple arithmetic.
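The arithmetic of summing unit loads can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name, the units (mg/L, L/s, seconds), and the three sample values are assumptions made up for the example.

```python
# Hypothetical illustration: total load as the sum of unit loads.
# Assumed units: concentration in mg/L, flow in L/s, time interval in seconds.
MG_PER_KG = 1_000_000  # the unit conversion factor k is 1 / MG_PER_KG


def total_load_kg(conc_mg_per_l, flow_l_per_s, dt_seconds):
    """Sum c_i * q_i * dt over all samples and convert mg to kg."""
    if len(conc_mg_per_l) != len(flow_l_per_s):
        raise ValueError("concentration and flow series must have equal length")
    unit_loads_mg = [
        c * q * dt_seconds for c, q in zip(conc_mg_per_l, flow_l_per_s)
    ]
    return sum(unit_loads_mg) / MG_PER_KG


# Three made-up daily samples (not real monitoring data).
conc = [10.0, 50.0, 20.0]      # mg/L
flow = [100.0, 400.0, 150.0]   # L/s
load = total_load_kg(conc, flow, dt_seconds=86_400)
print(f"Three-day load: {load:.1f} kg")
```

In this made-up case the middle day, with both the highest flow and the highest concentration, contributes most of the total, which is why the quality of the individual concentration and flow measurements matters far more than the summation itself.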
7.9.1.2 Issues of Variability
Both flow and concentration vary considerably over time, especially in NPS situations. Accurate load
estimation becomes an exercise in both how many samples to take and when to take them to account for
this variability.
Sampling frequency has a major influence on the accuracy of load estimation, as shown in Figure 7-28.
The top panel shows daily suspended solids load (calculated as the products of daily total suspended
solids (TSS) concentration and mean daily discharge measured at a continuously recording USGS station)
for the Sandusky River in Ohio. The middle panel represents load calculated using weekly TSS samples
and mean weekly discharge; the lower panel shows load calculated from monthly TSS samples and mean
monthly discharge data. Clearly, very different pictures of suspended solids load emerge from different
sampling frequencies, as decreasing sampling frequencies tend to miss more and more short-term but
important events with high flow or high TSS concentrations.
Because in NPS situations most flux occurs during periods of high discharge (e.g., ~80-90 percent of
annual load may be delivered in ~10-20 percent of the time), choosing when to sample can be as important
as how often to sample. The top panel in Figure 7-29 shows a plot of daily suspended solids load derived
from weekly sampling superimposed on daily flux data; the bottom panel shows daily loads derived from
monthly and quarterly sampling on top of the same daily flux data. Weekly samples give a reasonably
good visual fit over the daily flux pattern. The monthly series gives only a very crude representation of
the daily flux, but it is somewhat better than expected, because it happens to include the peaks of two of
the four major storms for the year. A monthly series based on dates about 10 days later than these would
have included practically no storm observations, and would have seriously underestimated the suspended
solids load. Quarterly samples result in a poor fit on the actual daily flux pattern.
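The sensitivity of a monthly series to sample timing can be illustrated with a small synthetic experiment. The sketch below uses an invented daily flux record (constant baseflow plus four three-day storms), not the Sandusky River data, and a deliberately simple estimator (mean sampled flux times number of days); all names and values are assumptions for illustration only.

```python
# Synthetic illustration of how sample timing changes a load estimate.
# The flux series is invented: baseflow of 1 unit/day with four 3-day storms.

def daily_flux_series(n_days=360, base=1.0, storm=100.0,
                      storm_starts=(45, 130, 200, 290), storm_len=3):
    flux = [base] * n_days
    for start in storm_starts:
        for day in range(start, start + storm_len):
            flux[day] = storm
    return flux


def load_from_subsample(flux, first_day, step):
    """Estimate total load as (mean of sampled daily flux) x (number of days)."""
    sampled = flux[first_day::step]
    return sum(sampled) / len(sampled) * len(flux)


flux = daily_flux_series()
true_load = sum(flux)                                          # all 360 days
monthly_a = load_from_subsample(flux, first_day=15, step=30)   # hits one storm
monthly_b = load_from_subsample(flux, first_day=0, step=30)    # hits no storms

print(true_load, monthly_a, monthly_b)
```

With the same 12 samples per year, shifting the schedule by 15 days moves this toy estimate from well above the true load to well below it. The numbers themselves carry no meaning beyond showing that a sparse fixed-interval schedule depends heavily on whether it happens to land on storm days.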
Figure 7-28. Plot of suspended solids loads for the Sandusky River, water year 1985 (Richards
1998). Top, daily TSS samples; Middle, weekly samples; Bottom, monthly samples. Weekly and
monthly sample values were drawn from actual daily sample data series. Flux is on y-axis, time is
on x-axis, and area under curve is load estimate.
The key point here is that many samples are typically needed to accurately and reliably capture the true
load pattern. Quarterly observations are generally inadequate, monthly observations will probably not
yield reliable load estimates, and even weekly observations may not be satisfactory, especially if very
accurate load estimates are required to achieve project objectives.
7.9.1.3 Practical Load Estimation
Ideally, the most accurate approach to estimating pollutant load would be to sample very frequently and
capture all the variability. Flow is relatively straightforward to measure continuously (see Meals and
Dressing 2008), but concentration is expensive to measure and in most cases impossible to measure
continuously. It is therefore critically important to choose a sampling interval that will yield a suitable
characterization of concentration.
Figure 7-29. Weekly (red line in top panel), monthly (red line), and quarterly (black line in bottom
panel) suspended solids load time series superimposed on a daily load time series (Richards
1998). Log of flux is on y-axis, time is on x-axis, and area under curve is load estimate.
There are three important considerations involved in sampling for good load estimation: sample type,
sampling frequency, and sample distribution in time. Grab samples represent a concentration only at a
single point in time and the selection of grab sampling interval must be made in consideration of the
issues of variability discussed above. Integrated samples (composite samples made up of many individual
grab samples) are frequently used in NPS monitoring. Time-integrated or time-proportional samples are
either taken at a constant rate over the time period or are composed of subsamples taken at a fixed
frequency. Time-integrated samples are poorly suited for load estimation because they are taken without
regard to changes in flow (and concentration) that may occur during the integration period and are usually
biased toward the low flows that occur most often. Flow-proportional samples (where a sample is
collected for every n units of flow that pass the station), on the other hand, are ideally suited for load
estimation, and in principle should provide a precise and accurate load estimate if the entire time interval
is properly sampled. However, collecting flow-proportional samples is technically challenging and may
not be suitable for all purposes. Also, even though a flow-proportional sample over a time span (e.g., a
week) is a good summation of the variability of that week, ability to see what happened within that week
(e.g., a transient spike in concentration) is lost. Flow-proportional sampling is also not compatible with
some monitoring demands, such as monitoring for ambient concentrations that are highest at low flow or
for documenting exceedance of critical values (e.g., a water quality standard).
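The difference between time-proportional and flow-proportional compositing can be sketched numerically. The example below is an idealization with invented values: it treats a time-proportional composite as the simple average of subsample concentrations and a flow-proportional composite as the flow-weighted average. Function names and data are hypothetical.

```python
# Idealized comparison of composite sample types (invented data, equal time steps).

def time_composite_conc(subsample_conc):
    """Equal aliquots at fixed time steps: a simple average concentration."""
    return sum(subsample_conc) / len(subsample_conc)


def flow_proportional_conc(subsample_conc, flows):
    """Aliquots weighted by flow: the flow-weighted mean concentration."""
    weighted = sum(c * q for c, q in zip(subsample_conc, flows))
    return weighted / sum(flows)


flows = [10.0, 10.0, 100.0, 10.0]   # one high-flow step among low flows
concs = [5.0, 5.0, 50.0, 5.0]       # concentration rises with flow

total_discharge = sum(flows)        # time step length taken as 1
true_load = sum(c * q for c, q in zip(concs, flows))
load_time = time_composite_conc(concs) * total_discharge
load_flow = flow_proportional_conc(concs, flows) * total_discharge

print(true_load, load_time, load_flow)
```

In this toy case the time-composite load is less than half of the true load because three of the four aliquots come from low flow, while the flow-weighted composite reproduces the true load. A real flow-proportional sampler only approximates this weighting.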
Sampling frequency determines the number of unit load estimates that can be computed and summed for
an estimate of total load. Using more unit loads increases the probability of capturing variability across
the year and not missing an important event (see Figure 7-29); in general, the accuracy and precision of a
load estimate increase as sampling frequency increases. Over a sufficiently short interval between
samples, a sampling program will probably not miss a sudden peak in flux. If, for example, unit loads are
calculated by multiplying the average concentration for the time unit by the discharge over the same time
unit, the annual load that is the sum of four quarterly unit loads will be considerably less accurate than the
annual load that is the sum of twelve monthly loads. Note that this example does not mean that an annual
load calculated from 12 monthly loads is sufficiently accurate for all purposes.
There is a practical limit to the benefits of increasing sampling frequency, however, due to the fact that
water quality data tend to be autocorrelated (see section 7.3.6). The concentration or flux at a certain point
today is related to the concentration or flux at the same point yesterday and, perhaps to a lesser extent, to
the concentration or flux at that spot last week. Because of this autocorrelation, beyond some point,
increasing sampling frequency will accomplish little in the way of generating new information. This is
usually not a problem for monitoring programs, but it can be a concern when electronic sensors
are used to collect data nearly continuously.
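One way to gauge how much new information closely spaced observations carry is to compute the lag-1 autocorrelation and a rough effective sample size. The sketch below is an assumption-laden illustration rather than a method taken from this guidance: it uses the common first-order autoregressive approximation n_eff = n(1 - r1)/(1 + r1), and the function names are hypothetical.

```python
# Rough check of information content in an autocorrelated series.
# Assumption: the AR(1) approximation n_eff = n * (1 - r1) / (1 + r1).

def lag1_autocorrelation(values):
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    numer = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    return numer / denom


def effective_sample_size(n, r1):
    """Approximate number of independent observations among n correlated ones."""
    return n * (1 - r1) / (1 + r1)


series = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up, steadily rising values
r1 = lag1_autocorrelation(series)
print(r1, effective_sample_size(len(series), r1))
```

When r1 is close to 1, as is typical of sensor data logged every few minutes, the effective sample size is a small fraction of the nominal count.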
Consideration of the basic sampling frequency - n samples per year - does not address the more complex
issue of timing. The choice of when to collect concentration samples is critical. Most NPS water quality
data have a strong seasonal component as well as a strong association with other variable factors such as
precipitation, streamflow, or watershed management activities such as tillage or fertilizer application.
Selecting when to collect samples for concentration determination is essentially equivalent to selecting
when the unit loads that go into an annual load estimate are determined. That choice must consider the
fundamental characteristics of the system being monitored. In northern climates, spring snowmelt is often
the dominant export event of the year; sampling during that period may need to be more intensive than
during midsummer in order to capture the most important peak flows and concentrations. In southern
regions, intensive summer storms often generate the majority of annual pollutant load; intensive summer
monitoring may be required to obtain good load estimates. For many agricultural pesticides, sampling
may need to be focused on the brief period immediately after application when most losses tend to occur.
Issues of random sampling, stratified random sampling, and other sampling regimes should be
considered. Simple random sampling may be inappropriate for accurate load estimation if, as is likely, the
resulting schedule is biased toward low flow conditions. Stratified random sampling - division of the
sampling effort or the sample set into two or more parts which are different from each other but relatively
homogeneous within - could be a better strategy. In cases where there is a conflict between the number of
observations a program can afford and the number needed to obtain an accurate and reliable load
estimate, it may be possible to use flow as the basis for selecting the interval between concentration
observations. For example, planning to collect samples every x thousand ft³ of discharge would
automatically emphasize high flux conditions while economizing on sampling during baseflow
conditions. Sampling levels following this strategy could be based on an annual average flow, recognizing
that the number of samples per year would vary.
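A flow-paced schedule of this kind can be sketched as a simple volume accumulator. The code below is a hypothetical illustration: the function name, the hourly time step, the flows in ft³/s, and the 10,000 ft³ trigger volume are all assumptions chosen to keep the arithmetic easy to follow.

```python
# Hypothetical flow-paced trigger: take a sample each time a fixed volume passes.

def flow_paced_sample_steps(flow_cfs, step_seconds, volume_per_sample_ft3):
    """Return the time-step index of every triggered sample."""
    sample_steps = []
    accumulated = 0.0
    for i, q in enumerate(flow_cfs):
        accumulated += q * step_seconds          # volume passed in this step
        while accumulated >= volume_per_sample_ft3:
            sample_steps.append(i)               # may trigger more than once
            accumulated -= volume_per_sample_ft3
    return sample_steps


# Invented hourly flows: baseflow of 1 ft3/s with a two-hour storm at 10 ft3/s.
flows = [1.0, 1.0, 1.0, 1.0, 10.0, 10.0, 1.0, 1.0]
steps = flow_paced_sample_steps(flows, step_seconds=3600,
                                volume_per_sample_ft3=10_000)
print(steps)
```

Seven of the nine triggered samples fall in the two storm hours, which is the intended emphasis on high-flux conditions. The total number of samples depends on how much water passes, so it cannot be fixed in advance.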
7.9.1.4 Planning for Load Estimation
Both discharge and concentration data are needed to calculate pollutant loads, but monitoring programs
designed for load estimation will usually generate more flow than concentration data. This leaves three
basic choices for practical load estimation:
1. Find a way to estimate un-measured concentrations to go with the flows observed at times when
chemical samples were not taken;
2. Throw out most of the flow data and calculate the load using the concentration data and just those
flows observed at the same time the samples were taken; and
3. Do something in between - find some way to use the more detailed knowledge of flow to adjust
the load estimated from matched pairs of concentration and flow.
The second approach is usually unsatisfactory because the frequency of chemical observations is likely to
be inadequate to give a reliable load estimate when simple summation is used. Thus almost all effective
load estimation approaches are variants of approaches 1 or 3.
Unfortunately, the decision to calculate loads is sometimes made after the data are collected, often using
data collected for other purposes. At that point, little can be done to compensate for a data set that
contains too few observations of concentration, discharge, or both, collected using an inappropriate
sampling design. Many programs choose monthly or quarterly sampling with no better rationale than
convenience and tradition. A simulation study for some Great Lakes tributaries revealed that data from a
monthly sampling program, combined with a simple load estimation procedure, gave load estimates
which were biased low by 35 percent or more half of the time (Richards and Holloway 1987).
To avoid such problems, the sampling regime needed for load estimation must be established in the initial
monitoring design, based on quantitative statements of the precision required for the load estimate. The
resources necessary to carry out the sampling program must be known and budgeted for from the beginning.
The following steps are recommended to plan a monitoring effort for load estimation:
• Determine whether the project goals require knowledge of load, or if goals can be met using
concentration data alone. In many cases, especially when trend detection is the goal, concentration
data may be easier to work with and be more accurate than crudely estimated load data.
• If load estimates are required, determine the accuracy and precision needed based on the uses to
which they will be put. This is especially critical when the purpose of monitoring is to look for a
change in load. It is foolish to attempt to document a 25 percent load reduction from a watershed
program with a monitoring design that gives load estimates ±50 percent of the true load (see
Spooner et al. 2011a).
• Decide which approach will be used to calculate the loads based on known or expected attributes of
the data.
• Use the precision goals to calculate the sampling requirements for the monitoring program.
Sampling requirements include both the total number of samples and, possibly, the distribution of
the samples with respect to some auxiliary variable such as flow or season.
• Calculate the loads based on the samples obtained after the first full year of monitoring, and
compare the precision estimates (of both flow measurement and the sampling program) with the
initial goals of the program. Adjust the sampling program if the estimated precision deviates
substantially from the goals.
It is possible that funding or other limitations may prevent a monitoring program from collecting the data
required for acceptable load estimation. In such a case, the question must be asked: is a biased, highly
uncertain load estimate preferable to no load estimate at all? Sometimes the correct answer will be no.
7.9.2 Approaches to Load Estimation
Several distinct technical approaches to load estimation are discussed below. The reader is encouraged to
consult Richards (1998) for details and examples of these calculations. Do not estimate annual loads
by simply multiplying an annual average concentration by an average discharge; such load
estimates will be biased low for parameters that are positively correlated with flow, such as suspended
sediment and total phosphorus.
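The direction of this bias follows from a standard algebraic identity. In the sketch below, the notation is introduced here for illustration and is not from the source: c_i and q_i are paired concentration and flow values over n equal time steps, and bars denote simple averages.

$$\frac{1}{n}\sum_{i=1}^{n} c_i\, q_i \;=\; \bar{c}\,\bar{q} \;+\; \frac{1}{n}\sum_{i=1}^{n} (c_i - \bar{c})(q_i - \bar{q})$$

The left side is the mean flux, and the last term is the covariance between concentration and flow. When concentration tends to rise with flow, the covariance is positive, so the product of the averages falls short of the mean flux by exactly that amount.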
7.9.2.1 Numeric Integration
The simplest approach is numeric integration, where the total load is given by
$$\text{Load} = \sum_{i=1}^{n} c_i\, q_i\, t_i$$
where c_i is the concentration in the ith sample, q_i is the corresponding flow, and t_i is the time interval
represented by the ith sample, calculated as:
$$t_i = \tfrac{1}{2}\,(t_{i+1} - t_{i-1})$$
It is not required that t_i be the same for each sample.
The question becomes how fine to slice the pie - few slices will miss much variability, many slices will
capture variability but at a higher cost and monitoring effort. Numeric integration is only satisfactory if
the sampling frequency is high - often on the order of 100 samples per year or more, and samples must be
distributed so that all major runoff events are captured. Selection of sampling frequency and distribution
over the year is critical - sampling must focus on times when highest fluxes occur, i.e., periods of high
discharge.
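A minimal sketch of numeric integration with unequal sampling intervals follows. It assigns each sample the span between the midpoints to its neighboring samples, which for interior samples equals half the time between the preceding and following samples. The treatment of the first and last samples (extending them to the start and end of the period) is an assumption made for this example, as are the function names and data.

```python
# Numeric integration of load with unequal time intervals (illustrative only).

def interval_weights(sample_times, period_start, period_end):
    """Time represented by each sample, bounded by midpoints between samples.

    Assumption: the first and last samples are extended to the period bounds.
    """
    midpoints = [(a + b) / 2 for a, b in zip(sample_times, sample_times[1:])]
    bounds = [period_start] + midpoints + [period_end]
    return [bounds[i + 1] - bounds[i] for i in range(len(sample_times))]


def integrated_load(concs, flows, weights):
    """Sum of concentration x flow x time interval (units left to the caller)."""
    return sum(c * q * t for c, q, t in zip(concs, flows, weights))


times = [1.0, 2.0, 4.0, 7.0]          # sample days, unevenly spaced (invented)
concs = [2.0, 4.0, 6.0, 8.0]
flows = [10.0, 20.0, 30.0, 40.0]

weights = interval_weights(times, period_start=0.0, period_end=8.0)
print(weights, integrated_load(concs, flows, weights))
```

The weights sum to the length of the period, so every part of the record is attributed to exactly one sample. A unit conversion factor would be applied to the result in practice.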
As noted above, flow-proportional sampling is a special case of mechanical rather than mathematical
integration that assumes that one or more samples can be obtained that cover the entire period of interest,
each representing a known discharge and each with a concentration that is in proportion to the load that
passed the sampling point during the sample's accumulation. If this assumption is met, the load for each
sample is easily calculated as the discharge times the concentration, and the total load for the year is
derived by summation. In principle, this is a very efficient and cost-effective method of obtaining a total
load.
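For flow-proportional composites, the calculation reduces to multiplying each composite's concentration by the discharge it represents and summing. The sketch below assumes discharge volumes in m³ and concentrations in mg/L; the function name and the three weekly composites are invented for illustration.

```python
# Load from flow-proportional composite samples (hypothetical values).

def composite_load_kg(volumes_m3, concs_mg_per_l):
    """Sum volume x concentration over composites and convert to kg.

    1 mg/L equals 1 g/m^3, so volume (m^3) x concentration (mg/L) gives grams.
    """
    if len(volumes_m3) != len(concs_mg_per_l):
        raise ValueError("each composite needs one volume and one concentration")
    grams = sum(v * c for v, c in zip(volumes_m3, concs_mg_per_l))
    return grams / 1000.0


weekly_volumes = [1000.0, 5000.0, 2000.0]   # m^3 represented by each composite
weekly_concs = [2.0, 10.0, 4.0]             # mg/L measured in each composite
print(composite_load_kg(weekly_volumes, weekly_concs))
```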
7.9.2.2 Regression
When, as is often the case with NPS-dominated systems, a strong relationship exists between flow and
concentration, using regression to estimate load from continuous flow and intermittent concentration data
can be highly effective. In this approach, a regression relationship is developed between concentration
and flow based on the days for which concentration data exist. Usually, these data are based on grab
samples for concentration and mean daily flow for the sampling day (see Example 7.9-1). This
relationship may involve simple or multiple regression analysis using covariates like precipitation. In
most applications, both concentration and flow are log-transformed to create a dataset suited for
regression analysis (see section 7.3.2 and Meals and Dressing (2005) for basic information on data
transformations). The regression relationship may be based entirely on the current year's samples, or it
may be based on samples gathered in previous years, or both. This method requires that there be a strong
linear association between flow and concentration that does not change appreciably over the period of
interest. If BMP implementation is expected to affect the relationship between flow and concentration,
such relationships must be tracked carefully - if BMPs change the relationship, the concentration
estimation procedure must be corrected.
Once the regression relationship is established, the regression equation is used to estimate concentrations
for each day on which a sample was not taken, based on the mean daily flow for the day. The total load is
calculated as the sum of the daily loads that are obtained by multiplying the measured or estimated daily
concentration by the total daily discharge.
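The regression approach can be sketched with a simple log-log fit written in plain Python. This is a bare-bones illustration under stated assumptions, not the procedure used by any particular tool: flows are in m³/s, concentrations in mg/L, the function names and data are hypothetical, and the back-transformation from log space is a plain exponential with no correction for retransformation bias, which dedicated load estimation software typically addresses.

```python
import math

# Illustrative log-log regression of concentration on flow, then daily loads.
# Assumed units: flow in m^3/s, concentration in mg/L (equal to g/m^3).


def fit_loglog(flows, concs):
    """Ordinary least squares fit of ln(c) = intercept + slope * ln(q)."""
    x = [math.log(q) for q in flows]
    y = [math.log(c) for c in concs]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def predict_conc(flow, intercept, slope):
    """Back-transform without bias correction (a simplification)."""
    return math.exp(intercept + slope * math.log(flow))


def total_load_kg(daily_flow, measured, intercept, slope):
    """Use measured concentration where available, else the regression value."""
    total = 0.0
    for day, q in enumerate(daily_flow):
        if day in measured:
            c = measured[day]
        else:
            c = predict_conc(q, intercept, slope)
        total += c * q * 86_400 / 1000.0   # g/m^3 * m^3/s * s/day -> kg/day
    return total


# Made-up calibration samples that follow c = 2 * q**0.5 exactly.
intercept, slope = fit_loglog([1.0, 4.0, 9.0, 16.0], [2.0, 4.0, 6.0, 8.0])
load = total_load_kg([1.0, 4.0, 25.0], {1: 5.0}, intercept, slope)
print(slope, math.exp(intercept), load)
```

Real concentration data scatter around the fitted line, so residual plots and the strength of the fit should be checked before the regression is used to fill unsampled days.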
The goal of chemical sampling under this approach is to accurately characterize the relationship between
flow and concentration. The monitoring program should be designed to obtain samples over the entire
range of expected flow rates. If seasonal differences in the flow/concentration relationship are likely, the
entire range of flows should be sampled in each season. In some cases, separate seasonal flow-
concentration regressions may need to be developed and used to estimate seasonal loads. Examples of
such flow-concentration regressions are shown in Figure 7-30 and Example 7.9-1.
This approach is especially applicable to situations where continuous flow data already exist, e.g., from
an ongoing USGS hydrologic station. Grab samples can be collected as needed and then associated with
the appropriate flow observations. Economy is another significant advantage of this approach. After an
initial intensive sampling period to develop the regression, it may be possible to maintain the regression
model with ~20 samples a year for concentration, focusing on high-flow or critical season events.
Software exists to calculate and manage this approach, e.g., Flux32 (Walker 1990, Soballe 2014). Flux32
is an interactive program designed for use in estimating the loadings of nutrients or other water quality
components passing a tributary sampling station over a given period of time. Data requirements include
(a) grab-sample nutrient concentrations, typically measured at a weekly to monthly frequency for a period
of at least 1 year, (b) corresponding flow measurements (instantaneous or daily mean values), and (c) a
complete flow record (mean daily flows) for the period of interest. Using six calculation techniques,
Flux32 maps the flow/concentration relationship developed from the sample record onto the entire flow
record to calculate total mass discharge and associated error statistics. An option to stratify the data into
groups based upon flow, date, and/or season is also included. The USGS program LOADEST is also
available and is widely used to estimate loads together with an estimate of precision using the regression
approach. LOADEST includes an adjusted maximum likelihood estimation method that can be used for
censored data sets and a least absolute deviation method to use when the regression residuals are not
normally distributed. A web-based version of the LOADEST program is available at
https://engineering.purdue.edu/~ldc/LOADEST/. Another USGS load estimation calculation tool -
FLUXMASTER - has been used in the SPARROW (SPAtially Referenced Regressions On Watershed
attributes) watershed modeling technique to compute unbiased detrended estimates of long-term mean
flux, and to provide an estimate of the associated standard error (Schwarz et al. 2006). These models
include seasonal and temporal terms in their formulation that can improve the estimate of load; however,
care is needed to ensure the model form is correct by reviewing the diagnostic plots.
Figure 7-30. Flow-concentration regressions from the Maumee River, Ohio (Richards 1998). Top
panel, regression relationship between log of total suspended solids concentration and log of
flow for the 1991 water year dataset; Bottom panel, plot of same data divided into two groups
based on time of year. Within each season, the regression model is stronger, has lower error, and
provides a more accurate load estimate.
Example 7.9-1. Mill Creek Watershed, PA NNPSMP
In this project, loads per unit area of nutrients and suspended sediment were estimated by combining
the non-storm (i.e., low flow) and storm-flow loads (Galeone et al. 2006). Low-flow and storm-flow
loads were computed using a multiple regression technique that included explanatory variables such
as discharge, season, and time to estimate concentrations (and subsequently loads). Regressions
were developed separately for low-flow and storm-flow periods, and for both low flow and storm flow,
separate models were generated for the pre- and post-treatment periods for each site. Models were
selected on the basis of the highest adjusted R² and residuals plots to detect trends, and all F-values
had to exceed the value for the F distribution for the appropriate degrees of freedom and an alpha
equal to 0.05.
Continuous discharge data for the four sites were first separated into low-flow and storm-flow periods
using site-specific criteria defining a storm event. Sampled storms were reviewed to determine the
typical rate of stage-height increase that initiated storm sampling. The recession and subsequent
completion of storm sampling was also reviewed to determine the typical endpoint of storm sampling
at each of the four sites. This information was used with 5- or 15-minute stage data to manually
separate storm-flow discharge data from low-flow data.
For low-flow periods, a subset of the grab-sample data was used to develop the relation between
constituent concentrations and explanatory variables. Prior to using the grab-sample data, the
cumulative frequency distribution for each site was determined using the continuous discharge data
for the entire period of record. Grab samples collected at flows above the 97th percentile were deleted
prior to load analysis. With these higher flows deleted, the relation between constituent concentrations
and explanatory variables was developed. The low-flow constituent concentrations were estimated on
a daily basis using the daily-mean discharge data for low-flow periods. The estimated concentrations
were multiplied by the daily-mean discharge to estimate daily loads.
Storm-flow loads for nutrients and suspended sediment were estimated by use of the mean discharge
and mean constituent concentration for sampled storms. The mean discharge-concentration relation
developed for sampled storms using regression analysis was used to predict the concentrations for
unsampled storms. The mean discharge was calculated for unsampled storms using the 5- or
15-minute continuous-stage data for the sites. This mean discharge was applied to the predicted
concentration to estimate constituent loads for unsampled storms. Increases in stage caused by
snowmelt events were analyzed separately by subsetting the storm events sampled during snowmelts
and using these regression relations to estimate loads for non-sampled snowmelt events. The
percentage of the storms sampled at each site was somewhat dependent on the location of the
surface-water site, ranging from about 50-60 percent at outlet sites and 35-45 percent at upstream
sites where flashiness was greater and defined storms more frequent.
Constituent loads for each continuous surface-water site were estimated by summing the low-flow and
storm-flow loads. The annual load data for the constituents were divided by the basin drainage areas
to determine constituent yields. The percentage of the total yield in storm-flow was determined by
summing the sampled and unsampled storm yields and dividing by the total yield. The remaining yield
was attributed to low-flow periods. Data also were separated into pre- and post-treatment periods.
There are a few potential disadvantages to this approach. First, as mentioned earlier, potential changes or
trends in the concentration-flow relationship - sometimes a goal of watershed projects - must be tracked.
If the relationship changes, a new regression model must be constructed. Second, the monitoring program
must be managed to effectively capture the entire range of flows/conditions that occur; the use of data
from fixed-interval time-based sampling is not appropriate for this purpose because of bias toward low
flow conditions.
Hirsch et al. (2010) propose a weighted regression on time, discharge, and season (WRTDS) method that
addresses some of these shortcomings. Principally, the WRTDS method relies on the same functional
regression structure as LOADEST; however, the fitted coefficients are allowed to vary with time. For
example, the amplitude of the seasonal cycle could be relatively large in some periods of the record and
then dampen to smaller cycles in other portions of the record. This is achieved by using a weighted
regression that "windows in" on a portion of the record in time, flow, and season. It is noteworthy that the
researchers developed this method primarily for data sets with more than 200 samples
collected over 20 years. Like other flow adjustment tools, WRTDS requires flow stationarity; that
is, there should be no basis for expecting a change in flow over time (such as from a new reservoir),
whether that change is observed over the entire year or just during a portion of the year. Extended dry or
wet periods are simply an expected part of the long-term record. WRTDS is generally intended for gradual
changes that might be expected with NPS projects or sites that represent the cumulative effect of multiple point
sources, and less for abrupt changes. WRTDS has been built into Exploration and Graphics for RivEr
Trends (EGRET): An R-package for the analysis of long-term changes in water quality and streamflow.
User guidance is available at https://github.com/USGS-R/EGRET/wiki although more current releases are
available through R (R Core Team 2013). The WRTDS method was applied to eight monitoring sites on
the Mississippi River investigating nitrate (Sprague et al. 2011) and compared to the more traditionally
recommended ESTIMATOR by Moyer et al. (2012) in an evaluation using data from the Chesapeake
Bay.
7.9.2.3 Ratio Estimators
The concept of ratio estimators is a powerful statistical tool for estimating pollutant load from continuous
flow data and intermittent concentration data. Ratio estimators assume that there is a positive linear
relationship between load and flow that passes through the origin. On days when chemistry samples are
taken, the daily load is calculated as the product of grab-sampled concentration and mean daily flow, and
the mean of these loads over the year is also calculated. The mean daily load is then adjusted by
multiplying it by a flow ratio, which is derived by dividing the average flow for the year as a whole by the
average flow for the days on which chemical samples were taken. A bias correction factor is included in
the calculation, to compensate for the effects of correlation between discharge and load. The adjusted
mean daily load is multiplied by 365 to obtain the annual load.
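The calculation described above can be sketched for one widely published ratio estimator, the Beale ratio estimator. The form coded below, including the finite-population term (1/n - 1/N), is one commonly cited version and is offered as an assumption rather than as the exact formula in Richards (1998), which should be consulted for the authoritative version. Function and variable names are hypothetical.

```python
# Sketch of a Beale-type ratio estimator for annual load (unstratified).
# daily = mean_flow_all * (l_bar / q_bar)
#         * (1 + theta * S_lq / (l_bar * q_bar)) / (1 + theta * S_qq / q_bar**2)
# with theta = 1/n - 1/N; S_lq and S_qq are the sample covariance and variance.


def beale_annual_load(sample_loads, sample_flows, mean_flow_all_days, n_days=365):
    n = len(sample_loads)
    if n < 2 or n != len(sample_flows):
        raise ValueError("need at least two paired load and flow observations")
    l_bar = sum(sample_loads) / n
    q_bar = sum(sample_flows) / n
    s_lq = sum((l - l_bar) * (q - q_bar)
               for l, q in zip(sample_loads, sample_flows)) / (n - 1)
    s_qq = sum((q - q_bar) ** 2 for q in sample_flows) / (n - 1)
    theta = 1.0 / n - 1.0 / n_days
    correction = ((1 + theta * s_lq / (l_bar * q_bar))
                  / (1 + theta * s_qq / q_bar ** 2))
    adjusted_daily_load = mean_flow_all_days * (l_bar / q_bar) * correction
    return adjusted_daily_load * n_days


# Made-up sampled days where daily load is exactly twice the daily flow.
flows = [1.0, 2.0, 3.0, 4.0]
loads = [2.0, 4.0, 6.0, 8.0]
print(beale_annual_load(loads, flows, mean_flow_all_days=2.0))
```

In this contrived case load is strictly proportional to flow, so the bias correction equals 1 and the estimate reduces to the flow-ratio adjustment alone. With real data the correction departs from 1 according to how load and flow co-vary on sampled days.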
When used in a stratified mode (e.g., for distinct seasons), the same process is applied within each
stratum, and the stratum load is calculated by multiplying the mean daily load for the stratum by the
number of days in the stratum. The stratum loads are then summed to obtain the total annual load. The
Beale Ratio Estimator is one technique, with an example provided by Richards (1998). Several formulas
are available to calculate the number of samples (random or within strata) required to obtain a load
estimate of acceptable accuracy based on known variance of the system. Stratification may improve the
precision and accuracy of the load estimate by allocating more of the sampling effort to the aspects which
are of greatest interest or which are most difficult to characterize because of great variability such as high
flow seasons.
7.9.2.4 Comparison of Load Estimation Approaches
Although strongly driven by available resources, the monitoring program design (that should have
included consideration of load estimation issues from the beginning), and the natural system itself, the
choice of load estimation approach can make an enormous difference in the resulting load estimate.
In an analysis of total suspended solids data from the Maumee River in water year 1991, Richards (1998)
demonstrated that different methods of load estimation applied to different datasets can result in
substantially different estimates of pollutant load. Richards (1998) found that loads were often
underestimated with the Beale Ratio Estimator and regression techniques, attributing this finding to
missed high flow/TSS events and/or the estimation methods being biased toward low flow conditions.
Notably, the Beale Ratio Estimator gave a load estimate closer to the true load (estimated through
numeric integration) than did the regression method. For the full daily dataset, the single flow-concentration
regression over the entire year appeared to seriously underestimate suspended solids load,
while separating the data into summer and winter seasons improved the fit and the accuracy of the load
estimate. In a summary of findings, Harmel et al. (2006) reported that the USGS regression method could
result in annual constituent loads to within 10 percent of true loads in larger watersheds but no less than
30 percent for smaller watersheds.
Harmel and King (2005) and Harmel et al. (2006) concluded that flow-proportional, composite sampling
was the most effective method to obtain high quality data for estimating loads from small agricultural
watersheds. They concluded that composite sampling extended the sampler capacity with little effect on
error, noting that intensive sampling strategies could achieve errors less than 10 percent. They also
recommended smaller sampling intervals for constituents such as sediment, which vary more during the
course of a rainfall event than other constituents do.
Dolan et al. (1981) evaluated total phosphorus loadings to Lake Michigan from Grand River in 1976-77'.
They found that the Beale ratio estimator performed better than regression or other simplified
calculations. Quilbe et al. (2006) evaluated a 1989-1995 nutrient and sediment data set from the
Beaurivage River (Quebec, Canada). They chose to estimate loadings with a Beale ratio estimator because
they found that the correlation between flow and various water quality parameters was too weak to
develop regression equations while noting that regression techniques would have been preferred if good
correlations were found. Marsh and Waters (2009) also found few cases with strong correlations in their
evaluation of 31 storm events in Queensland. They concluded that there was no clear best technique, but
noted that the ratio methods were more robust and regression techniques worked well when there was a
"tight" correlation. Using hourly model output, Zamyadi et al. (2007) found that the Beale ratio did not
perform well in comparison to averaging and interpolation procedures.
Taking the above literature into account, this guidance recommends that numeric integration be used
when the full time series of water quality and flow data are available as in the case of flow-proportional
composited samples. Regression approaches are appropriate for incomplete water quality records if good
correlations between water quality and flow exist, with the Beale ratio recommended otherwise. It is
important to take into account stratification by flow regime, season, and other covariates for both
regression and the Beale ratio.
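As an illustration of the Beale ratio recommendation above, the following Python sketch computes a mean daily load from an incomplete concentration record and a complete flow record. It assumes the commonly cited bias-corrected form of the estimator with a finite-population term (1/n - 1/N). The function and argument names are hypothetical. Stratification by flow regime or season, as recommended above, would be handled by calling the function once per stratum and summing the stratum loads.

```python
from statistics import mean


def beale_ratio_load(sample_flows, sample_loads, mean_flow_all, n_days_total):
    """Estimate mean daily load with a bias-corrected (Beale) ratio estimator.

    sample_flows  -- mean daily flow on the days that were sampled
    sample_loads  -- daily load on those same days (flow x concentration x unit factor)
    mean_flow_all -- mean daily flow over every day in the period
    n_days_total  -- number of days in the period (N)
    All names are illustrative assumptions, not taken from the guidance.
    """
    n = len(sample_flows)
    if n < 2 or n != len(sample_loads):
        raise ValueError("need at least two paired flow/load observations")

    q_bar = mean(sample_flows)
    l_bar = mean(sample_loads)

    # Sample covariance of load and flow, and sample variance of flow.
    s_lq = sum((l - l_bar) * (q - q_bar)
               for l, q in zip(sample_loads, sample_flows)) / (n - 1)
    s_qq = sum((q - q_bar) ** 2 for q in sample_flows) / (n - 1)

    # Finite-population term; approaches 1/n when N is much larger than n.
    fpc = 1.0 / n - 1.0 / n_days_total

    correction = (1.0 + fpc * s_lq / (l_bar * q_bar)) / (1.0 + fpc * s_qq / q_bar ** 2)
    return mean_flow_all * (l_bar / q_bar) * correction
```

Multiplying the returned mean daily load by the number of days in the period gives the total load. When concentration is constant across the sampled days, the correction factor equals 1 and the estimate reduces to mean flow times that concentration, which is a convenient sanity check.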
7.9.3 Load Duration Curves
A particularly useful diagnostic tool for load estimation data is the load duration curve. Simply stated, a
duration curve is a graph representing the percentage of time during which the value of a given parameter
(e.g., flow, concentration, or load) is equaled or exceeded. A duration curve is therefore a cumulative
frequency plot of mean daily flows, concentrations, or daily loads over a period of record, with values
plotted from their highest value to lowest without regard to chronological order. For each flow,
concentration, or load value, the curve displays the corresponding percent of time (0 to 100) that the value
was met or exceeded over the specified time - the flow, concentration, or load duration interval.
Extremely high values are rarely exceeded and have low duration interval values; very low values are
often exceeded and have high duration interval values.
The process of using load duration curves generally begins with the development of a flow duration
curve, using existing historical flow data (e.g., from a USGS gage), typically using mean daily discharge
values. A basic flow duration curve runs from high to low along the x-axis, as illustrated in Figure 7-31.
The x-axis represents the duration or percent of time, as in a cumulative frequency distribution. The
y-axis represents the flow value (e.g., ft³/sec (cfs)) associated with that percent of time. Figure 7-31
illustrates that the highest observed flow for the period of record was about 5,400 cfs, while the lowest
flow was 6 cfs. The median flow - the flow exceeded 50 percent of the time - was about 200 cfs.
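The construction of a flow duration curve described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure. It assumes the Weibull plotting position, rank / (n + 1), for the percent of time a flow is equaled or exceeded. Other plotting positions are also in use, and the function name is hypothetical.

```python
def flow_duration_curve(daily_flows):
    """Return (exceedance_percent, flow) pairs ordered from highest to lowest flow.

    Chronological order is ignored, as in any duration curve.
    Assumption: Weibull plotting position, 100 * rank / (n + 1).
    """
    ranked = sorted(daily_flows, reverse=True)
    n = len(ranked)
    return [(100.0 * rank / (n + 1), flow)
            for rank, flow in enumerate(ranked, start=1)]


# Illustrative values only (cfs); a real curve would use the full daily record.
curve = flow_duration_curve([200.0, 6.0, 5400.0, 50.0, 900.0])
for percent, flow in curve:
    print(f"{percent:5.1f}%  {flow:8.1f} cfs")
```

With five values the middle-ranked flow falls at the 50 percent duration interval, matching the definition of the median flow given above.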
In the next step, a load duration curve is created from the flow duration curve by multiplying each of the
flow values by the applicable numeric water quality target (usually a water quality criterion) and a unit
conversion factor, then plotting the results as for the flow duration curve. The x-axis remains as the flow
duration interval, and the y-axis depicts the load rather than the flow. This curve represents the allowable
load (e.g., the TMDL) at each flow condition over the full range of observed flow. An example is shown
in Figure 7-32 for the same site as shown in the flow duration curve, using a target of 0.05 mg/L total P.
Then, observed P loads associated with the flow intervals are plotted along the same axes.
Points located above the curve represent times when the actual loading is exceeding the target load, while
those plotting below the curve represent times when the actual loading is less than the target load.
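The allowable-load calculation described above (flow times target concentration times a unit conversion factor) can be illustrated as follows. The sketch assumes flow in cfs, a target in mg/L, and a load in kg/day, for which the conversion factor is about 2.4466. The function names are hypothetical, and the 0.05 mg/L target and 200 cfs flow are taken from the Sevier River example purely for illustration.

```python
# 1 ft3 = 28.316846592 L; 86,400 s/day; 1,000,000 mg/kg.
CFS_MGL_TO_KG_PER_DAY = 28.316846592 * 86400 / 1e6  # about 2.4466


def allowable_load_kg_per_day(flow_cfs, target_mg_per_l):
    """Allowable load (kg/day) at a given flow for a concentration target."""
    return flow_cfs * target_mg_per_l * CFS_MGL_TO_KG_PER_DAY


def exceeds_target(observed_load_kg_per_day, flow_cfs, target_mg_per_l):
    """True when an observed load plots above the allowable-load curve."""
    return observed_load_kg_per_day > allowable_load_kg_per_day(flow_cfs, target_mg_per_l)


# Allowable total P load at the approximate median flow of 200 cfs.
print(round(allowable_load_kg_per_day(200.0, 0.05), 2))
```

Applying `allowable_load_kg_per_day` to every flow on the flow duration curve yields the load duration curve. Observed loads are then compared against the allowable load at the same flow duration interval.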
A key feature of load duration curve analysis is that the pattern of loads - and impairment - can be easily
visualized over the full range of flow conditions. Because flow variations usually correspond to seasonal
patterns, this feature can address the requirement that TMDLs account for seasonal variations. The pattern
of observed loads exceeding target loads can be examined to see if impairments occur only at high flows,
only during low flows, or across the entire range of flow conditions. A common way to look at a load
duration curve is by dividing it into zones representing, for example: high flows (0-10 percent flow
duration interval), moist conditions (10-40 percent), mid-range flows (40-60 percent), dry conditions
(60-90 percent), and low flows (90-100 percent). Data may also be grouped by season (e.g., spring runoff
versus summer base flow). Sometimes, analysis of the load duration curve can provide insight on the
source of pollutant loads. Measured loads that plot above the curve during flow duration intervals above
80 percent (low flow conditions), for example, may suggest point sources that discharge continuously
during dry weather. Conversely, measured loads that plot above the curve during flow duration intervals
of about 10 to 70 percent tend to reflect wet weather contributions by NPS such as erosion, washoff, and
streambank erosion. Figure 7-32 illustrates that allowable total P loads in the Sevier River were exceeded
during all flow intervals, and that P concentrations were independent of flow.
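The zone-based reading of a load duration curve described above can be sketched as a simple classification followed by a count of exceedances in each zone. The zone limits are those listed in the text. Assigning a value that falls exactly on a limit to the wetter zone is an assumption made here for illustration, and the function names are hypothetical.

```python
from collections import Counter

# Upper limit of each zone, as a flow duration interval in percent.
ZONES = [
    (10.0, "high flows"),
    (40.0, "moist conditions"),
    (60.0, "mid-range flows"),
    (90.0, "dry conditions"),
    (100.0, "low flows"),
]


def flow_zone(duration_percent):
    """Name the flow zone for a flow duration interval (0 to 100 percent)."""
    if not 0.0 <= duration_percent <= 100.0:
        raise ValueError("duration interval must be between 0 and 100 percent")
    for upper, name in ZONES:
        # Assumption: a value exactly on a limit is assigned to the wetter zone.
        if duration_percent <= upper:
            return name


def exceedances_by_zone(points):
    """Count observed loads above the allowable load in each zone.

    points -- iterable of (duration_percent, observed_load, allowable_load)
    """
    counts = Counter()
    for duration_percent, observed, allowable in points:
        if observed > allowable:
            counts[flow_zone(duration_percent)] += 1
    return dict(counts)
```

Exceedances concentrated in the low-flow zone would be consistent with continuously discharging sources, while exceedances in the wetter zones would point toward runoff-driven contributions, as discussed above.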
It should be noted that an individual load duration curve applies only to the point in the stream where the
data were collected. A load duration curve developed at a watershed outlet station (e.g., for a TMDL)
applies only to loads observed at that point. If significant pollution sources exist upstream, a single load
duration analysis at the watershed outlet can underestimate the extent of impairment in upstream
segments. For this reason, it is usually wise to develop multiple load duration curves throughout the
watershed to address the spatial distribution of impairments. Such an exercise can also be useful in
targeting land treatment to critical watershed source areas.
[Figure 7-31 plot not reproduced. x-axis: Flow Duration Interval (0% to 100%).]
Figure 7-31. Flow duration curve for the Sevier River near Gunnison, UT, covering the period
January 1977 through September 2002
[Figure 7-32 plot not reproduced. x-axis: Observed Flow Duration Interval at USGS Gage 10217000. Series: Allowable TP Load at USGS Gage 10217000 (kg/day); Observed TP Load at Station 494247 (kg/day).]
Figure 7-32. Load duration curve for the Sevier River near Gunnison, UT, January 1977 through
September 2002. Blue line represents allowable total P load calculated as the product of each
observed flow duration interval and the target total P concentration of 0.05 mg/L. Yellow points
represent observed total P loads at the same flow duration intervals.
For more detailed discussion of load duration curves, particularly their application to the TMDL process,
refer to:
• USEPA. 2007. An Approach for Using Load Duration Curves in the Development of TMDLs
7.9.4 Assessing Load Reductions
The same statistical tools recommended for flow and concentration data in section 7.8.2 and elsewhere in
this chapter can be used to analyze program effectiveness with regard to load reductions. For example,
loads might be estimated on a weekly basis using numeric integration and flow-proportional, composite
sample data. Under a paired-watershed approach, the weekly-paired loads would be grouped as pre- and
post-treatment and analyzed using ANCOVA.
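To illustrate the paired-watershed comparison described above, the following Python sketch computes the point estimate of the treatment effect from an equal-slopes ANCOVA on log-transformed weekly loads. It is a simplified sketch under stated assumptions: natural-log transformation, a common slope for the pre- and post-treatment periods, and no significance test or check of the equal-slopes assumption, both of which a statistical package would normally provide. The function and argument names are hypothetical.

```python
import math


def _centered_sums(x, y):
    """Means of x and y plus centered sums of squares and cross-products."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return x_bar, y_bar, sxx, sxy


def ancova_adjusted_change(control_pre, treated_pre, control_post, treated_post):
    """Equal-slopes ANCOVA estimate of the change in treated-watershed load.

    Each argument is a list of paired weekly loads (all values > 0).
    Returns (common_slope, adjusted_effect_on_log_scale, percent_change).
    """
    x_pre = [math.log(v) for v in control_pre]
    y_pre = [math.log(v) for v in treated_pre]
    x_post = [math.log(v) for v in control_post]
    y_post = [math.log(v) for v in treated_post]

    xb_pre, yb_pre, sxx_pre, sxy_pre = _centered_sums(x_pre, y_pre)
    xb_post, yb_post, sxx_post, sxy_post = _centered_sums(x_post, y_post)

    # Pooled within-period slope of treated load on control load.
    slope = (sxy_pre + sxy_post) / (sxx_pre + sxx_post)

    # Difference in period means, adjusted for the shift in the control watershed.
    effect = (yb_post - yb_pre) - slope * (xb_post - xb_pre)

    # Back-transformed change in the (geometric mean) treated load.
    percent_change = 100.0 * (math.exp(effect) - 1.0)
    return slope, effect, percent_change
```

A negative percent change indicates that treated-watershed loads after BMP implementation are lower than would be expected from the control watershed and the pre-treatment relationship.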
For comparisons of annual loadings, the analyst will have limited data to perform analyses (i.e., one
annual loading value per site-year) and will be generally limited to reporting simple change in loading and
drawing anecdotal comparisons to the control watershed. Normalizing the loadings based on watershed
size, annual rainfall, and other covariates might prove helpful.
Depending on the watershed and the types of installed BMPs, it is also appropriate to compare storm
loadings from individual storms before and after BMP implementation in a single watershed. The particular
challenge here is to control for other covariates and select/analyze storms of a certain size (e.g., rainfall
between 2.5-5.0 cm) and occurring at key times during the year (e.g., within 6 weeks of spring planting).
This type of analysis might also be limited to drawing simple comparisons due to sample size.
7.10 Statistical Software
Modern computers and software packages make it simple to perform the statistical analyses described in
this chapter. Most standard spreadsheet programs include basic statistical functions and graphing
capabilities, but more sophisticated and powerful statistical software packages might be needed for
advanced analyses such as ANCOVA or cluster analysis. An extensive list and comparison of statistical
software packages is available at Wikipedia. Practical Statistics, a web site maintained by Dennis Helsel,
provides a more environmental-centric review of low-cost software tools. Table 7-9 lists some examples
and websites to visit for more information about the many statistical packages available.
Table 7-9. Sampling of available statistics software packages
Package Name                        Web Site URL
Analyse-it (add-in for MS Excel)    http://www.analyse-it.com
DataDesk                            http://www.datadesk.com
JMP                                 http://www.jmp.com/en_gb/software.html
Mathematica                         http://www.wolfram.com/mathematica/
MATLAB                              http://www.mathworks.com/products/matlab/
MINITAB                             https://www.minitab.com/en-us/
R                                   https://www.r-project.org/
SAS/Stat, SAS/Insight               http://www.sas.com/technologies/analytics/statistics/index.html
SPSS                                http://www.spss.com/spss/
SYSTAT                              http://www.systat.com/products/Systat/
WINKS                               http://www.texasoft.com/
7.11 References
Belsley, D.A., E. Kuh, and R.E. Welsch. 1980. Regression Diagnostics: Identifying Influential Data and
Sources of Collinearity. John Wiley and Sons, New York.
Bernstein, B.B. and J. Zalinski. 1983. An optimum sampling design and power tests for environmental
biologists. Journal of Environmental Management 16(1): 35-43.
Bishop, P.L., W.D. Hively, J.R. Stedinger, M.R. Rafferty, J.L. Lojpersberger, and J.A. Bloomfield. 2005.
Multivariate analysis of paired watershed data to evaluate agricultural best management practice
effects on stream water phosphorus. Journal of Environmental Quality 34:1087-1101.
Box, G.E.P. and D.R. Cox. 1964. An analysis of transformations - series B (methodological). Journal of
the Royal Statistical Society 26(2):211-252.
Box, G.E.P. and G.M. Jenkins. 1976. Time Series Analysis: Forecasting And Control. Revised Edition.
Holden-Day, Oakland, CA.
Carpenter, S.R., T.M. Frost, D. Heisey, and T.K. Kratz. 1989. Randomized intervention analysis and the
interpretation of whole-ecosystem experiments. Ecology 70(4): 1142-1152.
Chambers, J.M., W.S. Cleveland, B. Kleiner, P.A. Tukey. 1983. Graphical Methods for Data Analysis.
Duxbury Press, Boston.
Clausen, J.C. 2007. Jordan Cove Watershed Project Final Report. University of Connecticut, College of
Agriculture and Natural Resources, Department of Natural Resources Management and
Engineering. Accessed January 8, 2016.
http://jordancove.uconn.edu/jordan_cove/publications/final_report.pdf
Clausen, J.C. and K.N. Brooks. 1983. Quality of runoff from Minnesota peatlands: II. a method for
assessing mining impacts. Water Resources Bulletin 19(5):769-772.
Clausen, J.C. and J. Spooner. 1993. Paired Watershed Study Design. 841-F-93-009. Prepared for S.
Dressing, U.S. Environmental Protection Agency, Office of Water, Washington, DC. Accessed
February 12, 2016.
Cleveland, W.S. 1993. Visualizing Data. AT&T Bell Laboratories/Hobart Press, Murray Hill, NJ/Summit,
NJ.
Clifford, R., Jr., J.W. Wilkinson, and N.L. Clesceri. 1986. Statistical Assessment of a Limnological Data
Set. In Statistical Aspects of Water Quality Monitoring, Proceedings of the Workshop Held at the
Canada Centre for Inland Waters, October 7-10, 1985, Volume 27 of Developments in Water
Science series, ed. A.H. El-Shaarawi and R.E. Kwiatkowski. Elsevier Publishers, New York. pp.
363-380.
Dolan, D.M., A.K. Yui and R.D. Geist. 1981. Evaluation of river load estimation methods for total
phosphorus. Journal of Great Lakes Research 7(3):207-214.
Dolan, D.M. and K.P. McGunagle. 2005. Lake Erie total phosphorus loading analysis and update: 1996-2002.
Journal of Great Lakes Research 31(Supplement 2):11-22. Accessed March 24, 2016.
http://www.cee.mtu.edu/~nurban/classes/ce5508/2007/Readings/dolan05.pdf
Drake, D. 1999. Multivariate Analysis of Fish and Environmental Factors in the Grande Ronde Basin of
Northeastern Oregon. Oregon Department of Environmental Quality, Biomonitoring Section,
Laboratory Division, Portland. Accessed March 24, 2016.
http://www.deq.state.or.us/lab/techrpts/docs/Bio012.pdf.
Elliott, A.C. 2012. Descriptive Statistics Using Microsoft Excel. Texa Soft Mission Technologies, Cedar
Hill, TX. Accessed March 24, 2016.
http://www.stattutorials.com/EXCEL/EXCEL-DESCRIPTIVE-STATISTICS.html.
Erickson, A.J., P.T. Weiss, J.S. Gulliver, R.M. Hozalski. 2010a. Analysis of Individual Storm Events. In
Stormwater Treatment: Assessment and Maintenance ed. J.S. Gulliver and A.J. Erickson,
University of Minnesota, St. Anthony Falls Laboratory. Minneapolis, MN. Accessed March 24,
2016. http://stormwaterbook.safl.umn.edu/
Erickson, A.J., P.T. Weiss, J.S. Gulliver, and R.M. Hozalski. 2010b. Analysis of Long-Term
Performance. In Stormwater Treatment: Assessment and Maintenance, ed. J.S. Gulliver, A.J.
Erickson, and P.T. Weiss. University of Minnesota, St. Anthony Falls Laboratory. Minneapolis,
MN. Accessed March 24, 2016.
http://stormwaterbook.safl.umn.edu/content/analysis-long-term-performance.
Farnsworth, R.K. and E.S. Thompson. 1982. Mean Monthly, Seasonal, and Annual Pan Evaporation for
the United States. Technical Report NWS 34. National Oceanic and Atmospheric Administration,
National Weather Service. Accessed March 24, 2016.
http://www.nws.noaa.gov/oh/hdsc/PMP related studies/TR34.pdf
Fuller, W.A. 1976. Introduction to Statistical Time Series. John Wiley & Sons, Inc. New York.
Galeone, D.G., R.A. Brightbill, D.J. Low, and D.L. O'Brien. 2006. Effects of Streambank Fencing of
Pastureland on Benthic Macroinvertebrates and the Quality of Surface Water and Shallow
Ground Water in the Big Spring Run Basin of Mill Creek Watershed, Lancaster County,
Pennsylvania, 1993-2001. Scientific Investigations Report 2006-5141. U. S. Geological Survey,
Reston, VA. Accessed March 24, 2016. http://pubs.usgs.gov/sir/2006/5141/.
Geosyntec and WWE (Wright Water Engineers, Inc.). 2009. Urban Stormwater BMP Performance
Monitoring. Prepared for U.S. Environmental Protection Agency, Water Environment Research
Foundation, Federal Highway Administration, and Environmental and Water Resources Institute
of the American Society of Civil Engineers, by Geosyntec Consultants and Wright Water
Engineers, Inc., Washington, DC. Accessed March 24, 2016.
http://www.bmpdatabase.org/Docs/2009%20Stormwater%20BMP%20Monitoring%20Manual.pdf
Gilliom, R.J., R.M. Hirsch, and E.J. Gilroy. 1984. Effect of censoring trace-level water-quality data on
trend-detection. Environmental Science and Technology 18(7):530-535.
Grabow, G.L. 1999. Summary of Analyses Performed for Sycamore Creek Section 319 NNMP Project.
North Carolina State University, NCSU Water Quality Group, Raleigh, NC. Accessed April 29,
2016. https://www.epa.gov/polluted-runoff-nonpoint-source-pollution/monitoring-and-
evaluating-nonpoint-source-watershed.
Grabow, G.L., J. Spooner, L.A. Lombardo, and D.E. Line. 1998. Detecting water quality changes before
and after BMP implementation: use of a spreadsheet for statistical analysis. NWQEP Notes 92:1-9.
North Carolina State University Cooperative Extension, Raleigh. Accessed March 24, 2016.
http://www.bae.ncsu.edu/programs/extension/wqg/issues/92.pdf.
Grabow, G.L., J. Spooner, L.A. Lombardo, and D.E. Line. 1999. Detecting water quality changes before
and after BMP implementation: use of SAS for statistical analysis. NWQEP Notes 93:1-11. North
Carolina State University Cooperative Extension, Raleigh. Accessed March 24, 2016.
http://www.bae.ncsu.edu/programs/extension/wqg/issues/93.pdf.
Harmel, R.D. and K.W. King. 2005. Uncertainty in measured sediment and nutrient flux in runoff from
small agricultural watersheds. Transactions of the American Society of Agricultural and
Biological Engineers 48(5): 1713-1721.
Harmel, R.D., K.W. King, B.E. Haggard, D.G. Wren, and J.M. Sheridan. 2006. Practical guidance for
discharge and water quality collection in small watersheds. Transactions of the American Society
of Agricultural and Biological Engineers 49(4):937-948.
Harstine, L.J. 1991. Hydrologic Atlas for Ohio: Average Annual Precipitation, Temperature, Streamflow,
and Water Loss for a 50-Year Period, 1931-1980. Water Inventory Report No. 28. Ohio
Department of Natural Resources, Division of Water, Ground Water Resources Section.
Helsel, D.R. 2012. Statistics for Censored Environmental Data Using Minitab and R. 2nd ed. Wiley and
Sons, New York.
Helsel, D.R. and T.A. Cohn. 1988. Estimation of descriptive statistics for multiply censored water quality
data. Water Resources Research 24(12): 1997-2004.
Helsel, D.R., and R.M. Hirsch. 2002. Statistical Methods in Water Resources. Book 4, Chapter A3 in
Techniques of Water-Resources Investigations. U.S. Geological Survey, Reston, VA. Accessed
February 10, 2016. http://pubs.usgs.gov/twri/twri4a3/.
Hewlett, J.D. and L. Pienaar. 1973. Design and Analysis of the Catchment Experiment, In Proceedings of
a Symposium on Use of Small Watersheds in Determining Effects of Forest Land Use on Water
Quality, ed. E. H. White, University of Kentucky, Lexington, KY, May 22-23, 1973. Accessed
January 26, 2016. http://coweeta.uga.edu/publications/hewlett 73 catachment.pdf
Hibbert, A.R. 1969. Water yield changes after converting a forested catchment to grass. Water Resources
Research 5(3):634-640.
Hirsch, R.M. 1988. Statistical methods and sampling design for estimating step trends in surface water
quality. Water Resources Research 24:493-503.
Hirsch, R.M. and J.R. Slack. 1984. A nonparametric trend test for seasonal data with serial dependence.
Water Resources Research 20(6):727-732.
Hirsch, R.M., R.B. Alexander, and R.A. Smith. 1991. Selection of methods for the detection and
estimation of trends in water quality. Water Resources Research 27:803-813.
Hirsch, R.M., D.L. Moyer, and S.A. Archfield. 2010. Weighted regressions on time, discharge and season
(WRTDS), with an application to Chesapeake Bay river inputs. Journal of the American Water
Resources Association 46(5):857-880.
Hoorman, J., T. Hone, T. Sudman Jr., T. Dirksen, J. Iles, and K.R. Islam. 2008. Agricultural impacts on
lake and stream water quality in Grand Lake St. Marys, Western Ohio. Water, Air, and Soil
Pollution 193: 309-322.
Hornbeck, J.W., R.S. Pierce, and C.A. Federer. 1970. Streamflow changes after forest clearing in New
England. Water Resources Research 6(4): 1124-1132.
Jambu, M. 1991. Exploratory and Multivariate Data Analysis. Academic Press, Inc., Boston.
Kilpatrick, F.A. and J.F. Wilson, Jr. 1989. Measurement of Time of Travel in Streams by Dye Tracing.
Book 3, Chapter A9 in Techniques of Water-Resources Investigations. U.S. Geological Survey,
Reston, VA. Accessed March 24, 2016. http://pubs.usgs.gov/twri/twri3-a9/pdf/twri_3-A9.pdf.
Lettenmaier, D.P. 1976. Detection of trends in water quality data from records with dependent
observations. Water Resources Research 12:1037-1046.
Lettenmaier, D.P. 1978. Design considerations for ambient stream quality monitoring. Water Resources
Bulletin 14(4):884-902.
Lewis, J. 2006. Fixed and Mixed-Effects Models for Multi-Watershed Experiments. U.S. Department of
Agriculture, Forest Service, Pacific Southwest Research Station, Arcata, CA. Accessed March 24,
2016. http://www.fs.fed.us/psw/publications/4351/Lewis06.pdf
Line, D.E., W.A. Harman, G.D. Jennings, E.J. Thompson, and D.L. Osmond. 2000. Nonpoint-source
pollutant load reductions associated with livestock exclusion. Journal of Environmental Quality
29:1882-1890.
Loftis, J.C. and R.C. Ward. 1980a. Sampling frequency selection for regulatory water quality monitoring.
Water Resources Bulletin 16:501-507.
Loftis, J.C. and R.C. Ward. 1980b. Water quality monitoring - some practical sampling frequency
considerations. Environmental Management 4:521-526.
MacKenzie, M.C., R.N. Palmer, and S.P. Millard. 1987. Analysis of statistical monitoring network
design. Journal of Water Resources Planning and Management 113(5):599-615.
Marsh, N. and D. Waters. 2009. Comparison of Load Estimation Methods and Their Associated Error. In
18th World IMACS Congress and MODSIM09 International Congress on Modelling and
Simulation, ed. R.S. Anderssen, R.D. Braddock and L.T.H. Newham, Modelling and Simulation
Society of Australia and New Zealand and International Association for Mathematics and
Computers in Simulation, July 2009, pp. 3322-3328. Accessed March 24, 2016.
http://mssanz.org.au/modsim09/I4/marsh_I4.pdf.
Martz, E. 2013. Enough Is Enough! Handling Multicollinearity in Regression Analysis. The Minitab
Blog, Minitab Inc. Accessed March 24, 2016.
http://blog.minitab.com/blog/understanding-statistics/handling-multicollinearity-in-regression-analysis.
Matalas, N.C. and W.B. Langbein. 1962. Information content of the mean. Journal of Geophysical
Research 67(9):3441-3448.
Meals, D.W. 1987. Detecting changes in water quality in the LaPlatte River watershed following
implementation of BMPs. Lake and Reservoir Management 3(1):185-194.
Meals, D.W. 2001. Lake Champlain Basin Agricultural Watersheds Section 319 National Monitoring
Program Project, Final Project Report: May, 1994-September, 2000. Vermont Department of
Environmental Conservation, Waterbury, VT.
Meals, D.W. and S.A. Dressing. 2005. Monitoring Data - Exploring Your Data, the First Step, Tech
Notes #1, July 2005. Prepared for U.S. Environmental Protection Agency, by Tetra Tech, Inc.,
Fairfax. Accessed March 24, 2016. https://www.epa.gov/polluted-runoff-nonpoint-source-
pollution/nonpoint-source-monitoring-technical-notes.
Meals, D.W. and D.C. Braun. 2006. Demonstration of methods to reduce E.coli runoff from dairy manure
application sites. Journal of Environmental Quality 35:1088-1100.
Meals, D.W. and S.A. Dressing. 2008. Surface Water Flow Measurement for Water Quality Monitoring
Projects, Tech Notes #3, March 2008. Prepared for U.S. Environmental Protection Agency, by
Tetra Tech, Inc., Fairfax, VA. Accessed March 24, 2016. https://www.epa.gov/polluted-runoff-
nonpoint-source-pollution/nonpoint-source-monitoring-technical-notes.
Meals, D.W., J. Spooner, S.A. Dressing, and J.B. Harcum. 2011. Statistical Analysis for Monotonic
Trends, Tech Notes #6, September 2011. Prepared for U.S. Environmental Protection Agency, by
Tetra Tech, Inc., Fairfax, VA. Accessed March 24, 2016. https://www.epa.gov/polluted-runoff-
nonpoint-source-pollution/nonpoint-source-monitoring-technical-notes.
Meals, D.W., R.P. Richards, and S.A. Dressing. 2013. Pollutant Load Estimation for Water Quality
Monitoring Projects. Tech Notes #8. Prepared for U.S. Environmental Protection Agency, by
Tetra Tech, Inc., Fairfax, VA. Accessed March 24, 2016. https://www.epa.gov/polluted-runoff-
nonpoint-source-pollution/nonpoint-source-monitoring-technical-notes.
Minitab. 2016. Minitab 17. Minitab Inc., State College, PA. Accessed January 22, 2016.
http://www.minitab.com/en-US/products/minitab/.
Mosteller, F. and J.W. Tukey. 1977. Data Analysis and Regression: Second Course in Statistics. Addison-
Wesley Pub. Co., Reading, MA.
Moyer, D.L., R.M. Hirsch, and K.E. Hyer. 2012. Comparison of Two Regression-Based Approaches for
Determining Nutrient and Sediment Fluxes and Trends in the Chesapeake Bay Watershed.
Scientific Investigations Report 2012-5244. U.S. Geological Survey, Reston, VA. Accessed
March 24, 2016. http://pubs.usgs.gov/sir/2012/5244/.
NCDENR (North Carolina Department of Environment and Natural Resources). 2016. Sanitary Survey.
North Carolina Department of Environment and Natural Resources, Division of Marine Fisheries,
Raleigh, NC. Accessed March 24, 2016. http://portal.ncdenr.org/web/mf/sanitary-survey.
Newbold, J.D., S. Herbert, B.W. Sweeney, and P. Kiry. 2009. Water quality functions of a 15-year-old
riparian forest buffer system. NWQEP Notes 130:1-9. North Carolina State University
Cooperative Extension, Raleigh. Accessed March 15, 2016.
http://www.bae.ncsu.edu/programs/extension/wqg/issues/notes130.pdf
Newbold, J.D., S. Herbert, and B.W. Sweeney. 2009. Mitigation of Nonpoint Pollution by a Riparian
Forest Buffer in an Agricultural Watershed of the Mid-Atlantic Piedmont: Stroud Preserve
Watersheds National Monitoring Project Final Report. Stroud Water Research Center, Avondale
PA. Accessed March 24, 2016.
http://www.stroudcenter.org/research/projects/StroudPreserve/StroudNMPFinalReport2009.pdf.
ODNR (Ohio Department of Natural Resources). 2013. Grand Lake St. Marys State Park. Ohio
Department of Natural Resources. Accessed March 24, 2016.
http://parks.ohiodnr.gov/grandlakestmarys.
OWEB (Oregon Watershed Enhancement Board). 1999. Oregon Aquatic Habitat Restoration and
Enhancement Guide. Oregon Watershed Enhancement Board, Salem, OR. Accessed April 25,
2016. http://www.oregon.gov/OWEB/docs/pubs/habguide99-complete.pdf
Palmer, R.N. and M.C. MacKenzie. 1985. Optimization of water quality monitoring networks. Journal of
Water Resources Planning and Management 111(4):478-493.
Perkins, W.W., E.B. Welch, J. Frodge, and T. Hubbard. 1997. A zero degree of freedom total phosphorus
model: application to Lake Sammamish, Washington. Lake and Reservoir Management 13:131-
141.
Porter, P.S., R.C. Ward, and H.F. Bell. 1988. The detection limit, water quality monitoring data are
plagued with levels of chemicals that are too low to be measured precisely. Environmental
Science and Technology 22:856-861.
Primrose, N.L. 2003. Report on Nutrient Synoptic Surveys in the Corsica River Watershed, Queen Annes
County, Maryland, April 2003. Maryland Department of Natural Resources, Watershed Services,
Annapolis, MD.
Quilbe, R., A.N. Rousseau, M. Duchemin, A. Poulin, G. Gangbazo, J. Villeneuve. 2006. Selecting a
calculating method to estimate sediment and nutrient loads in streams: application to the
Beaurivage River (Quebec, Canada). Journal of Hydrology 326(2006):295-310.
R Core Team. 2013. The R Project for Statistical Computing. R Foundation for Statistical Computing,
Vienna, Austria. Accessed April 25, 2016. http://www.R-project.org/.
Richards, R.P. 1998. Estimation of Pollutant Loads in Rivers and Streams: A Guidance Document for
NPS Programs. Prepared for U.S. EPA Region VIII, by Heidelberg University, Water Quality
Laboratory, Tiffin, OH. Accessed February 5, 2016.
http://141.139.110.110/sites/default/files/ifuller/images/Load Estl.pdf
Richards, R.P. and J. Holloway. 1987. Monte Carlo studies of sampling strategies for estimating tributary
loads. Water Resources Research 23:1939-1948.
Roseboom, D., T. Hill, J. Rodsater, J. Beardsley, and L. Duong. 1999. Evaluation of Sediment Delivery to
Lake Pittsfield After Best Management Practice Implementation-National Watershed Monitoring
Project. Illinois Environmental Protection Agency, Springfield, IL.
Rosgen, D.L. 1997. A Geomorphological Approach to Restoration of Incised Rivers. In Proceedings of
the Conference on Management of Landscapes Disturbed by Channel Incision, ed. S.S.Y. Wang,
E.J. Langendoen and F.D. Shields, Jr. 1997. Accessed March 24, 2016.
http://www.wildlandhydrology.com/assets/A_Geomorphological_Approach_to_Restoration_of_Incised_Rivers.pdf.
SAS Institute. 1985. SAS® User's Guide: Statistics. Version 5 Edition. SAS Institute Inc., Cary, North
Carolina 27513.
SAS Institute. 2010. SAS® Version 9.2, SAS/ETS. SAS Institute Inc., Cary, NC.
SAS Institute. 2016a. Collinearity Diagnostics, SAS 9.2 Documentation, SAS/STAT(R) 9.22 User's Guide.
SAS Institute, Inc., Cary, NC. Accessed March 24, 2016.
http://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_reg_sect038.htm.
SAS Institute. 2016b. JMP® Version 9.0.2, 2010. SAS Institute Inc., Cary, NC. Accessed March 24,
2016. http://www.jmp.com/software/.
SAS Institute. 2016c. Multiple Linear Regression, SAS 9.2 Documentation. SAS Institute, Inc., Cary, NC.
Accessed March 24, 2016.
http://support.sas.com/documentation/cdl/en/anlystug/58352/HTML/default/viewer.htm#chap11_sect3.htm.
SAS Institute. 2016d. The Autoreg Procedure, SAS/ETS(R) 9.2 User's Guide. SAS Institute, Inc., Cary,
NC. Accessed March 31, 2016.
http://support.sas.com/documentation/cdl/en/etsug/60372/HTML/default/viewer.htm#autoreg_toc.htm.
Schilling, K.E., and J. Spooner. 2006. Effects of watershed-scale land use change on stream nitrate
concentrations. Journal of Environmental Quality 35:2132-2145.
Schwarz, G.E., A.B. Hoos, R.B. Alexander, and R.A. Smith. 2006. The SPARROW Surface Water-
Quality Model: Theory, Application and User Documentation. Techniques and Methods 6-B3.
U.S. Geological Survey, Reston, VA. Accessed March 24, 2016.
http://pubs.usgs.gov/tm/2006/tm6b3/.
Simpson, T. and S. Weammert. 2009. Developing Best Management Practice Definitions and
Effectiveness Estimates for Nitrogen, Phosphorus, and Sediment in the Chesapeake Bay
Watershed. Final Report. University of Maryland Mid-Atlantic Water Program. Accessed March
24, 2016. http://archive.chesapeakebay.net/pubs/BMP_ASSESSMENT_REPORT.pdf
Smith, R.V., S.D. Lennox, and J.S. Bailey. 2003. Halting the upward trend in soluble phosphorus
transported from a grassland catchment. Journal of Environmental Quality 32:2334-2340.
Snedecor, G.W. and W.G. Cochran. 1989. Statistical Methods. 8th Edition. Iowa State University Press,
Ames, IA.
Soballe, D.M. 2014. Flux32. Developed in conjunction with Minnesota Pollution Control Agency, by
U.S. Army Corps of Engineers Waterways Experiment Station, Vicksburg, MS. Accessed March
24, 2016.
Spooner, J., C.J. Jamieson, R.P. Maas, and M.D. Smolen. 1987. Determining statistically significant
changes in water pollutant concentrations. Lake and Reservoir Management 3:195-201.
Spooner, J., S.A. Dressing, and D.W. Meals. 2011a. Minimum detectable change analysis. Tech Notes #7.
Prepared for U.S. Environmental Protection Agency, by Tetra Tech, Inc., Fairfax, VA. Accessed
March 24, 2016. https://www.epa.gov/polluted-runoff-nonpoint-source-pollution/nonpoint-
source-monitoring-technical-notes.
Spooner, J., L.A. Szpir, D.E. Line, D. Meals, G.L. Grabow, D.L. Osmond, and C. Smith. 2011b. 2011
Summary Report: Section 319 National Monitoring Program Projects, National Nonpoint Source
Watershed Project Studies. North Carolina State University, Biological and Agricultural
Engineering Department, NCSU Water Quality Group, Raleigh, NC. Accessed March 15, 2016.
http://www.bae.ncsu.edu/programs/extension/wqg/319monitoring/toc.html.
Sprague, L.A., R.M. Hirsch, and B.T. Aulenbach. 2011. Nitrate in the Mississippi River and its
tributaries, 1980 to 2008: are we making progress? Environmental Science and Technology
45(17):7209-7216.
Statistics Solutions. 2016. Correlation (Pearson, Kendall, Spearman). Statistics Solutions, Clearwater,
FL. Accessed March 24, 2016.
http://www.statisticssolutions.com/correlation-pearson-kendall-spearman/.
Stuntebeck, T.D. 1995. Evaluating Barnyard Best Management Practices in Wisconsin Using
Upstream-Downstream Monitoring. Fact Sheet FS-221-95. U.S. Geological Survey, Madison, WI. Accessed
February 5, 2016. http://pubs.usgs.gov/fs/1995/fs221-95/.
Stuntebeck, T.D. and R.T. Bannerman. 1998. Effectiveness of Barnyard Best Management Practices in
Wisconsin. Fact Sheet FS-051-98. U.S. Geological Survey, Madison, WI. Accessed March 24,
2016. http://pubs.er.usgs.gov/publication/fs05198.
Suppnick, J. 1999. Water Chemistry Trend Monitoring in Sycamore Creek and Haines Drain, Ingham
County, Michigan 1990-1997. Staff report MI/DEQ/SWQD-99-085. Michigan Department of
Environmental Quality, Surface Water Quality Division. Accessed April 29, 2016.
https://www.epa.gov/polluted-runoff-nonpoint-source-pollution/monitoring-and-evaluating-
nonpoint-source-watershed.
Tetra Tech. 2013. Preliminary Assessment of Effectiveness of the 2012 Alum Application—Grand Lake St.
Marys. Tetra Tech, Inc., Fairfax, VA. Accessed March 24, 2016.
http://www.lakeimprovement.com/sites/default/files/GLSM%20Alum%20Report%2002202013.pdf
Thas, O., L. Van Vooren, and J.P. Ottoy. 1998. Nonparametric test performance for trends in water quality
with sampling design applications. Journal of the American Water Resources Association
34(2):347-357.
Tukey, J.W. 1977. Exploratory Data Analysis. Addison-Wesley Publishing Co., Reading, MA.
USDA-NRCS (U.S. Department of Agriculture-Natural Resources Conservation Service). 2012.
Assessment of the Effects of Conservation Practices on Cultivated Cropland in the Upper
Mississippi River Basin. U.S. Department of Agriculture, Natural Resources Conservation
Service. Accessed March 24, 2016.
http://www.nrcs.usda.gov/wps/portal/nrcs/detail/national/technical/nra/ceap/na/?&cid=nrcs143_014161.
USEPA (U.S. Environmental Protection Agency). 1997a. Linear Regression for Nonpoint Source
Pollution Analysis. EPA-841-B-97-007. U.S. Environmental Protection Agency, Office of Water,
Washington, DC. Accessed March 24, 2016.
USEPA (U.S. Environmental Protection Agency). 1997b. Monitoring Guidance for Determining the
Effectiveness of Nonpoint Source Controls. EPA 841-B-96-004. U.S. Environmental Protection
Agency, Office of Water, Washington, DC. Accessed March 24, 2016.
https://www.epa.gov/polluted-runoff-nonpoint-source-pollution/monitoring-guidance-
determining-effectiveness-nonpoint.
USEPA (U.S. Environmental Protection Agency). 2007. An Approach for Using Load Duration Curves in
the Development of TMDLs. EPA 841-B-07-006. U.S. Environmental Protection Agency, Office
of Wetlands, Oceans and Watersheds, Watershed Branch, Washington, DC. Accessed March 24,
2016. https://www.epa.gov/tmdl/approach-using-load-duration-curves-development-tmdls.
USEPA (U.S. Environmental Protection Agency). 2008. Handbook for Developing Watershed Plans to
Restore and Protect Our Waters. EPA 841-B-08-002. U.S. Environmental Protection Agency,
Office of Water, Washington, DC. Accessed March 24, 2016. http://www.epa.gov/polluted-
runoff-nonpoint-source-pollution/handbook-developing-watershed-plans-restore-and-protect.
USEPA (U.S. Environmental Protection Agency). 2010. Causal Analysis/Diagnosis Decision Information
System (CADDIS). U.S. Environmental Protection Agency, Office of Research and Development,
Washington, DC. Accessed March 24, 2016. http://www.epa.gov/caddis.
USEPA (U.S. Environmental Protection Agency). 2013. Information on impaired waters and total
maximum daily loads. U.S. Environmental Protection Agency, Office of Water, Washington, DC.
Accessed March 24, 2016. http://water.epa.gov/lawsregs/lawsguidance/cwa/tmdl/index.cfm.
USF (University of South Florida). n.d. Collinearity. University of South Florida, College of Arts and
Sciences, Tampa, FL. Accessed March 24, 2016.
http://faculty.cas.usf.edu/mbrannick/regression/Collinearity.html.
Vollenweider, R.A. 1976. Advances in defining critical loading levels for phosphorus in lake
eutrophication. Memorie dell'Istituto italiano di idrobiologia dott. Marco DeMarchi 33:53-83.
Vollenweider, R.A., and J. Kerekes. 1982. Eutrophication of Waters: Monitoring, Assessment and
Control. Organization for Economic Co-Operation and Development, Paris.
Walker, J.F. 1994. Statistical techniques for assessing water quality effects of BMPs. Journal of
Irrigation and Drainage Engineering 120(2):334-347.
Walker, W.W. 1990. Flux Stream Load Computations. DOS Version 4.4. Prepared for U.S. Army Corps
of Engineers Waterways Experiment Station, Vicksburg, MS.
Walker, W.W. 1999. Simplified Procedures for Eutrophication Assessment and Prediction: User Manual.
Prepared for U.S. Army Corps of Engineers, Water Operations Technical Support Program,
Instruction Report W-96-2, September 1996 (Updated April 1999). Accessed March 24, 2016.
http://www.wwwalker.net/bathtub/Flux_Profile_Bathtub_DOS_1999.pdf
White, W., J. Beardsley, and S. Tomkins. 2011. Waukegan River Illinois National Nonpoint Source
Monitoring Program Project. Contract Report 2011-01. University of Illinois at Urbana-Champaign,
Institute of Natural Resource Sustainability, Illinois State Water Survey, Champaign,
Illinois. Accessed March 24, 2016.
http://www.isws.illinois.edu/pubdoc/CR/ISWSCR2011-01.pdf.
Whitfield, P.H. 1983. Evaluation of water quality sampling locations on the Yukon River. Water
Resources Bulletin 19(1):115-121.
Whitfield, P.H. and P.P. Woods. 1984. Intervention analysis of water quality records. Water Resources
Bulletin 20(5):657-668.
Wilm, H.G. 1949. How long should experimental watersheds be calibrated? Transactions of the American
Geophysical Union 30(2):272-278.
Zamyadi, A., J. Gallichand, and M. Duchemin. 2007. Comparison of methods for estimating sediment
and nitrogen loads from a small agricultural watershed. Canadian Biosystems Engineering
49:1.27-1.36.
-------
Monitoring and Evaluating Nonpoint Source Watershed Projects Chapter 8
8 Quality Assurance and Quality Control
By S.A. Lanberg, J.G. O'Donnell, and J.B. Harcum
8.1 Introduction
Quality assurance and quality control (QA/QC) are commonly thought of as procedures used in the
laboratory to ensure that all analytical measurements made are accurate. Yet QA/QC extends beyond the
laboratory and includes a wide range of issues that nonpoint source (NPS) managers consider when
addressing the challenges of developing a monitoring program (see chapters 2 and 3). When considered
independently from monitoring program design, QA/QC may seem burdensome. Yet the purpose of
QA/QC is the same as that of a well-intentioned NPS manager: to ensure that the monitoring data
generated are complete, accurate, and suitable for the intended purpose. By integrating certain QA/QC
aspects with monitoring program design, NPS managers can reduce repetition and ultimately reduce total
costs by developing a more efficient monitoring design.
The remainder of this section defines QA/QC, discusses their value in NPS monitoring programs, and
explains EPA's policy on these topics. Section 8.2 provides an overview of the Data Quality Objectives
(DQO) process. EPA recommends that organizations use the DQO process to systematically plan their
monitoring programs. Typically, written QA/QC documentation takes the form of a quality assurance
project plan (QAPP). As discussed in section 8.3, a QAPP details the technical activities and QA/QC
procedures that should be implemented to ensure the data meet the specified standards.
The QAPP should identify who will be involved in the project and their responsibilities; the nature of the
study or monitoring program; the questions to be addressed or decisions to be made based on the data
collected; where, how, and when samples will be taken and analyzed; the requirements for data quality;
the specific activities and procedures to be performed to obtain the requisite level of quality (including
QC checks and oversight); how the data will be managed, analyzed, and checked to ensure that they meet
the project goals; and how the data will be reported. The QAPP should be implemented and maintained
throughout a project.
Sections 8.4, 8.5, and 8.6 provide more specific information for preparing QAPPs with respect to field
operations, laboratory operations, and data and reporting requirements, respectively. Although there are
many commonalities, QAPP development to support modeling and secondary data usage is beyond the
scope of this chapter. The reader is referred to CREM (2009) and USEPA (2002b) for guidance on the
development and application of environmental models and related QAPPs. EPA also provides guidance
about the evaluation of existing (secondary) data quality (USEPA 2012) and information needed to
develop QAPPs for secondary data projects (USEPA 2008a).
8.1.1 Definitions of Quality Assurance and Quality Control
8.1.1.1 Quality assurance:
An integrated system of management activities involving planning, implementation, documentation,
assessment, reporting, and quality improvement to ensure that a process, item, or service is of the type
and quality needed and expected by the client (USEPA 2001c).
8.1.1.2 Quality control:
The overall system of technical activities that measures the attributes and performance of a process, item,
or service against defined standards to verify that they meet the stated requirements established by the
customer; operational techniques and activities that are used to fulfill requirements for quality (USEPA
2001c).
In a laboratory setting, QC procedures include the regular inspection of equipment to ensure it is
operating properly and the collection and analysis of blank, duplicate, and spiked samples and standard
reference materials to ensure the accuracy and precision of analyses. QA activities are more managerial in
nature and include assignment of roles and responsibilities to project staff, staff training, development of
data quality objectives, data validation, and laboratory audits. Table 8-1 lists some common activities that
fall under the heading of QA/QC. Such procedures and activities are planned and executed by diverse
organizations through carefully designed quality management programs that reflect the importance of the
work and the degree of confidence needed in the quality of the results.
Table 8-1. Common QA/QC activities
QA Activities
• Organization of the project into component parts
• Assignment of roles and responsibilities to project staff
• Determination of the number of QC samples and sampling sites needed to obtain data of a required confidence level
• Tracking of sample custody from field collection through final analysis
• Development and use of data quality objectives to guide data collection efforts
• Auditing of field and laboratory operations
• Maintenance of accurate and complete records of all project activities
• Training of personnel to ensure consistency of sample collection techniques and equipment use
QC Activities
• Collection of duplicate samples for analysis
• Analysis of blank, duplicate, and spike samples
• Regular inspection and calibration of analytical equipment
• Regular inspection of reagents and water for contamination
• Regular inspection of refrigerators, ovens, etc. for proper operation
• Regular evaluation of data against QC objectives
Adapted from Drouse et al. 1986 and Erickson et al. 1991.
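The duplicate and spike analyses listed among the QC activities are commonly summarized as a relative percent difference (RPD) for precision and a percent recovery for accuracy. The Python sketch below shows one way these two checks might be computed. The function names and example concentrations are illustrative assumptions, not values from this guidance; acceptance limits for either statistic would be set in the project's QAPP.

```python
def relative_percent_difference(result_a: float, result_b: float) -> float:
    """Precision check for a duplicate pair: |a - b| divided by the pair mean, in percent."""
    pair_mean = (result_a + result_b) / 2.0
    if pair_mean == 0:
        return 0.0  # both results are zero, so the pair agrees exactly
    return abs(result_a - result_b) / pair_mean * 100.0


def spike_recovery_percent(spiked_result: float,
                           unspiked_result: float,
                           spike_added: float) -> float:
    """Accuracy check for a spiked sample: share of the added analyte that was measured."""
    if spike_added <= 0:
        raise ValueError("spike_added must be positive")
    return (spiked_result - unspiked_result) / spike_added * 100.0


# Hypothetical total phosphorus results in mg/L, for illustration only.
rpd = relative_percent_difference(0.50, 0.46)
recovery = spike_recovery_percent(spiked_result=1.45,
                                  unspiked_result=0.50,
                                  spike_added=1.00)
print(f"Duplicate RPD: {rpd:.1f} percent")
print(f"Spike recovery: {recovery:.1f} percent")
```

A project would compare each value with the limit stated in its QAPP and flag or reanalyze the associated samples when a limit is exceeded.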
8.1.2 Importance of QA/QC Programs
While it is desirable to stay below 10 percent, development and implementation of a QA/QC program can
require as much as 10 to 20 percent of project resources (Cross-Smiecinski and Stetzenback 1994). This cost,
however, can be recaptured in lower overall costs of a well-planned and executed project. Likely
problems are anticipated and accounted for before they arise, eliminating the need to resample, reanalyze
data, or revisit portions of the project to determine where an error was introduced. A QAPP can serve as a
foundation for documenting standard operating procedures for all project activities, ensuring that project
tasks are conducted consistently by all personnel and can support training for new personnel as the project
moves forward. During a project, QA/QC information can provide essential feedback to ongoing project
management. Most importantly, a QA/QC program helps ensure that project data are of known accuracy
and precision, that errors are minimized, and that all critical project activities are conducted consistently.
As long as the QA/QC procedures are followed, the data and information collected by the project will be
adequate to support technical conclusions and choices from among alternative courses of action. These
conclusions and actions will be defensible based on quality of the data and information collected. In short,
QA/QC procedures and activities are cost-effective measures used to determine how to allocate project
energies and resources toward improving the quality of research and the usefulness of project results
(Erickson et al. 1991).
8.1.3 EPA Quality Policy
EPA has established a QA/QC program to ensure that data used in research and monitoring projects are of
known and documented quality to satisfy project objectives. The use of different methods, lack of data
comparability, unknown data quality, and poor coordination of sampling and analysis efforts can delay
the progress of a project or render the data and information collected from it unsuited for decision
making. QA/QC practices should be integral parts of the development, design, and implementation of an
NPS monitoring project to minimize or eliminate these problems (Erickson et al. 1991; Pritt and Raese
1992; USEPA 2001b).
EPA Order CIO 2105.0 (formerly EPA Order 5360.1 A2), EPA's Policy and Program Requirements for
the Mandatory Agency-wide Quality System (USEPA 2000b), provides requirements for the conduct of
quality management practices, including QA/QC activities, for all environmental data collection and
environmental technology programs performed by or for EPA. The EPA Quality Manual for
Environmental Programs (USEPA 2000a) provides program requirements for implementing EPA's
mandatory quality system. In accordance with EPA Order CIO 2105.0, EPA requires that environmental
programs be supported by a quality system that complies with the American National Standard
ANSI/ASQC E4-1994, Specifications and Guidelines for Quality
Systems for Environmental Data Collection and Environmental Technology Programs (ANSI/ASQC
1994). The ANSI/ASQC E4-1994 quality system standard was later updated as ANSI/ASQ E4-2004,
Quality Systems for Environmental Data and Technology Programs - Requirements with Guidance for
Use (ANSI/ASQ 2004).
EPA's mandatory agency-wide Quality System Policy (EPA Policy CIO 2106.0) requires each office or
laboratory generating data to implement minimum procedures to ensure that precision, accuracy,
completeness, comparability, and representativeness of data are known and documented (Erickson et al.
1991; USEPA 2008b). This policy is now based on the quality system standard developed by the
American National Standards Institute and the American Society for Quality (ANSI/ASQ 2004).
Each office or laboratory is required to specify the quality levels that data must meet to be acceptable and
satisfy project objectives. This requirement applies to all environmental monitoring and measurement
efforts mandated or supported by EPA through regulations, grants, contracts, or other formal agreements.
To ensure that this responsibility is met uniformly across EPA, each organization performing work for
EPA must document in a Quality Management Plan (QMP) that is approved by its senior management
how it will plan, implement, and assess the effectiveness of QA/QC operations applied to environmental
programs (USEPA 2001b). In addition, each non-EPA organization must have an approved QAPP that
covers each monitoring or measurement activity associated with a project (Erickson et al. 1991, USEPA
1983, 2008b). Additional implementation guidance is provided in EPA Quality Manual for
Environmental Programs (USEPA 2000a).
The purpose of writing a QAPP prior to undertaking an NPS monitoring project is to establish clear
objectives for the program, including the types of data needed and the quality of the data generated
(accuracy, precision, completeness, representativeness, and comparability) in order to meet the project's
water quality and land treatment objectives. See section 2.1 for a discussion of appropriate objectives for
NPS monitoring projects.
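Of the data quality indicators named above, completeness is the simplest to express numerically: it is usually reported as the percentage of planned measurements that produced valid results. The sketch below assumes that definition. The function name, the sample counts, and the 90 percent goal are hypothetical and serve only to illustrate the check.

```python
def completeness_percent(valid_results: int, planned_results: int) -> float:
    """Percent of planned measurements that yielded valid, usable results."""
    if planned_results <= 0:
        raise ValueError("planned_results must be positive")
    return valid_results / planned_results * 100.0


# Hypothetical example: 52 weekly samples planned, 47 produced valid results.
achieved = completeness_percent(valid_results=47, planned_results=52)
goal = 90.0  # assumed project goal; the real value would come from the QAPP
status = "meets" if achieved >= goal else "falls short of"
print(f"Completeness of {achieved:.1f} percent {status} the {goal:.0f} percent goal")
```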
The QAPP should specify the policies, organization, objectives, functional activities, QA procedures, and
QC activities designed to achieve the data quality goals of the project. It should be distributed to all
project personnel, and they should be familiar with the policies and objectives outlined in the QAPP to
ensure proper interaction of the sampling and laboratory operations and data management. Although a
QA/QC officer oversees major aspects of QAPP implementation, all persons involved in an NPS
monitoring project who either perform or supervise the work done under the project are responsible for
ensuring that the QA/QC procedures and activities established in the QAPP are adhered to.
The QMP and each QAPP must be submitted for review to the EPA organization responsible for the work
to be performed, and they must be approved by EPA or its designee (e.g., federal or state agency) as part
of the contracting or assistance agreement process before data collection can begin. In addition, it is
important to note that the QMP and QAPP are "live" documents and programs in the sense that once they
have been developed they cannot be placed on a shelf for the remainder of the project. All QA/QC
procedures should be evaluated and plans updated as often as necessary during the course of a project to
ensure that they are in accordance with the present project direction and efforts (Knapton and Nimick
1991, USEPA 2001c).
8.2 Data Quality Objectives
When monitoring data are being used to assess water quality and the effects of land-based activities on
water quality or the effectiveness of best management practices, EPA recommends that states, tribes, and
non-governmental organizations (NGOs) consider using the systematic planning tool called the Data
Quality Objectives (DQO) Process. The DQO process should be part of project planning and development
of a proposed monitoring strategy.
The DQO process is used to establish performance or acceptance criteria that serve as the basis for
designing a plan for collecting data of sufficient quality and quantity to support the objectives of a study.
The DQO process consists of seven iterative steps (USEPA 2006):
1) State the problem: define the problem that necessitates the study; identify the planning team,
examine budget, schedule.
2) Identify the goal of the monitoring program: state how monitoring data will be used in
meeting objectives and solving the problem, identify study questions, define alternative
outcomes.
3) Identify information inputs: identify data and information needed to answer questions.
4) Define the boundaries of the study: specify the target population and characteristics of interest,
define spatial and temporal limits, scale of inference.
5) Develop the analytic approach: define the parameters of interest, specify the type of inference,
and develop the logic for drawing conclusions from findings.
6) Specify performance or acceptance criteria: develop performance criteria for new data being
collected or acceptance criteria for existing data being considered for use.
7) Develop the plan for obtaining data: select the resource-effective monitoring plan that meets
the performance criteria.
Several iterations of the process might be required to specify the DQOs for a monitoring program.
Because DQOs are continually reviewed during data collection activities, any needed corrective action
can be planned and executed to minimize problems before they become significant. General guidance and
examples of planning for monitoring programs are also provided in related guidance (USEPA 2003a).
8.2.1 The Data Quality Objectives Process
The DQO process takes into consideration the factors that will depend on the data (most importantly, the
decision(s) to be made) or that will influence the type and amount of data to be collected (e.g., the
problem being addressed, existing information, information needed before a decision can be made, and
available resources). From these factors the qualitative and quantitative data needs are determined. The
purpose of the DQO process is to improve the effectiveness, efficiency, and defensibility of decisions
made based on the data collected, and to do so in a resource-effective manner (USEPA 2006).
DQOs are qualitative and quantitative statements that clarify the study objective, define the most
appropriate type of data to collect, and determine the most appropriate conditions under which to collect
them. DQOs also specify the minimum quantity and quality of data needed by a decision maker to make
any decisions that will be based on the results of the project. By using the DQO process, investigators can
ensure that the type, quantity, and quality of data collected and used in decision making will be
appropriate for the intended use. Similarly, efforts will not be expended to collect information that does
not support defensible decisions. The products of the DQO process are criteria for data quality and a data
collection design that ensures that data will meet the criteria.
A brief description of each step of the DQO process and a list of activities that are part of each step
follow. For a detailed discussion of the DQO development process, refer to EPA's Guidance on
Systematic Planning Using the Data Quality Objectives Process (USEPA 2006). This reference contains a
case study example of the DQO process. A computer program, Data Quality Objectives Decision Error
Feasibility Trials (USEPA 2001a), is also available to help the planning process by generating cost
information about several simple sampling designs based on the DQO constraints before the sampling and
analysis design team begins developing a final sampling design in the last step of the DQO process.
8.2.1.1 (1) State the problem
In this first step, concisely describe the problem to be studied. A review of prior studies and existing
information is important during this step to gain a sufficient understanding of the problem in order to
define it. The specific activities to be completed during this step (outputs) are:
• Identify members of the planning team.
• Identify the primary decision maker of the planning team and define each member's role and
responsibilities during the DQO process.
• Develop a concise description of the problem.
• Specify the available resources and relevant deadlines for the study.
8.2.1.2 (2) Identify the goal of the monitoring program
Identify what questions the study will attempt to resolve and what actions might be taken based on the
study. This information is used to prepare a "decision statement" or an objective that will link the
principal study question to one or more possible actions that should solve the problem. Example NPS
monitoring program objectives might be to "determine the sources of bacteria causing the water quality
standard violation in Duck Creek" or "determine the effects of land treatment program xyz on phosphorus
loads to Lake Eutrophy." Results from the monitoring program would then support management
decisions to take action, modify an action, or take no action.
The specific activities to be completed during this step are:
• Identify the principal study question.
• Define the alternative actions that could result from resolution of the principal study question.
• Combine the principal study question and the alternative actions into a decision statement.
• If applicable, organize multiple decisions to be made by priority.
8.2.1.3 (3) Identify information inputs
Identify the information that needs to be obtained and the measurements that need to be taken to resolve
the decision statement. The specific activities to be completed during this step are:
• Identify the information that will be required to resolve the decision statement.
• Determine the sources for each item of information identified above.
• Identify the information that is needed to establish the threshold value that will be the basis of
choosing among alternative actions.
• Confirm that appropriate measurement methods exist to provide the necessary data.
8.2.1.4 (4) Define the boundaries of the study
Specify the time periods and spatial area to which decisions will apply and determine when and where
data should be collected. This information is used to define the population(s) of interest. The term
population refers to the total collection or universe of objects from which samples will be drawn. The
population could be the concentration of a pollutant in sediment, a water quality variable, algae in the
river, or bass in the lake. It is important to define the study boundaries to ensure that data collected are
representative of the population being studied (because every member of a population cannot be sampled)
and will be collected during the time period and from the place that will be targeted in the decision to be
made. The specific activities to be completed during this step are:
• Specify the characteristics that define the population of interest.
• Identify the geographic area to which the decision statement applies (such as a county) and any
strata within that area that have homogeneous characteristics (e.g., recreational waters, dairy farms).
• Define the time frame to which the decision applies.
• Determine when to collect data.
• Define the scale of decision making, or the actual areas that will be affected by the decision
(e.g., first-order streams, dairy farms with streams running through them, a county).
• Identify any practical constraints on data collection.
8.2.1.5 (5) Develop the analytic approach
Define the statistical parameter of interest, specify the threshold at which action will be taken, and
integrate the previous DQO outputs into a single statement that describes the logical basis for choosing
among alternative actions. This statement is known as a decision rule. It is often phrased as an
"If...then..." statement. For example, "If septic systems are contributing to water quality standard
violations, then failing septic systems will be remediated; otherwise, no action will be taken." The
specific activities to be completed during this step are:
• Specify the statistical parameter that characterizes the population (the parameter of interest), such
as the mean, median, or percentile.
• Specify the numerical value of the parameter of interest that would cause a decision maker to take
action, i.e., the threshold value.
• Develop a decision rule in the form of an "if...then..." statement that incorporates the parameter of
interest, the scale of decision making, the threshold level, and the actions that would be taken.
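A decision rule of this kind can be written down as a short function once the parameter of interest, the threshold value, and the alternative actions are fixed. The Python sketch below is a minimal illustration that assumes the median is the parameter of interest. The function name, the concentrations, and the threshold of 100 are hypothetical and are not taken from this guidance.

```python
from statistics import median


def apply_decision_rule(concentrations, threshold, action, otherwise="no action"):
    """Return the parameter of interest and the action chosen by an if...then rule.

    The rule is: if the median concentration exceeds the threshold,
    then take `action`; otherwise take `otherwise`.
    """
    parameter = median(concentrations)
    decision = action if parameter > threshold else otherwise
    return parameter, decision


# Hypothetical bacteria counts for one stream reach, for illustration only.
reach_samples = [80, 120, 150, 95, 130]
value, decision = apply_decision_rule(
    reach_samples,
    threshold=100,
    action="remediate failing septic systems",
)
print(f"Median = {value}; decision: {decision}")
```

Writing the rule this explicitly makes it easier to confirm, before sampling begins, that the scale of decision making and the threshold are the ones the planning team agreed on.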
8.2.1.6 (6) Specify performance or acceptance criteria
Define the decision maker's tolerable limits on the probability of making an incorrect decision (or decision error) due to
incorrect information (i.e., measurement and sampling error) introduced during the study. These limits are
used to establish performance goals for the data collection design. Base the limits on a consideration of
the consequences of making an incorrect decision. The decision maker cannot know the true value of a
population parameter because the population of interest almost always varies over time and space and it is
usually impractical or impossible to measure every point (sampling design error). In addition, analytical
methods and instruments are never absolutely perfect (measurement error). Thus, although it is
impossible to eliminate these two errors, the combined total study error can be controlled to reduce the
probability of making a decision error. The specific activities to be completed during this step are:
• Determine the possible range (likely upper and lower bounds) of the parameter of interest.
• Identify the decision errors and choose the null hypothesis. Decision errors for NPS pollution
problems might take the general form of deciding there is an impact when there is none [a false
positive, or type I error] or deciding there is no impact when there is [a false negative, or type II
error].
• Specify the likely consequences of each decision error. Evaluate their potential severity in terms of
ecological effects, human health, economic and social costs, political and legal ramifications, and
other factors.
• Specify a range of possible parameter values where the consequences of decision errors are
relatively minor (gray region). The boundaries of the gray region are the threshold level and the
value of the parameter of interest where the consequences of making a false negative decision begin
to be significant.
• Assign probability limits to points above and below the gray region that reflect the tolerable
probability for the occurrence of decision errors.
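The tolerable error probabilities and the width of the gray region together determine how many samples are needed. The Python sketch below illustrates the relationship under simplifying assumptions that are not part of this guidance: a one-sided test of a mean against the threshold, approximately normal data, and a prior estimate of the total study standard deviation. The function and parameter names are hypothetical, and USEPA (2006) should be consulted for the formulas appropriate to a specific design.

```python
from math import ceil
from statistics import NormalDist


def samples_needed(sigma: float, gray_region_width: float,
                   alpha: float, beta: float) -> int:
    """Approximate sample size for a one-sided test of a mean (normal approximation).

    sigma             -- estimated total study standard deviation
    gray_region_width -- distance between the threshold and the other
                         boundary of the gray region
    alpha             -- tolerable probability of a false positive (type I error)
    beta              -- tolerable probability of a false negative (type II error)
    """
    if sigma <= 0 or gray_region_width <= 0:
        raise ValueError("sigma and gray_region_width must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return ceil(((z_alpha + z_beta) * sigma / gray_region_width) ** 2)


# Hypothetical inputs: halving the gray region roughly quadruples the sample size.
for width in (5.0, 2.5):
    n = samples_needed(sigma=10.0, gray_region_width=width, alpha=0.05, beta=0.20)
    print(f"Gray region width {width}: about {n} samples")
```

The example shows why the gray region and the error limits should be set with the consequences of each decision error in mind: tightening either one increases the sampling effort quickly.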
8.2.1.7 (7) Develop the plan for obtaining data
Evaluate information from the previous steps and generate alternative data collection designs. Some
aspects of this may be considered informally during the project planning process, and less attention can be
given to some alternatives. The designs should specify in detail the monitoring that is required to meet the
DQOs, including the types and quantity of samples to be collected; where, when, and under what
conditions they should be collected; what variables will be measured; and the QA/QC procedures that will
ensure that the DQOs are met. The QA/QC procedures are fully developed when the QAPP is written
(see below). Choose the most resource-effective design that meets all of the DQOs. As resources dictate,
it may be necessary to reduce or restate the DQOs. The specific activities to be completed during this step
are:
• Review the DQO outputs and existing environmental data.
• Develop general data collection design alternatives.
• Formulate the mathematical expressions needed to solve the design problem for each data
collection design alternative. This involves selecting a statistical test method (e.g., Student's t-test),
developing a statistical model that relates the measured value to the "true" value, and developing a
cost function that relates the number of samples to the total cost of sampling and analysis.
• Select the optimal sample size that satisfies the DQOs for each data collection design alternative.
• Select the most resource-effective data collection design that satisfies all of the DQOs.
• Document the selected design's key features and the statistical assumptions of the selected design. It
is particularly important that the statistical assumptions be documented to ensure that, if any
changes in analytical methods or sampling procedures are introduced during the project, these
assumptions are not violated.
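The sample-size and cost steps listed above can be sketched in a few lines of Python. This is a minimal illustration, assuming a one-sided, one-sample test of a mean with approximately normal data. The formula is a common normal-approximation with a small-sample correction term, not one given in this section, and the function names, standard deviation, gray-region width, error rates, and cost figures are all hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(sigma, gray_width, alpha, beta):
    """Approximate number of samples for a one-sided test of a mean.

    sigma      -- assumed standard deviation of the measurements
    gray_width -- width of the gray region, in measurement units
    alpha      -- tolerable false positive (type I) error rate
    beta       -- tolerable false negative (type II) error rate
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - beta)
    # Normal approximation plus a small-sample correction (assumed form).
    n = sigma ** 2 * (z_alpha + z_beta) ** 2 / gray_width ** 2 + 0.5 * z_alpha ** 2
    return math.ceil(n)


def total_cost(n, fixed_cost, cost_per_sample):
    """Simple linear cost function: fixed costs plus a per-sample cost."""
    return fixed_cost + n * cost_per_sample


# Hypothetical planning values for one design alternative.
n = sample_size(sigma=10.0, gray_width=5.0, alpha=0.05, beta=0.20)
print(n, total_cost(n, fixed_cost=2000.0, cost_per_sample=150.0))
```

Repeating the calculation for each design alternative gives a sample size and cost per alternative, which is one way to compare designs for resource effectiveness.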
The DQO process should be used during the planning stage of any study that requires data collection, and
before the data are collected. EPA's policy is to use the DQO process to plan all data collection efforts
that will require or result in a substantial commitment of resources. The DQO process is applicable to all
studies, regardless of size; however, the depth and detail of the DQO development effort depend on the
complexity of the study. In general, the more complex the study, the more it benefits from detailed DQO
development.
8.2.2 Data Quality Objectives and the QA/QC Program
The DQOs and the quality objectives for measurement data that will be specified in the QAPP are
interdependent. The DQOs identify project objectives; evaluate the underlying hypotheses, experiments,
and tests to be performed; and then establish guidelines for the data collection effort needed to obtain data
of the quality necessary to achieve these objectives (Erickson et al. 1991, USEPA 2006). The QAPP
presents the policies, organization, and objectives of the data collection effort and explains how particular
QA/QC activities will be implemented to achieve the DQOs of the project, as well as to determine what
future research directions might be taken (Erickson et al. 1991, USEPA 2006). At the completion of data
collection and analysis, the data are validated according to the provisions of the QAPP and a Data Quality
Assessment (DQA), using statistical tools, is conducted to determine:
• Whether the data meet the assumptions under which the DQOs and the data collection design were
developed.
• Whether the total error in the data is small enough to allow the decision maker to use the data to
support the decision within the tolerable decision error rates expressed by the decision maker
(USEPA 2006).
Thus, the entire process is designed to assist the decision maker by planning and obtaining environmental
data of sufficient quantity and quality to satisfy the project objectives and allow decisions to be made
(USEPA 2001c, 2006). The DQO process is the part of the quality system that provides the basis for
linking the intended use of the data to the QA/QC requirements for data collection and analysis (USEPA
2006).
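The two DQA questions above can be illustrated with a short Python sketch. It assumes the same one-sided test of a mean used in planning and a normal approximation. The function names, the planning standard deviation, the gray-region width, and the data values are hypothetical, and a real DQA would also examine distributional assumptions, outliers, and independence.

```python
import math
from statistics import NormalDist, stdev


def achieved_false_negative_rate(data, gray_width, alpha):
    """Approximate type II error rate at the edge of the gray region."""
    s = stdev(data)
    n = len(data)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(z_alpha - gray_width * math.sqrt(n) / s)


def dqa_check(data, planned_sigma, gray_width, alpha, tolerable_beta):
    """Compare collected data with the planning assumptions."""
    s = stdev(data)
    beta = achieved_false_negative_rate(data, gray_width, alpha)
    return {
        "observed_sd": s,
        "sd_within_plan": s <= planned_sigma,    # planning assumption held?
        "achieved_beta": beta,
        "error_rate_ok": beta <= tolerable_beta,  # decision error tolerable?
    }


# Hypothetical validated results from one monitoring station.
result = dqa_check([10, 12, 14, 16, 18], planned_sigma=4.0,
                   gray_width=5.0, alpha=0.05, tolerable_beta=0.20)
print(result)
```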
8.3 Elements of a Quality Assurance Project Plan
QAPPs must be prepared in accordance with EPA Requirements for Quality Assurance Project Plans
(USEPA, 2001b) and Guidance for Quality Assurance Project Plans (USEPA 2002a). EPA requires that
four types of elements be discussed in a Quality Assurance Project Plan (QAPP): Project Management,
Measurement and Acquisition, Assessment and Oversight, and Data Validation and Usability. These
elements are listed in Table 8-2. For complete descriptions and requirements, see USEPA (2001b).
Additional information on the contents of a QAPP is contained in Drouse et al. (1986), Erickson et al.
(1991), and Cross-Smiecinski and Stetzenback (1994). Drouse et al. (1986) and Erickson et al. (1991) are
examples of EPA QAPPs prepared under previous guidance.
The elements in Table 8-2 should always be addressed in the QAPP, unless otherwise directed by the
overseeing or sponsoring EPA organization(s). Both laboratory and field operations should be included.
The types, quantity, and quality of environmental data collected for each project could be quite different.
The level of detail in each QAPP will vary according to the nature of the work being performed and the
intended use of the data (USEPA 2001b). If an element is not applicable or required, then this should be
stated in the QAPP. For some complex projects, it might be necessary to add special requirements to the
QAPP. Again, the QAPP must be approved by the sponsoring EPA organization before data collection
can begin.
Table 8-2. Elements required in an EPA Quality Assurance Project Plan (USEPA, 2001b)
ID     QAPP Element
A1     Title and Approval Sheet
A2     Table of Contents
A3     Distribution List
A4     Project/Task Organization
A5     Problem Definition/Background
A6     Project/Task Description
A7     Quality Objectives and Criteria
A8     Special Training/Certification
A9     Documents and Records
B1     Sampling Process Design (Experimental Design)
B2     Sampling Methods
B3     Sample Handling and Custody
B4     Analytical Methods
B5     Quality Control
B6     Instrument/Equipment Testing, Inspection, Maintenance
B7     Instrument/Equipment Calibration and Frequency
B8     Inspection/Acceptance of Supplies and Consumables
B9     Non-direct Measurements
B10    Data Management
C1     Assessments and Response Actions
C2     Reports to Management
D1     Data Review, Verification, and Validation
D2     Verification and Validation Methods
D3     Reconciliation with User Requirements
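Because every element in Table 8-2 should be either addressed or explicitly stated as not applicable, a draft QAPP can be screened with a simple checklist. The sketch below is only an illustration; the function name and the example draft are hypothetical, and it checks coverage only, not the adequacy of what is written for each element.

```python
# Element identifiers from Table 8-2 (A1-A9, B1-B10, C1-C2, D1-D3).
REQUIRED_ELEMENTS = (
    [f"A{i}" for i in range(1, 10)]
    + [f"B{i}" for i in range(1, 11)]
    + ["C1", "C2"]
    + ["D1", "D2", "D3"]
)


def unresolved_elements(addressed, not_applicable):
    """Return elements that are neither addressed nor marked not applicable."""
    covered = set(addressed) | set(not_applicable)
    return [element for element in REQUIRED_ELEMENTS if element not in covered]


# Hypothetical draft: B9 is declared not applicable and D3 was overlooked.
draft_addressed = [e for e in REQUIRED_ELEMENTS if e not in ("B9", "D3")]
print(unresolved_elements(draft_addressed, not_applicable=["B9"]))
```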
Standard Operating Procedures (SOPs) must be provided or referenced in the QAPP such that they are
available to all participants. An SOP typically presents in detail the method for a given technical
operation, analysis, or action in sequential steps, and it includes specific facilities, equipment, materials
and methods, QA/QC procedures, and other factors necessary to perform the operation, analysis, or
action for the particular project. By following the SOP, the operation should be performed the same way
every time. Activities typically include field sampling, laboratory analysis, software development, and
database management. EPA presents examples of the format and content of SOPs (USEPA, 2007). The
format and content requirements for an SOP are flexible because the content and level of detail in SOPs
vary according to the nature of the procedure. SOPs should be revised when new equipment is used, when
comments by personnel indicate that the directions are not clear, or when a problem occurs. Organizations
should ensure that current SOPs are used. SOPs are critical in the training of new personnel during the
conduct of a long-term project.
Definitions of selected data quality terms
Precision (reproducibility) is an expression of mutual agreement of multiple measurements of the same property
(e.g., duplicate field samples or duplicate lab samples) conducted under similar conditions. It is evaluated by
recording and comparing multiple measurements of the same parameter on the same exact sample under the
same conditions. Relative percent difference (RPD) is a measure of precision and is calculated with the following
formula (Cross-Smiecinski and Stetzenback, 1994):
$$\mathrm{RPD} = \frac{|x_1 - x_2|}{(x_1 + x_2)/2} \times 100$$
where
$x_1$ = analyte concentration of first duplicate and
$x_2$ = analyte concentration of second duplicate.
Accuracy (bias) is the degree of agreement of a measurement (or an average of measurements), X, with an
accepted reference or true value, T. Accuracy is expressed as the percent difference from the true value
{100 [(X - T)/T]} unless spiking materials are used and percent recovery is calculated (Erickson et al., 1991).
Accuracy can be determined by analyzing a sample and its corresponding matrix spike. Accuracy can be
expressed as percent recovery and calculated using the following formula (Air National Guard, 1993):
$$\text{Percent recovery} = \frac{A - B}{C} \times 100$$
where
A = spiked sample result;
B = sample result; and
C = spike added.
Comparability is defined as the confidence with which one data set can be compared to another (Erickson et al.,
1991). Consistent sampling methodology, handling, and analyses are necessary to ensure comparability. Also,
assurance that equipment has been calibrated properly and analytical solutions prepared identically is necessary
to attain data comparability (Air National Guard, 1993).
Representativeness is a measure of how representative the data obtained for each parameter are compared
with the values of the same parameter within the population being measured. Because the total population cannot
be measured, sampling must be designed to ensure that the samples are representative of the population being
sampled (Air National Guard, 1993). A relevant sampling design issue, for example, is to determine how a sample
will be collected to ensure it is representative of the desired characteristic (Erickson et al., 1991).
Completeness is defined as the amount of valid data obtained from a measurement system compared to the
amount that was expected to be obtained under anticipated sampling/analytical conditions (Erickson et al., 1991).
An assessment of the completeness of data is performed at the end of each sampling event, and if any omissions
are apparent, an attempt is made to resample the parameter in question, if feasible. Data completeness should
also be assessed prior to the preparation of data reports that check the correctness of all data. An example of a
formula used for this purpose is
$$\%C = 100 \left( \frac{V}{n} \right)$$
where
%C = percent complete;
V = number of measurements judged valid; and
n = total number of measurements necessary to achieve a specified level of confidence in decision
making (Cross-Smiecinski and Stetzenback, 1994).
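The three data quality measures defined above (RPD, percent recovery, and percent completeness) are simple enough to compute in a few lines. The sketch below is illustrative only: the function names and example concentrations are hypothetical, and the RPD is written in its usual form, the absolute difference of the duplicates divided by their mean, which is an assumption about the cited formula.

```python
def relative_percent_difference(x1, x2):
    """RPD between two duplicate results (precision)."""
    mean = (x1 + x2) / 2
    if mean == 0:
        raise ValueError("RPD is undefined when the duplicates average zero")
    return 100 * abs(x1 - x2) / mean


def percent_recovery(spiked_result, sample_result, spike_added):
    """Matrix spike recovery (accuracy): 100 * (A - B) / C."""
    return 100 * (spiked_result - sample_result) / spike_added


def percent_complete(valid_measurements, needed_measurements):
    """Completeness: 100 * V / n."""
    return 100 * valid_measurements / needed_measurements


# Hypothetical results in mg/L for one analyte.
print(round(relative_percent_difference(10.0, 12.0), 1))
print(percent_recovery(spiked_result=14.5, sample_result=10.0, spike_added=5.0))
print(percent_complete(valid_measurements=45, needed_measurements=50))
```

The acceptance limits against which these values are judged (for example, a maximum RPD for field duplicates) are project specific and belong in the QAPP.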
8.4 Field Operations
Field operations are an important activity in an NPS monitoring program. Field operations involve the
organization and design of the field operation, selection of sampling sites, selection of sampling
equipment, sample collection, sample handling and transport, and safety and training issues. For the
purposes of QA/QC, the process of conducting field operations should be broken down into as many
separate steps as are necessary to ensure complete consideration of all of the elements and processes that
are a part of field activities. Field operations described in this section have been broken down into the
phases mentioned above, but individual monitoring programs might require the use of more or fewer
phases. For example, if the sample collection phase is very complex or if it is anticipated that sample
collection will often be done under inclement weather conditions when field personnel might experience
discomfort and feel rushed, it is advisable to break sample collection into separate preparation, sampling,
and termination phases and discuss QA/QC for each of the phases separately. This will ensure that no
details are omitted.
8.4.1 Field Design
Adherence to the procedures specified in the QAPP for field operations and documentation of their use
for all aspects of field operations are extremely important if the data obtained from the project are to be
useful for decision making, supportable if questioned, and comparable for use by future researchers
(Knapton and Nimick 1991). Data sheets, for recording site visit information and field data, should be
prepared beforehand. Where applicable, data sheets should include data quality reminders to help ensure
that all data are collected and QA/QC procedures are followed during all field activities.
General information that should be included in the documentation of the design for field operations
includes the scale of the operations (laboratory, plot, hillslope, watershed); size of plots/data collection
sites; designation of control sites; basin characteristics; soil and vegetation types; maps with the location
of plots/data collection sites within the basin/catchment; weather conditions under which sampling is
conducted; equipment and methods used; problems that might be encountered during sampling; dates of
commencement and suspension of data collection; temporal gaps in data collection; frequency of data
collection; intensity of data collection; and sources of any outside information (e.g., soil types, vegetation
identifications) (Erickson et al., 1991). Some of these aspects are discussed in greater detail in the
following sections.
8.4.2 Sampling Site Selection
The selection of sampling sites is important to the validity of the results. Sites must be selected to provide
data to meet the goals/objectives of the project. The QAPP should provide detailed information on
sampling site locations (e.g., latitude and longitude); characteristics that might be important to data
interpretation (e.g., percent riparian cover, stream order); and the rationale for selecting the sites used
(Knapton and Nimick, 1991). Sites from other studies can be convenient to use due to their familiarity
and the availability of historical data, but such sites should be scrutinized carefully to be certain that data
obtained from them will serve the objectives of the project. If during the course of the project it is found
that one or more sampling sites are not providing quality data, alternative sites might be selected and the
project schedule adjusted accordingly. The adequacy of the sampling locations and the sampling program
should be reviewed periodically by project managers, as determined by data needs (Knapton and Nimick,
1991).
Sampling sites should be visited before sampling begins. It is important to verify that the sites are
accessible and are suitable for collection of the data needed. Consideration should be given to
accessibility in wet or inclement weather if samples will be taken during such conditions. The sites should
be visited, if possible, in the type(s) of weather during which sampling will occur.
Plastic-laminated pictures of each sampling site with an arrow pointing to each monitoring location can
assist field personnel in finding the sites during inclement weather when the sites might appear different.
If permission to access a site is needed (for instance, if one or more sites are on or require passage
through private property), such permission must be obtained before sampling begins. The person(s)
granting the permission should be fully informed about the number of persons who will be visiting during
each sampling event, frequency of sampling, equipment that will have to be transported to the sampling
site(s), any hazardous or dangerous materials that will be used during sampling, and any other details that
might affect the decision of the person(s) to grant access permission. A lack of full disclosure of
information to gain access permission creates a risk of the permission's being revoked at some point
during the project. A copy of the site entry permission letter or document should be taken to the site at the
time of field visit.
8.4.3 Sampling Equipment
Equipment for field operations includes field-resident equipment such as automatic samplers and stage-
level recorders and nonresident sampling equipment such as flow, pH, and conductivity meters;
equipment needed to gain access to sampling sites such as boats; and equipment for field personnel health
and safety, such as waders, gloves, and life vests. The condition and manner of use of the field equipment
determines the reliability of the collected data and the success of each sampling event. Therefore,
operation and maintenance of the equipment are important elements of field QA/QC. All measurement
equipment must be routinely checked and calibrated to verify that it is operating properly and generating
reliable results, and all access and health and safety equipment should be routinely checked to be certain
that it will function properly under all expected field conditions.
A manual with complete descriptions of all field equipment to be used should be available to all field
personnel. The manual should include such information as model numbers for all measurement
equipment, operating instructions, routine repair and adjustment instructions, decontamination techniques,
sampling preparation instructions (e.g., washing with deionized water), and use limitations (e.g.,
operating temperature range). If any samples are to be analyzed in the field, the techniques to be used
should be thoroughly described in the manual.
8.4.4 Sample Collection
A Sampling Plan should be developed and approved prior to sampling. The process of sample collection
should be described with the same amount of detail as the equipment descriptions. A thorough description
of the sample collection process includes when the sampling is to be done (e.g., time of day, month, or
year; before and/or after storms); the frequency with which each type of sample will be collected; the
location at which samples are to be taken (i.e., depth, distance from shore, etc.); the time between samples
(if sampling is done repetitively during a single sampling site visit); and how samples are to be labeled.
Each field person must be thoroughly familiar with the sampling techniques (and equipment) prior to the
first sampling event. Holding practice sampling events prior to the commencement of actual sampling is
an excellent way to prepare all field personnel and will help to identify potential problems with the
sampling sites (access, difficulty under different weather conditions), sampling equipment, and sampling
techniques.
Quality control activities for field operations must ensure that all field operations are conducted so that
sampling is done in a consistent manner and that all generated information is traceable and of known and
comparable quality. Each field activity should be standardized. Standard operating procedures (SOPs) for
field sampling have been developed and might be required depending on the agency for which the
sampling is being conducted. Elements of the field operations section of a QAPP should include clear
statements of the regulatory requirements applicable to the project. Any SOPs that are part of regulatory
requirements should be followed precisely. The pictures taken of each sampling site to aid in locating the
sampling sites also help ensure consistency of field monitoring across time and personnel by ensuring that
the same spot is used at each sampling event.
Depending on the DQOs and data requirements of the program (type of data and frequency of collection),
additional quality control samples might be needed to monitor the performance of various field (as well as
laboratory) operations including sampling, sample handling, transportation, and storage.
As the samples are collected, they must be labeled and packaged for transport to a laboratory for analysis
(or other facility for nonchemical analyses). Computer-generated sample bottle labels prepared before the
sampling event and securely attached to each bottle help minimize mistakes. Sampling location and
preservation, filtration, and laboratory procedures to be used for each sample should be recorded on each
label. Be sure these labels are printed with waterproof ink on waterproof paper, and use a No. 2 pencil or
waterproof/solvent-resistant marker to record information.
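Label text of the kind described above can be generated before a sampling event with a short script. The following is a minimal sketch: the class and function names, the sample ID scheme, and the site, preservation, and analysis values are all hypothetical, and a project would substitute the conventions specified in its own QAPP and SOPs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BottleLabel:
    sample_id: str
    site: str
    preservation: str
    filtration: str
    lab_procedure: str

    def text(self) -> str:
        # Date, time, and collector are left blank so they can be filled in
        # by hand in the field.
        return "\n".join([
            f"Sample ID: {self.sample_id}",
            f"Site: {self.site}",
            f"Preservation: {self.preservation}",
            f"Filtration: {self.filtration}",
            f"Lab procedure: {self.lab_procedure}",
            "Date/time collected: ____________",
            "Collected by: ____________",
        ])


def make_labels(site, event_code, bottles):
    """Build one label per bottle; bottles is a list of
    (preservation, filtration, lab_procedure) tuples."""
    labels = []
    for number, (preservation, filtration, lab_procedure) in enumerate(bottles, start=1):
        sample_id = f"{site}-{event_code}-{number:02d}"
        labels.append(BottleLabel(sample_id, site, preservation, filtration, lab_procedure))
    return labels


# Hypothetical bottle set for one site visit.
labels = make_labels("T-1", "EVT001", [
    ("H2SO4 to pH < 2, chill", "unfiltered", "total phosphorus"),
    ("chill only", "0.45 um filter", "dissolved nitrate"),
])
for label in labels:
    print(label.text())
    print("-" * 30)
```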
8.4.5 Sample Handling and Transport
Once samples have been collected, they must be analyzed, usually in a laboratory. Handling and transport
of sampling containers and custody of sample suites are also a part of field operations. Sample transport,
handling, and preservation must be performed according to well-defined procedures. The various persons
involved in sample handling and transport should follow SOPs for this phase of the project. This will help
ensure that samples are handled properly, comply with holding time and preservation requirements, and
are not subject to potential spoilage, cross-contamination, or misidentification.
The chain of custody and communication between the field operations and other units such as the
analytical laboratory also need to be established so that the status of the samples is always known and can
be checked by project personnel at any time. The chain of custody states who the person(s) responsible
for the samples are at all times. It is important that chain of custody be established and adhered to so that
if any problem with the samples occurs, such as loss, the occurrence can be traced and possibly rectified,
or it can be determined how serious the problem is and what corrective action needs to be taken. Field
data custody sheets are essential for this effort (Cross-Smiecinski and Stetzenback, 1994). Chain-of-
custody seals must be applied to sample containers and shipping containers.
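The idea that the responsible person is known at all times can be illustrated with a small custody log. This is a sketch under stated assumptions, not a substitute for signed custody forms: the class, its fields, and the example names and times are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class CustodyRecord:
    sample_id: str
    collected_by: str
    collected_at: datetime
    # Each transfer is stored as (time, released_by, received_by).
    transfers: list = field(default_factory=list)

    @property
    def current_custodian(self):
        return self.transfers[-1][2] if self.transfers else self.collected_by

    def transfer(self, when, released_by, received_by):
        """Record a hand-off; reject gaps or out-of-order entries."""
        last_time = self.transfers[-1][0] if self.transfers else self.collected_at
        if released_by != self.current_custodian:
            raise ValueError(f"{released_by} does not hold sample {self.sample_id}")
        if when < last_time:
            raise ValueError("transfer time precedes the previous custody event")
        self.transfers.append((when, released_by, received_by))

    def custodian_at(self, when):
        """Return who was responsible for the sample at a given time."""
        if when < self.collected_at:
            return None
        holder = self.collected_by
        for time, _, received_by in self.transfers:
            if time <= when:
                holder = received_by
        return holder


record = CustodyRecord("T-1-EVT001-01", "field technician", datetime(2024, 1, 15, 9, 30))
record.transfer(datetime(2024, 1, 15, 13, 0), "field technician", "courier")
record.transfer(datetime(2024, 1, 15, 16, 45), "courier", "laboratory sample custodian")
print(record.current_custodian)
print(record.custodian_at(datetime(2024, 1, 15, 14, 0)))
```

Rejecting a transfer from anyone other than the current custodian is the design choice that keeps the chain unbroken; a lost sample then shows up as a record whose last custodian cannot produce it.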
8.4.6 Safety and Training
In NPS monitoring, sampling activities often occur during difficult weather and field
conditions. It is necessary to assess these difficulties and establish a program to ensure the safety of the
sampling personnel. The following types of safety issues, at a minimum, should be considered and
included in training and preparation activities for sampling: exposure, flood waters, debris in rivers and
streams, nighttime collecting, criminal activity, and first aid for minor injuries. The trade-off between the
need for data quality and the safety of personnel is a factor that project staff should consider collectively.
Finally, the QAPP for the field operations should include provisions for dealing with any foreseeable
problems such as droughts, floods, frozen water, missing samples, replacement personnel during sickness
or vacation, lost samples, broken sample containers, need for equipment spare parts, and other concerns.
8.5 Laboratory Operations
Laboratory operations should be conducted with great care and attention to detail. Often, an independent
laboratory conducts sample analyses, so QA/QC for the laboratory are not under the direct control of
project personnel. However, it is important that project personnel are certain that the laboratory chosen to
do analyses follows acceptable QA/QC procedures so that the data produced meet the DQOs established
for the project. Laboratories should be selected based on quality assurance criteria established early in the
project. The Quality Assurance Officer for the project should be certain that these criteria are used for
selecting a laboratory to perform any necessary analyses for the project and that any laboratories selected
meet all criteria. Laboratories can be evaluated through the following measures (Air National Guard,
1993):
• Performing proficiency testing through analysis of samples similar to those which will be collected
during the project.
• Performing inspections and audits.
• Reviewing laboratory QA/QC plans.
One or more of these measures should be used by the project manager, and the laboratories should
be visited before entering into a contract for sample analyses.
8.5.1 General Laboratory QA/QC
EPA recommends using an accredited laboratory with an established QA/QC policy to ensure that results
will be defensible. The National Environmental Laboratory Accreditation Conference (NELAC) Institute
provides accreditation of environmental testing laboratories. Numerous references are available on
laboratory QA/QC procedures, and one or more should be consulted to gain an understanding of
laboratory QA/QC requirements if project personnel are not familiar with them already. The details of a
laboratory's QA/QC procedures must be included in the QAPP for the NPS monitoring project. Some
elements to look for in a laboratory QA/QC plan include (Cross-Smiecinski and Stetzenback, 1994):
• How samples are received
• Proper documentation of their receipt
• Sample handling
• Sample analysis
• QC requirements (procedures and frequencies of QC checks, criteria for reference materials, types
of QC samples analyzed and frequencies)
• Waste disposal
• Cleanliness and contamination
• Staff training and safety
• Data entry and reporting
• Confidentiality
This section provides some information on laboratory QA/QC procedures to which managers of
monitoring programs should pay particular attention when deciding to use a particular laboratory for
sample analysis.
8.5.2 Instrumentation and Materials for Laboratory Operations
The laboratory chosen to do chemical analyses should have all equipment necessary to perform the
analyses required, including organic analysis, inorganic analysis, and assessments of precision and
accuracy. If any specialized analyses are required (e.g., microbiology, histopathology, toxicology), be
certain that the laboratory has the appropriate equipment and that laboratory staff are adequately trained
to perform the desired analyses. As noted in the elements of the QAPP, periodic calibration checks that
are conducted to ensure that measurement systems (instruments, devices, techniques) are operating
properly should be described in the QAPP, including procedures and frequency (Cross-Smiecinski and
Stetzenback, 1994).
8.5.3 Analytical Methods
The laboratory chosen for sample analysis should use analytical methods approved by the agency for
which the sampling is being conducted or by project personnel, as appropriate. Standard methods include
those published by the U.S. Geological Survey (USGS), the USEPA, and the American Society for
Testing and Materials (ASTM), or those published in Standard Methods for Examination of Water and
Wastewater (Rice et al., 2012). A compendium of methods for environmental analysis is maintained by
the National Environmental Methods Index (NEMI), supported by both USGS and USEPA. If any
methods to be used are not published, they should first be validated and verified as acceptable for the
project. Each approved and published method should be accompanied by an SOP that is followed
rigorously by the laboratory (Pritt and Raese 1992).
8.5.4 Method Validation
The laboratory chosen for sample analysis should have well-developed procedures for method validation.
Method validation should account for and document the following (at a minimum): Known and possible
interferences; method precision; method accuracy, bias, and recovery; method detection level; and
method comparability to superseded methods, if applicable (Pritt and Raese 1992).
8.5.5 Training and Safety
An analytical laboratory should be able to assure its customers that its personnel are adequately trained to
perform the necessary analyses. Individual laboratory staff should be independently certified for each of
the analyses they will be allowed to perform in the laboratory. Selection of a laboratory for sample
analysis should be based on queries about how often training is conducted, whether employees are limited
to using equipment for which they have been adequately trained, whether the training program is
independently certified, who conducts the training, how the staff's competence with individual
instruments is measured, and other factors (Pritt and Raese 1992).
Safety for staff is an important consideration when choosing a laboratory because, aside from the paramount
concern for human well-being, accidents can seriously delay sample analyses or create a need for
resampling. Prospective laboratories should be inspected for their attention to safety procedures, including
the availability of safety equipment such as fire extinguishers, safety showers and eyewashes, fume
hoods, and ventilation systems; use and disposal practices for hazardous materials; and compliance with
environmental regulations. Safety equipment should be tested on a regular basis (Pritt and Raese 1992).
Additionally, laboratory safety includes procedures for ensuring that the laboratory is accessible only to
authorized personnel to ensure confidentiality of the data. The laboratory should have a system for
accounting for and limiting (or denying) laboratory access to all visitors, including persons affiliated with
projects for which the laboratory is analyzing samples (Pritt and Raese 1992).
8.5.6 Procedural Checks and Audits
A laboratory should have established procedures (SOPs) for conducting internal checks on its analyses
and taking corrective action when necessary. If more than one laboratory is used for sample analyses, it
will be important to know that the data obtained from the two are of the same quality and consistency. A
protocol for conducting interlaboratory comparisons should also be an element of a laboratory's QA/QC
plan. For many projects occasional samples are analyzed by a second laboratory to determine whether
there is any bias in the data associated with the primary laboratory's analyses.
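One simple way to look for bias between a primary and a second laboratory is to compare results on split samples with a paired difference. The sketch below assumes each sample was split and analyzed by both laboratories and that the differences are roughly normally distributed. The function name and the example concentrations are hypothetical, and the resulting t statistic would be compared with a tabulated critical value for n - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev


def interlab_bias(primary, secondary):
    """Mean paired difference (primary - secondary) and its t statistic."""
    if len(primary) != len(secondary) or len(primary) < 2:
        raise ValueError("need at least two paired results")
    diffs = [p - s for p, s in zip(primary, secondary)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t statistic undefined")
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t_stat


# Hypothetical split-sample results in mg/L.
primary_lab = [1.0, 2.0, 3.0, 4.0]
second_lab = [0.9, 2.1, 2.8, 3.8]
mean_diff, t_stat = interlab_bias(primary_lab, second_lab)
print(round(mean_diff, 3), round(t_stat, 3))
```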
Laboratory audits by independent auditors are normally conducted on a prescribed basis to ensure that
laboratory operations are conducted according to accepted and acceptable procedures (Cross-Smiecinski
and Stetzenback, 1994). Determining that a laboratory undergoes such audits and reviewing the audit results
might be sufficient to establish that the laboratory will be adequate for conducting analyses of samples
generated by the NPS monitoring project.
8.6 Data and Reports
It is essential during the conduct of an NPS monitoring project to document all data collected and used, to
document all methods and procedures followed, and to produce clear, concise, and readable reports that
will provide decision makers with the information they need to choose among alternative actions, as
described in the DQOs. See sections 3.9 and 3.10 for additional details on data management, reporting,
and presentation.
8.6.1 Generation of New Data
All data generated during the project, whether in the field, laboratory, or some other facility, should be
recorded. Include with the data any reference materials or citations to materials used for data analyses,
including computer programs. All computer programs used for data reduction should be validated
prior to use and verified on a regular basis. Calculations should be detailed enough to allow for their
reconstruction at a later date if they need to be verified (Cross-Smiecinski and Stetzenback 1994). Data
generated by a laboratory should be accompanied by pertinent information about the laboratory, such as
its name, address, and phone number, and names of the staff who worked directly with the project
samples.
8.6.2 Use of Historical Data
Historical data are data collected for previous projects that concerned the same resource in the same area
as the project to be implemented. Historical data sometimes contain valuable information, and their use
can save time and effort in the implementation and/or data analysis phases of a new project. Before new
data are collected, all historical data available should be obtained and their validity and usability should
be assessed. Data validity implies that individual data points are considered accurate and precise because
the field and laboratory methods used to generate the data points are known. Data usability implies that a
database demonstrates an overall temporal or spatial pattern, though no judgment of the accuracy or
precision of any individual data point is made (Spreizer et al., 1992). The validity of historical data can be
difficult to ascertain, but data usability can be assessed through a combination of graphical and statistical
techniques (Spreizer et al. 1992).
Specifically, historical data that can be shown to be either valid or usable can be applied to a new project
in the following ways (Coffey 1993, Spreizer et al. 1992, USEPA 2001c):
• If the quality (i.e., accuracy and precision) of historical data is sufficiently documented, the data can
be used alone or in combination with new data. The quality of historical data should be evaluated
relative to the project requirements.
• Characteristics derived from the historical data, such as the variability or mean of data, can be used
in the development or selection of a data collection design. Knowledge of expected variability
assists in determining the number of samples needed to attain a desired confidence level, the length
of the monitoring program needed to obtain the necessary data, and the required sampling frequency
(see section 3.4.2).
• Spatial analysis of historical data can indicate which sampling locations are most likely to provide
the desired data.
• Historical data can provide insights about past impacts and water quality that can be useful in
defining an NPS pollution problem.
• Past trends can be ascertained, and the present tendency of water quality characteristics (degrading,
stable, or improving) can be established for trend analysis (see section 7.8.2.4).
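The link between historical variability and the number of samples can be illustrated with a short sketch. It assumes the common normal-approximation formula n = (z × s / d)², where s is the standard deviation taken from historical data and d is the allowable error in the estimated mean. This formula is a simplification chosen for illustration and is not necessarily the method described in section 3.4.2. The function name and the numeric values are hypothetical.

```python
import math
from statistics import NormalDist


def samples_needed(std_dev, allowable_error, confidence=0.95):
    """Approximate sample count for estimating a mean within +/- allowable_error.

    Uses the normal approximation n = (z * s / d)^2, rounded up.
    """
    if std_dev <= 0 or allowable_error <= 0:
        raise ValueError("std_dev and allowable_error must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * std_dev / allowable_error) ** 2)


# Hypothetical historical record: standard deviation of 12 mg/L.
historical_std = 12.0
for error in (5.0, 2.5):
    print(f"allowable error {error} mg/L -> {samples_needed(historical_std, error)} samples")
```

Halving the allowable error roughly quadruples the required number of samples, which is why a realistic estimate of variability from historical data matters early in design. For small sample counts, a calculation based on the t distribution would give somewhat larger numbers.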
8.6.3 Documentation, Record Keeping, and Data Management
All information and records related to the NPS monitoring project should be kept on file and kept
current. This documentation should include:
• A record of decisions made regarding the monitoring project design
• Records of all personnel, with their qualifications, who participated in the project
• Intended and actual implementation schedules, and explanations for any differences
• A description of all sampling sites
• Field records of all sampling events, including any sampling problems and corrective actions taken
• Copies of all field and laboratory SOPs
• Equipment manuals and maintenance schedules (intended and actual, with explanations for any
discrepancies)
• Printouts from any equipment
• Sample management and custody records
• Laboratory procedures
• Copy of the laboratory QA/QC plan
• Personnel training sessions and procedures, including any training manuals or other materials
• All data generated during the project in hard copy and electronic forms
• All correspondence related to the project
• Project interim and final reports
One aspect that merits further discussion is documentation and management of data, from the collection
process through the data analysis. Data management activities include documenting the nature of the data
and subsequent analyses so that the data from different sites are comparable. Data management also
includes handling and storing both hard copies and electronic files containing field and laboratory data. A
data management system that addresses project needs should be selected at the beginning of the
monitoring program (see section 3.9). It is also important to understand and comply with applicable state
agency and/or grant policies and standards regarding data collection and generation.
Some grants might require local NPS and water resources managers to add their data to EPA's storage
and retrieval (STORET) database
(https://www.epa.gov/waterdata/storage-and-retrieval-and-water-quality-exchange).
STORET contains raw biological, chemical, and physical data on surface water and
ground water collected by federal, state, and local agencies; tribes; volunteer groups; academics; and
others. Each sampling result in STORET is accompanied by information on where the sample was taken
(latitude, longitude, state, county, hydrologic unit code, and brief site identification), when the sample
was gathered, the medium sampled (e.g., water, sediment, fish tissue), and the name of the organization
that sponsored the monitoring. Staff working with the database should have expertise and training in the
software and in the procedures for data transport, file transfer, and system maintenance.
The operation of the data management system should include QA oversight and QC procedures. If
changes in hardware or software become necessary during the course of the project, the data manager
should obtain the most appropriate equipment and test it to verify that the equipment can perform the
necessary jobs. Appropriate user instructions and system documentation should be available to all staff
using the database system. Developing spreadsheet, database, and other software applications involves
performing QC reviews of input data to ensure the validity of computed data.
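As an illustration of a QC review of input data, the sketch below defines a record holding the descriptive fields listed above for each sampling result and applies a few basic checks before the record is accepted. The class, field names, allowed media list, and checks are hypothetical. They do not represent the actual STORET schema or submission rules.

```python
from dataclasses import dataclass
from datetime import date

# Example media taken from the text above; a real project would define its own list.
ALLOWED_MEDIA = {"water", "sediment", "fish tissue"}


@dataclass
class SamplingResult:
    latitude: float
    longitude: float
    state: str
    county: str
    huc: str            # hydrologic unit code, stored as text to keep leading zeros
    site_id: str
    sample_date: date
    medium: str
    organization: str


def qc_problems(rec):
    """Return a list of QC problems found in one record (empty list if none)."""
    problems = []
    if not -90.0 <= rec.latitude <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= rec.longitude <= 180.0:
        problems.append("longitude out of range")
    if not (rec.huc.isdigit() and len(rec.huc) in (2, 4, 6, 8, 10, 12)):
        problems.append("HUC must be 2 to 12 digits with an even length")
    if rec.sample_date > date.today():
        problems.append("sample date is in the future")
    if rec.medium.lower() not in ALLOWED_MEDIA:
        problems.append("unrecognized medium")
    if not rec.organization.strip():
        problems.append("missing organization")
    return problems


# Hypothetical record for demonstration only.
record = SamplingResult(38.9, -77.0, "MD", "Prince George's", "020700100204",
                        "SITE-01", date(2015, 6, 1), "water", "Example Watershed Group")
print(qc_problems(record))
```

Returning a list of problems, rather than stopping at the first failure, lets a data manager log every issue with a record in a single pass.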
8.6.4 Report Preparation
The original project description should include a schedule and format for required reports, including the
final report. Adherence to this schedule is important to provide information and documentation of project
progress, problems encountered, and corrective actions taken. Reports are also valuable for supporting
continuation of a project if at any point during the project its continuation is scrutinized or if additional
funding must be secured to ensure its completion. Reports can also become the primary sources of
historical information on projects if there are changes in project personnel during the project. Project
managers should decide on the necessary content and format of all reports prior to commencement of the
project, and these will differ depending on funding and intended audience.
8.7 Geospatial Data
Projects should incorporate procedures for documenting geospatial data appropriately. Geographic
information system (GIS) data can vary from relatively simple site locations to complex data sets with
many overlapping contextual boundaries. For example, the development of a watershed implementation plan
may involve analyzing water samples from industrial dischargers, developing water quality models,
creating new geospatial data, or even updating existing geospatial data. Use of geospatial data from
external sources may require the development of a secondary data QAPP. QAPPs also apply to geospatial
data (USEPA 2001b), but should vary with the complexity of the project (see Table 8-3). The project
planning phase should determine the scope and complexity of the project, which will inform the complexity
of the QAPP (USEPA 2003b).
Table 8-3. Continuum of Geospatial Projects with Differing Intended Uses
Purpose of Project:
• Regulatory compliance
• Litigation
• Congressional testimony
• Regulatory development
• Spatial data development (Agency infrastructure development)
• Trends monitoring (non-regulatory)
• Reporting guidelines (e.g., Clean Water Act)
• "Proof of principle"
• Screening analyses
• Hypothesis testing
• Data display
Typical Quality Assurance Issues:
• Legal defensibility of data sources
• Compliance with laws and regulatory mandates applicable to data gathering
• Legal defensibility of methodology
• Compliance with regulatory guidelines
• Existing data obtained under suitable QA program
• Audits and data reviews
• Use of accepted data-gathering methods
• Use of accepted models/analysis techniques
• Use of standardized geospatial data models
• Compliance with reporting guidelines
• QA planning and documentation as appropriate
• Use of accepted data sources
• Peer review of products
Level of QA: shown as an arrow alongside the rows in the source table
Source: USEPA 2003b.
8.7.1 Performance Criteria for a Geospatial Data Project
Projects with geospatial components will likely follow the same DQO process described in section 8.2 of
this chapter. In decision-making programs that follow the DQO process, the data quality needed to achieve
a desired level of confidence in the decision is typically specified in forms such as the following (USEPA
2003b):
• A description of the resolution and accuracy needed in input data sources
• Statements regarding the speed of applications programs written to perform data processing
(e.g., sampling at least "n" points in "m" minutes)
• Criteria for choosing among several existing data sources for a particular geospatial theme
(e.g., land use); geospatial data needs are often expressed in terms of using the "best available"
data, but different criteria—such as scale, content, time period represented, quality, and format—
may need to be assessed to decide which are the "best available" (when more than one is available)
to use on the project
• Specifications regarding the accuracy needs of coordinates collected from GPS receivers
• Criteria for aerial photography or satellite imagery geo-referencing quality, such as specifications as
to how closely these data sources need to match spatially with ground-based reference points or
coordinates
• Criteria for address matching, including the minimum overall match rate and tolerances, whether
spatial offsets are to be supplied in the resulting coordinates and, if so, the offset factor
• Topology, label errors, attribute accuracy, overlaps and gaps, and other processing quality
indicators for map digitizing
• Criteria to be met in ground-truthing classified satellite imagery
8.7.2 Spatial Data Quality Indicators for Geospatial Data
The most comprehensive way to track the quality and applicability of a geospatial data set is through the
use of metadata. EPA requires that appropriate metadata accompany every data set, in accordance with
Federal Geographic Data Committee standards (FGDC 1998). There are five components applicable to
the Federal Geographic Data Committee metadata requirements (FGDC 1998, USEPA 2003b):
• Accuracy - positional: The closeness of the locations of the geospatial features to their true
position.
• Accuracy - attribute: The closeness of attribute values (characteristics at the location) to their true
values.
• Completeness: The degree to which the entity objects and their attributes in a data set represent all
entity instances of the abstract universe (defined by what is specified by the project's data use in
systematic planning). It is in the metadata where the user may define the abstract universe with
criteria for selecting features to include in the data set. The information is relevant to any user who
wishes to independently replicate geospatial procedures. Missing or incomplete data can affect
logical consistency needed for correct processing of data by software.
• Logical consistency: The data in any spatial data set is logically consistent when it complies with
the structural characteristics of the data model and is compatible with attribute constraints defined
for the system.
• Lineage: The description of the origin and processing history of a data set.
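Positional accuracy is often summarized as a root-mean-square error (RMSE) between measured locations and independent reference points. The sketch below shows one way to compute it. It assumes coordinates are already in a projected system with units of meters. The point values and the 5 m tolerance are hypothetical and are not criteria taken from FGDC or EPA guidance.

```python
import math


def positional_rmse(measured, reference):
    """Horizontal RMSE between paired (x, y) points, in the units of the inputs."""
    if not measured or len(measured) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical GPS readings and surveyed reference points (meters).
measured = [(100.0, 200.0), (303.0, 404.0), (500.0, 600.0)]
reference = [(103.0, 204.0), (300.0, 400.0), (500.0, 600.0)]

rmse = positional_rmse(measured, reference)
tolerance_m = 5.0  # hypothetical project criterion
print(f"RMSE = {rmse:.2f} m; meets tolerance: {rmse <= tolerance_m}")
```

Recording the computed value, the reference source, and the tolerance in the metadata lets later users judge whether the data set suits their own purposes.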
8.8 References
Air National Guard. 1993. Draft Final Remedial Investigation/Feasibility Study Work Plan. Prepared for
Wyoming Air National Guard 153rd Tactical Airlift Group, Cheyenne Municipal Airport,
Cheyenne, Wyoming, by Aepco, Inc. and Tetra Tech, Inc.
ANSI/ASQ (American National Standards Institute/American Society for Quality). 1994. Specifications
and Guidelines for Quality Systems for Environmental Data Collection and Environmental
Technology Programs. American Society for Quality Control (ASQC). ASQC Quality Press,
Milwaukee, Wisconsin.
ANSI/ASQ (American National Standards Institute/American Society for Quality). 2004. American
National Standard, Quality systems for environmental data and technology programs -
Requirements with guidance for use. American Society for Quality (ASQ). ASQ Quality Press.
Milwaukee, Wisconsin.
Coffey, S.W., J. Spooner, and M.D. Smolen. 1993. The Nonpoint Source Manager's Guide to Water
Quality and Land Treatment Monitoring. North Carolina State University, Department of
Biological and Agricultural Engineering, NCSU Water Quality Group, Raleigh, NC.
CREM (Council for Regulatory Environmental Modeling). 2009. Guidance on the Development,
Evaluation, and Application of Environmental Models. EPA/100/K-09/003. U.S. Environmental
Protection Agency, Council for Regulatory Environmental Modeling, Washington, DC.
Cross-Smiecinski, A., and L.D. Stetzenback. 1994. Quality Planning for the Life Science Researcher:
Meeting Quality Assurance Requirements. CRC Press, Boca Raton, Florida.
Drouse, S.K., D.C. Hillman, J.L. Engles, L.W. Creelman and S.J. Simon. 1986. National Surface Water
Survey. National Stream Survey (Phase 1 - Pilot, Mid-Atlantic Phase 1 Southeast Screening, and
Episodes Pilot) Quality Assurance Plan. EPA/600/4-86/044. NTIS No. PB87-145819. Prepared
for U.S. Environmental Protection Agency, Office of Research and Development, Environmental
Monitoring Systems Laboratory, Las Vegas, Nevada, by Lockheed Engineering and Management
Services Co., Inc., Las Vegas.
Erickson, H.E., M. Morrison, J. Kern, L. Hughes, J. Malcolm and K. Thornton. 1991. Watershed
Manipulation Project: Quality Assurance Implementation Plan for 1986-1989. EPA/600/3-
91/008. NTIS No. PB91-148395. Prepared for Corvallis Environmental Research Laboratory,
Oregon, by NSI Technology Services Corporation, Corvallis, OR.
FGDC (Federal Geographic Data Committee). 1998. Content Standard for Digital Geospatial Metadata.
FGDC-STD-001-1998. Federal Geographic Data Committee, Washington, DC.
Knapton, J.R., and D.A. Nimick. 1991. Quality Assurance for Water-Quality Activities of the
U.S. Geological Survey in Montana. Open File Report 91-216. U.S. Geological Survey, Helena,
Montana.
Pritt, J.W., and J.W. Raese, ed. 1992. Quality Assurance/Quality Control Manual. Open File Report
92-495. U.S. Geological Survey, National Water Quality Laboratory, Reston, Virginia.
Rice, E.W., R.B. Baird, A.D. Eaton, and L.S. Clesceri, ed. 2012. Standard Methods for the Examination
of Water and Wastewater. 22nd ed. American Public Health Association, American Waterworks
Association, Water Environment Federation. American Public Health Association Publications,
Washington, DC.
Spreizer, G.M., T.J. Calabrese and R.S. Weidner. 1992. Assessing the Usability of Historical Water
Quality Data for Current and Future Applications. In Current Practices in Ground Water and
Vadose Zone Investigations, ASTM STP 1118. Ed. D.M. Nielsen and M.N. Sara, pp. 377-390.
American Society for Testing and Materials, Philadelphia, Pennsylvania.
USEPA (U.S. Environmental Protection Agency). 1983. Interim Guidelines and Specifications for
Preparing Quality Assurance Project Plans. EPA-600/4-83-004. QAMS-005/80. U.S.
Environmental Protection Agency, Office of Monitoring Systems and Quality Assurance, Office
of Research and Development, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2000a. EPA Quality Manual for Environmental
Programs. CIO 2105-P-01-0. U.S. Environmental Protection Agency, Office of Environmental
Information. Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2000b. Policy and Program Requirements for the
Mandatory Agency-wide Quality System. CIO 2105.0. U.S. Environmental Protection Agency,
Office of Environmental Information, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2001a. EPA Data Quality Objectives Decision Error
Feasibility Trials Software (DEFT) - USER'S GUIDE, EPA QA/G-4D. EPA/240/B-01/007. U.S.
Environmental Protection Agency, Office of Environmental Information. Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2001b. EPA Requirements for Quality Assurance
Project Plans, EPA QA/R-5. EPA 240/B-01/003. U.S. Environmental Protection Agency, Office
of Environmental Information. Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2001c. EPA Requirements for Quality Management
Plans, EPA QA/R-2. EPA/240/B-01/002. U.S. Environmental Protection Agency, Office of
Environmental Information. Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2002a. Guidance for Quality Assurance Project Plans,
EPA QA/G-5. EPA 240/R-02/009. U.S. Environmental Protection Agency, Office of
Environmental Information, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2002b. Guidance for Quality Assurance Project Plans
for Modeling, EPA QA/G-5M. EPA/240/R-02/007. U.S. Environmental Protection Agency,
Office of Environmental Information, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2003a. Elements of a State Water Monitoring and
Assessment Program. EPA 841-B-03-003. U.S. Environmental Protection Agency, Office of
Water, Office of Wetlands, Oceans and Watersheds, Assessment and Watershed Protection
Division, Washington, DC. Accessed January 28, 2016.
USEPA (U.S. Environmental Protection Agency). 2003b. Guidance for Geospatial Data Quality
Assurance Project Plans, EPA QA/G-5G. EPA/240/R-03/003. U.S. Environmental Protection
Agency, Office of Environmental Information, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2006. EPA Guidance on Systematic Planning Using
the Data Quality Objectives Process, EPA QA/G-4. EPA/240/B-06/001. U.S. Environmental
Protection Agency, Office of Environmental Information. Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2007. Guidance for Preparing Standard Operating
Procedures (SOPs), EPA QA/G-6. EPA 600/B-07/001. U.S. Environmental Protection Agency,
Office of Environmental Information, Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2008a. NRMRL QAPP Requirements for Secondary
Data Projects. U.S. Environmental Protection Agency, National Risk Management Research
Laboratory, Cincinnati, OH.
USEPA (U.S. Environmental Protection Agency). 2008b. U.S. Environmental Protection Agency Quality
Policy. CIO 2106.0. U.S. Environmental Protection Agency, Office of Environmental
Information. Washington, DC.
USEPA (U.S. Environmental Protection Agency). 2012. Guidance for Evaluating and Documenting the
Quality of Existing Scientific and Technical Information Addendum to: A Summary of General
Assessment Factors for Evaluating the Quality of Scientific and Technical Information. U.S.
Environmental Protection Agency, Science and Technology Policy Council, Washington, DC.
-------
Monitoring and Evaluating Nonpoint Source Watershed Projects Chapter 9
9 Monitoring Costs
By S.A. Dressing and D.W. Meals
9.1 Introduction
Monitoring plans must be designed to help achieve watershed project or program goals. This could be a
relatively simple task with an unlimited budget, but perhaps the most frequently cited problems for those
who design and carry out water quality monitoring programs are the limitations and unpredictability of
funding. Although cost should not be the defining factor in the design of monitoring plans, it must be
considered from the start. Both "cheap" monitoring programs that are inadequate to achieve project
objectives and great monitoring programs that are discontinued because funding disappears are worse than
no monitoring at all because much or all of the money spent is essentially wasted.
While funding can almost never be guaranteed over the course of a multi-year monitoring effort, careful
cost analysis at the beginning can help design a monitoring plan that will meet objectives and fit within a
cost range that can be sustained until the project ends. In some cases, project budgets might be
insufficient to carry out meaningful monitoring; in such cases, monitoring should not be done. In all other
cases, project staff must seek a balance that provides the ability to achieve monitoring objectives that are
supportive of project or program goals at an affordable cost.
Although an exact monitoring budget will be highly specific to the setting of a particular project,
monitoring costs can be estimated reasonably well as part of project planning. Even a very good cost
estimate, however, will miss the mark on category-specific costs. For example, sampling trips may take
more or less time than anticipated, equipment costs can change drastically if equipment is washed away
or needed equipment suddenly becomes available from a discontinued monitoring effort, or data analysis
and reporting requirements change under new management or because of unexpected findings or
additional requests for information. While the total budget allotted to a monitoring project may not
change, projects should maintain flexibility to shift resources within a budget to ensure that project
objectives are met with maximum cost efficiency.
In this chapter, potential monitoring costs for the types of monitoring described in this guidance
document are illustrated using a spreadsheet tool that has been developed to estimate monitoring costs for
nonpoint source watershed projects (Dressing 2012, 2014). Two user-editable versions of the spreadsheet
can be downloaded at this site:
(https://www.epa.gov/polluted-runoff-nonpoint-source-pollution/monitoring-and-evaluating-nonpoint-source-watershed).
The master spreadsheet allows users to
determine every detail in their cost estimation, whereas the simplified spreadsheet includes default
assumptions for monitoring designs, sampling types, and parameters, as well as basic algorithms to allow
users to generate cost estimates with as little input as possible. See Appendix 9-1 for additional details on
the cost estimation spreadsheets.
9.2 Monitoring Cost Items and Categories
A complete accounting of monitoring costs begins with watershed characterization and development of a
QAPP (see chapter 8) and ends with data analysis (see chapter 7) and reporting. Costs incurred by
monitoring efforts can include monitoring site selection, construction of monitoring stations, installation
and setup of monitoring equipment, sample collection, laboratory analysis, and the ultimate removal of
monitoring sheds at the conclusion of the project (see chapters 2 and 3 for details on monitoring designs).
Some monitoring efforts include the cost of contracts or grants for monitoring support.
Specific cost items can be grouped and summarized in many (and often overlapping) ways, including the
categories shown in Table 9-1. These categories are basically people and things, whereas the categories
shown in Table 9-2 are organized by project phases and key project elements. Some costs are incurred
once during a project (e.g., site establishment) while others are recurring (e.g., sampling site visits), so
annual costs often vary, particularly for the first and last years of a project.
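The effect of one-time and recurring costs on annual budgets can be sketched in a few lines. The function and the dollar figures below are hypothetical and are not taken from the cost estimation spreadsheets. The sketch assumes that startup costs fall in the first year and closeout costs (e.g., station removal and the final report) fall in the last year.

```python
def annual_costs(years, startup, recurring, closeout):
    """Return a list of yearly costs for a project lasting `years` years.

    startup   -- one-time costs incurred in the first year
    recurring -- costs incurred every year
    closeout  -- one-time costs incurred in the last year
    """
    if years < 1:
        raise ValueError("years must be at least 1")
    costs = [recurring] * years
    costs[0] += startup
    costs[-1] += closeout
    return costs


# Hypothetical 5-year project, costs in $1,000.
schedule = annual_costs(years=5, startup=20.0, recurring=15.0, closeout=5.0)
for year, cost in enumerate(schedule, start=1):
    print(f"Year {year}: {cost:.0f}")
print(f"Total: {sum(schedule):.0f}")
```

In this example the first and last years cost more than the middle years even though the recurring effort is constant. That is the pattern described above.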
Table 9-1. Costs grouped by type of item or activity
Cost Category | Items Included in Category
Labor | All labor costs (inclusion of fringe benefits optional).
Installed Structures | Materials and labor costs.
Other Site Establishment Costs | One-time fee, electricity connection, setup, etc.
Purchased Equipment | All purchased monitoring equipment.
Rental Equipment | All rented monitoring equipment.
Monitoring Supplies | All startup and annual monitoring supplies.
Office Equipment | All purchased office equipment.
Office Supplies | All startup and annual office supplies.
Travel/Vehicles | All use of vehicles for travel, construction, sample pickup, etc.
Laboratory Analysis | Annual laboratory analysis.
Data Purchases | Maps, data, satellite & aerial photography.
Printing/Media | Printing and other report output media (e.g., CD, web).
Electricity/Fuel | All fuel and power costs for operating sites.
Site Service and Repair | All service, repair, and replacements of sites and equipment.
Annual Site Fees | All annual fees for site access. Does not include initial fee.
Contracts | All non-itemized contracts costs.
Grants | All non-itemized grants costs.
Table 9-2. Costs grouped by project phase or element
Cost Category | Items Included in Category
One-Time Costs
Proposal and QAPP | Cost for development of proposal and QAPP or equivalent document (added to Year 1 cost).
Watershed Characterization | Cost for characterization of watershed to aid monitoring design (added to Year 1 cost). Includes windshield surveys and analysis of existing data and maps.
Site Establishment | Includes one-time costs for setting up station, including purchase of equipment that remains at site. Site selection, preparation, and excavation costs are all included.
Portable Sampling Equipment and Startup Supplies Costs | Includes one-time costs for all portable sampling equipment or instruments that are taken to the site for use and then taken away for use at another site or time. Equipment includes such items as kick nets, pH meters, etc. Also includes one-time cost for initial purchase of supplies such as pipettes, vials, and bottles.
One-Time Office Equipment and Startup Supplies Costs | Includes computer hardware and software and related items.
Station Demolition and Site Restoration | Includes all costs associated with tearing down the station and restoring the site at the end of the project.
First-Year Report | This is the cost for data analysis and writing, printing, and distribution of the first-year report. Data analysis and reporting can be combined or kept separate.
Final Report | This is the cost for data analysis and writing, printing, and distribution of the final report. Data analysis and reporting can be combined or kept separate.
Annual Costs
Access Fees | Any fees paid to landowners for allowing access to the site.
Sampling Trips to Sites | Includes labor, vehicle use, and other equipment (e.g., boat) costs for site visits.
Volunteer Training | Annual cost to train volunteers or others collecting data for the project.
Sample Analysis | Cost for laboratory analysis of samples. Includes travel to and from laboratory if done in addition to sampling trip travel. Can include costs for shipping samples to laboratories as "Other" cost.
Annual Data Analysis and Reports | This is the cost for annual analysis of project data and annual or more frequent reporting in years other than the first and last year. Includes labor and materials. Data analysis and reporting can be combined or kept separate.
Site Operation and Maintenance | Includes service/repair/replacement of equipment and structures, electric and fuel bills (e.g., for heating), and annual cost to establish and update stage/discharge relationship.
Supplies and Rental Equipment | This cost is primarily for consumable supplies (e.g., sample preservative), but can include sample bottles and other items. Also includes rental equipment and office supplies.
Land Use Tracking | Labor, travel, and services (e.g., aerial photography or data purchase) needed to track land use and land treatment.
Total Cost of Monitoring | Total cost of monitoring for the entire project period.
9.3 Cost Estimation Examples
The cost spreadsheets have been used to estimate costs for a wide variety of monitoring designs and
applications. The cost estimates highlighted here were developed for three different purposes. First, the
master spreadsheet was used to provide a range of estimates for a diverse set of monitoring options, with
estimated costs generated for eight different monitoring scenarios covering a wide range of timeframes
(see section 9.3.1). The ten cost estimates summarized in section 9.3.2 cover various monitoring
approaches relevant to assessing the watershed-scale water quality impacts of programs such as USDA's
National Water Quality Initiative (NWQI). Finally, the simplified spreadsheet was used to estimate costs
for 60 basic, 5-year monitoring scenarios that are summarized in section 9.3.3. It is important to note that
assumptions regarding the need and cost for labor, equipment, monitoring parameters, sampling
frequency, and sampling duration are all important determinants of the final cost estimates, so costs are
presented in this section more for a comparative analysis than as accurate estimates for any specific
monitoring type or effort. The examples are particularly useful to contemplate trade-offs among cost
categories and to evaluate where cost-effectiveness can be improved, e.g., offsetting high labor costs with
the purchase of automated equipment.
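The trade-off between labor and automated equipment mentioned above can be explored with a simple break-even calculation. The sketch below is only an illustration. The function and all cost figures are hypothetical and do not come from the cost estimation spreadsheets. It ignores factors such as equipment lifespan, data quality differences, and discounting.

```python
def break_even_years(equipment_cost, manual_labor_per_year,
                     automated_labor_per_year, maintenance_per_year=0.0):
    """Years of operation needed for automated equipment to pay for itself.

    Returns None when automation does not reduce annual costs.
    """
    annual_savings = (manual_labor_per_year
                      - automated_labor_per_year
                      - maintenance_per_year)
    if annual_savings <= 0:
        return None
    return equipment_cost / annual_savings


# Hypothetical figures in dollars.
years = break_even_years(
    equipment_cost=12_000,
    manual_labor_per_year=8_000,
    automated_labor_per_year=3_000,
    maintenance_per_year=1_000,
)
print(f"Break-even after {years:.1f} years")
```

If the break-even time exceeds the planned monitoring period, the purchase is unlikely to improve cost-effectiveness on labor savings alone.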
9.3.1 Cost Estimates for a Diverse Range of Monitoring Options
Cost estimates for the following eight monitoring scenarios are presented in this section.
1. Synoptic Survey
2. TMDL - Water Quality Standards
3. TMDL - Loads
4. Paired-Watershed - Loads
5. Long-term Single Station - Biomonitoring
6. Above/Below BMP Effectiveness - Biomonitoring
7. Input/Output Urban Low Impact Development (LID) Effectiveness
8. Photo-Point Monitoring
These eight scenarios were chosen to represent a wide range of monitoring approaches, addressing both
problem assessment and project evaluation, using chemical, physical, and biological (Barbour et al. 1999)
monitoring methods. With the exception of 1-year synoptic surveys (Scenario 1), costs are estimated for
1, 2, 5, and 8 years. A more detailed comparison of Scenarios 2-8 is based on five-year cost estimates. See
Appendix 9-2 for additional details on these eight scenarios.
9.3.1.1 Discussion
Table 9-3 summarizes the total costs for each scenario for 1, 2, 5, and 8 years. Cost totals are taken from
the base scenarios in which all equipment is purchased and all monitoring is stand-alone; that is, there are
no cost savings assumed for monitoring activities that may be combined with other activities (e.g.,
another monitoring effort in the same area) to save on travel or labor. It should be no surprise that
biological (Scenarios 5 and 6) and photo-point (Scenario 8) monitoring are the least expensive monitoring
approaches in this analysis. Sampling frequency (2x/year) for biological and photo-point monitoring is far
less than is assumed for water quality monitoring and load estimation, and laboratory and equipment costs
are generally lower as well.
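One rough way to compare scenarios is to look at how much each additional year adds to the total. The sketch below uses the Table 9-3 totals (in $1,000) and divides the difference between the 8-year and 5-year totals by three. This is a simplification introduced here for illustration. It assumes one-time costs are the same for both timeframes, which the spreadsheet does not necessarily guarantee. The variable and function names are hypothetical.

```python
# Total costs in $1,000 for 1, 2, 5, and 8 years, from Table 9-3.
# The 1-year synoptic survey is omitted because it has no multi-year totals.
TABLE_9_3 = {
    "2. TMDL WQS": (47, 90, 215, 339),
    "3. TMDL Loads": (62, 107, 238, 368),
    "4. Paired-Watershed Load": (93, 158, 348, 537),
    "5. Long-Term Biological": (16, 26, 53, 80),
    "6. Above/Below BMP Effectiveness - Biological": (17, 28, 58, 88),
    "7. Input/Output Urban LID Effectiveness": (68, 115, 252, 388),
    "8. Photo-Point Monitoring - Qualitative Analysis": (8, 11, 19, 26),
    "8. Photo-Point Monitoring - Quantitative Analysis": (25, 39, 75, 111),
}


def added_cost_per_year(totals):
    """Average cost added by each year between the 5-year and 8-year totals."""
    _, _, five_year, eight_year = totals
    return (eight_year - five_year) / 3


added = {name: added_cost_per_year(t) for name, t in TABLE_9_3.items()}
for name, value in sorted(added.items(), key=lambda item: item[1]):
    print(f"{name}: about {value:.1f} per added year")
```

Sorting by this value gives a quick view of which designs are least costly to sustain. It says nothing about whether a design can meet a given monitoring objective.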
While total cost provides the best measure for comparing the costs for alternative monitoring designs, the
breakout of costs by category gives a better picture of where cost savings can be found within each
monitoring design. For example, labor accounted for the greatest share of total costs in all five-year
scenarios, ranging from 68 percent for Scenario 7 (urban LID) to 90 percent for quantitative photo-point
monitoring (Figure 9-1). Labor accounted for only 45 percent of the total cost for the 1-year synoptic
survey.
Equipment costs ranged from 2 percent for Scenario 2 (TMDL water quality standards) to 12 percent of
total 5-year costs for qualitative photo-point monitoring. About 45 percent of the 1-year budget for
synoptic surveys was devoted to equipment. Laboratory analysis costs accounted for 16 percent of total
5-year costs for Scenario 7 (urban LID), 9 percent of the 1-year cost for a synoptic survey, and 5 percent
of the 5-year cost for Scenario 2 (TMDL water quality standards), but were responsible for less than
1 percent of costs for all other scenarios.
Vehicle (mileage) costs ranged from 1 percent for Scenario 1 and quantitative photo-point monitoring to
10 percent of total 5-year costs for Scenario 2. Both Scenario 3 and Scenario 7 had 5-year budgets in
which vehicle costs accounted for 9 percent of the total cost.
It is important to ensure that whatever monitoring approach is used will provide the type, quality, and
quantity of information necessary to meet project monitoring objectives. Despite its lower cost, photo-
point monitoring will usually not be appropriate as a stand-alone monitoring approach for tracking
progress in achieving a TMDL. Likewise, biological monitoring cannot be used to estimate pollutant
loads. On the other hand, weekly grab sampling for water chemistry may be wasteful if monitoring is
intended to track attainment of aquatic life support, and photo-point monitoring could be appropriate for a
trash TMDL such as that established for the Anacostia River (MDOE and DDOE 2010).
Table 9-3. Summary of scenario costs for diverse range of monitoring options

                                                        Total Cost ($1,000)
Scenario                                         1 Year   2 Years   5 Years   8 Years
1. Synoptic Survey                                   30       n/a       n/a       n/a
2. TMDL WQS                                          47        90       215       339
3. TMDL Loads                                        62       107       238       368
4. Paired-Watershed Load                             93       158       348       537
5. Long-Term Biological                              16        26        53        80
6. Above/Below BMP Effectiveness - Biological        17        28        58        88
7. Input/Output Urban LID Effectiveness              68       115       252       388
8. Photo-Point Monitoring - Qualitative Analysis      8        11        19        26
8. Photo-Point Monitoring - Quantitative Analysis    25        39        75       111
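The totals in Table 9-3 can be used to separate first-year costs from recurring costs. The following
is a minimal sketch, not part of the cost estimation spreadsheets: it copies the table values (in
$1,000) and computes the average cost of each year added between the 1-year and 8-year totals. The
function name and dictionary layout are assumptions made for illustration.

```python
# Total costs from Table 9-3 in $1,000, keyed by monitoring duration (years).
# The synoptic survey is omitted because only a 1-year cost is given.
TABLE_9_3 = {
    "2. TMDL WQS": {1: 47, 2: 90, 5: 215, 8: 339},
    "3. TMDL Loads": {1: 62, 2: 107, 5: 238, 8: 368},
    "4. Paired-Watershed Load": {1: 93, 2: 158, 5: 348, 8: 537},
    "5. Long-Term Biological": {1: 16, 2: 26, 5: 53, 8: 80},
    "6. Above/Below BMP Effectiveness - Biological": {1: 17, 2: 28, 5: 58, 8: 88},
    "7. Input/Output Urban LID Effectiveness": {1: 68, 2: 115, 5: 252, 8: 388},
    "8. Photo-Point Monitoring - Qualitative": {1: 8, 2: 11, 5: 19, 8: 26},
    "8. Photo-Point Monitoring - Quantitative": {1: 25, 2: 39, 5: 75, 8: 111},
}


def avg_added_cost_per_year(totals, start=1, end=8):
    """Average cost ($1,000) of each year added between two durations."""
    return (totals[end] - totals[start]) / (end - start)


for name, totals in TABLE_9_3.items():
    added = avg_added_cost_per_year(totals)
    print(f"{name}: year 1 = {totals[1]}, each added year = {added:.1f}")
```

A first-year cost that is well above the average cost of an added year points to start-up items such
as equipment purchase and site establishment.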
[Figure 9-1 graphic not reproduced. The chart shows the share of total cost by category (Labor,
Equipment, Laboratory Analysis, Vehicles, Monitoring Structures, Supplies) for each monitoring
option: Above/Below Biological, LID, Paired-Load, Synoptic, Photo Qualitative, TMDL-Load, Long-Term
Biological, Photo Quantitative, and TMDL-WQS.]
Figure 9-1. Breakout of costs for diverse range of monitoring options
9.3.2 Cost Estimates for Watershed-Scale Evaluation of Agricultural BMP Implementation
This cost analysis was performed to explore different options for planning and assessing the water quality
impacts of watershed-scale implementation of agricultural BMPs. The setting assumed for the cost
scenarios is a 12-digit HUC watershed covering 10,117 ha (25,000 ac), primarily in agricultural use.
Monitoring is performed in perennial streams with the exception of the paired-watershed scenario that
assumes intermittent flow.
Cost estimates were generated for a total of 84 scenarios, including synoptic surveys, compliance
monitoring, soil testing, multiple-watershed monitoring, and paired-watershed, trend, and above/below
monitoring. Cost estimates were developed for three different driving distances to the watershed to
illustrate how that factor influences costs, particularly the labor share of total costs. Three timeframes
were considered (three, five, and seven years) for all but synoptic surveys which were assumed to be
completed within one year.
For simplicity, all labor was assumed to be performed by contractors, but this may not be affordable in
many situations. Pay rates assumed (including fringe and overhead) and basic job functions are
summarized in Table 9-4. Rates for government or university employees and volunteers would clearly
differ, and contractor rates would vary depending on location.
Additional assumptions about number of sampling sites, monitoring frequency, monitoring variables, and
various other aspects of the monitoring designs are documented in Appendix 9-3.
Table 9-4. Labor costs assumed for watershed-scale evaluation scenarios

Pay Level   Rate ($/hr)¹   Job Functions
4           130            Monitoring design, statistical analysis, oversight, etc.
3           80             Lead field person for monitoring, data collection, bulk of writing
2           56             Field technician, lab tech, etc.
1           34             Secretarial and support staff
¹ Includes fringe and overhead.
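To show how the rates in Table 9-4 translate into a labor charge, the sketch below prices a single
sampling trip. The rates come from the table; the staffing mix and hours are hypothetical values
chosen only for illustration and are not taken from Appendix 9-3.

```python
# Hourly rates from Table 9-4 (fringe and overhead included), by pay level.
RATES = {4: 130, 3: 80, 2: 56, 1: 34}


def labor_cost(hours_by_level):
    """Sum rate x hours over the pay levels that work on a task."""
    return sum(RATES[level] * hours for level, hours in hours_by_level.items())


# Hypothetical one-day sampling trip: a lead field person (level 3) and a
# field technician (level 2), 8 hours each. Hours are illustrative only.
trip_cost = labor_cost({3: 8, 2: 8})
print(f"Labor for one trip: ${trip_cost:,}")
```

Under these assumed hours, 26 trips per year would cost 26 times the single-trip figure before any
time for data analysis or reporting is added.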
9.3.2.1 Discussion
Results for 5-year monitoring efforts are summarized in Figure 9-2. Not shown in this figure are 1-year
synoptic surveys which had the lowest cost, ranging from $12,000 to $18,000 depending on distance
traveled to the watershed. The low cost of synoptic surveys compared to the cost of other scenarios
indicates that they can be a very good investment for generating additional information to support final
decisions on both the land treatment plan and long-term monitoring design.
Compliance monitoring is also relatively inexpensive as defined in these scenarios, ranging from $21,000
to $55,000 for 5-year efforts depending on distance traveled. The cost for a soil testing program ranges
from $32,000 to $50,000 for five years with a far smaller influence of distance traveled on total cost
compared to compliance monitoring. This is because soil testing requires a large amount of time
collecting samples at the site, whereas sampling for compliance monitoring is relatively quick once the
site is reached.
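The reasoning above can be illustrated with a simple person-hours model. This is a sketch under
stated assumptions, not the method used in the spreadsheets: the 80 km/h driving speed, the two-person
crew, and the on-site hours are hypothetical, while 483 km is the "Far" distance used in this analysis.

```python
def trip_labor_hours(one_way_km, onsite_hours, speed_kmh=80.0, crew=2):
    """Person-hours for one trip: round-trip driving plus time on site."""
    driving_hours = 2 * one_way_km / speed_kmh
    return crew * (driving_hours + onsite_hours)


def far_to_in_ratio(onsite_hours, far_km=483):
    """Labor hours for a distant team relative to a team based in the watershed."""
    return trip_labor_hours(far_km, onsite_hours) / trip_labor_hours(0, onsite_hours)


quick_visit = far_to_in_ratio(onsite_hours=1)  # short site visits (assumed 1 hour)
long_visit = far_to_in_ratio(onsite_hours=8)   # long site visits (assumed 8 hours)
print(f"Far/In labor ratio: quick visit {quick_visit:.1f}x, long visit {long_visit:.1f}x")
```

The longer the time spent on site, the smaller the relative effect of driving distance on labor, which
is the pattern described for soil testing versus compliance monitoring.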
[Figure 9-2 graphic not reproduced. The chart shows 5-year cost estimates by scenario and by distance
(In, Near, Far).]
In=contractor inside watershed, Near=contractor 241 km (150 miles) away,
and Far=contractor 483 km (300 miles) away.
Figure 9-2. Cost estimates for watershed-scale assessment of agricultural BMP projects
Trend monitoring costs can range from $68,000 to $275,000 for a 5-year effort with grab sampling to a
range of $172,000 to $630,000 for a 5-year effort with automated sampling and pollutant load estimation.
Although not shown in Figure 9-2, this analysis presents an interesting choice between a 7-year grab
sampling effort ($92,000-$382,000) and a 3-year load estimation effort ($112,000-$391,000) for trend
analysis. This cost information coupled with an MDC analysis (see section 9.4) could lead to
cost-effective solutions to monitoring needs.
The cost of above/below monitoring ranges from $152,000 to $553,000 for a 5-year grab sampling effort
to $268,000 to $799,000 for a 5-year load estimation effort. Costs for above/below monitoring designs are
roughly twice the cost of the parallel trend monitoring designs for grab sampling, but can be much less
than double the cost for load estimation. For example, comparing 5-year costs for above/below with trend
concentration monitoring shows that the "near" cost for above/below ($329,000) is about twice the "near"
cost for the trend design ($159,000). However, the 5-year cost for above/below load monitoring
($466,000) is far less than double the cost for trend load monitoring ($371,000). The different patterns are
largely explained by the costs for site establishment and automated sampling equipment for load
estimation.
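The comparison in the preceding paragraph can be checked directly from the four "near" costs quoted
there. The snippet below is only a worked-arithmetic sketch; the dictionary keys are labels chosen
for illustration.

```python
# "Near" 5-year costs quoted in the text, in dollars.
costs = {
    "concentration": {"trend": 159_000, "above_below": 329_000},
    "load": {"trend": 371_000, "above_below": 466_000},
}

# Ratio of above/below cost to trend cost for each monitoring type.
ratios = {kind: c["above_below"] / c["trend"] for kind, c in costs.items()}

for kind, ratio in ratios.items():
    print(f"{kind}: above/below costs {ratio:.2f} times the trend design")
```

The ratio is about 2.1 for concentration monitoring and about 1.3 for load monitoring.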
Costs for paired-watershed monitoring (loads) are similar to those for above/below monitoring in this analysis.
Costs ranged from $176,000 to $455,000 for a 5-year effort on an intermittent stream to $294,000 to
$824,000 for 5 years on a perennial stream. The major difference between paired-watershed and
above/below monitoring costs is the travel between watersheds and larger area involved in land
use/treatment tracking for paired-watershed monitoring.
The cost of monitoring 20 subwatersheds in a multiple-watershed design is estimated to range from
$125,000 to $185,000 for a 5-year effort. Grab sampling is assumed for multiple-watershed monitoring
scenarios in this analysis.
For scenarios assuming 5 years of monitoring and the "near" distance (monitoring team 241 km from
watershed), labor consumes 72 to 86 percent of total cost estimates. The proportion of total costs devoted
to labor often changes with project duration, however, as illustrated in Figure 9-3. In this comparison, the
labor share of cost decreases with increasing monitoring duration for soil testing (assuming 20 sites), but
increases for a paired study measuring loads on a perennial stream. The different trends result primarily
from differences in first-year costs. The paired design assumes significant and approximately equal
labor and equipment costs for site establishment and purchased equipment, while the soil testing
design assumes substantial labor cost to select sites via desktop analysis. It should be noted that for
both scenarios total labor costs increase over time, whereas equipment, site selection, and site
establishment costs are incurred in the first year only.
[Figure 9-3 graphic not reproduced. The chart shows the labor share of total cost (percent) after 1, 3,
5, and 7 years for two scenarios: Paired Load Perennial and Soil Testing.]
Figure 9-3. Comparison of labor cost category percentage over time
9.3.3 Cost Estimates for Five-Year Trend and Above/Below Monitoring
Cost estimates were generated for 160 scenarios that address two different designs (trend and
above/below); four different monitoring variable sets (nutrient and sediment grab samples [NSC],
nutrient and sediment loads [NSL], biological/habitat with kick net [BioK], and sondes for nutrients
and turbidity [SNT]); four watershed sizes (202; 2,023; 10,117; and 20,234 ha)¹; and five different
distances to the watershed (0, 40, 80, 121, and 161 km)². All scenarios assume 5 years of monitoring,
with sampling twice per year for biological monitoring and 26 times per year for all other
monitoring. In addition to sample collection and analysis, total monitoring costs also include
watershed characterization, site establishment, land use/treatment tracking, data analysis, and reporting.
This cost analysis was designed to test application of the simplified spreadsheet to the designs most
commonly used by NPS watershed projects. See Appendix 9-4 for additional details.
¹ 500; 5,000; 25,000; and 50,000 acres
Additional scenarios using all combinations of the following conditions were run to illustrate how
assumptions on salary and equipment affect total cost estimates:
• Labor cost of $0 and salary adjustment factors of 0.5, 0.7, and 1 (baseline).
• Purchase of all equipment (baseline) and equipment cost of $0.
These scenarios were run for a 2,023-ha (5,000 ac) watershed where the monitoring team was 80 km
(50 mi) from the watershed, parameters that best represent the median total costs for each design and
variable set.
9.3.3.1 Discussion
Figure 9-4 summarizes the results from this analysis. The box plots on the top show clearly that load
estimation (NSL) is the most expensive approach when compared to concentration monitoring with grab
samples (NSC) and the use of sondes for nutrients and turbidity (SNT). Biological monitoring (BioK) is
the cheapest option overall, but sampling is only done twice per year versus the assumed 26 times per
year for the other three options. Above/below monitoring is more expensive than trend monitoring for all
variable sets because there are twice as many stations. The cost, however, is less than double because of
efficiencies in labor, travel, analysis and other cost categories. It should be noted that paired designs
would have costs similar to those for the above/below design.
When costs are reduced to cost per sampling trip to each monitoring site (bottom of Figure 9-4),
biological monitoring is by far the most expensive approach of the scenarios considered. This is due
primarily to the fact that only 2 samples are collected each year versus 26 samples per year for the other
scenarios. Load monitoring is more expensive than both concentration and sonde monitoring. This figure
also points out the cost efficiency of above/below versus trend monitoring when using a biological
approach; the extra site is relatively inexpensive. Readers should keep in mind that, as described above,
total costs include more than just sample collection and analysis.
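The cost per sampling trip to each site is the total cost divided by the number of site visits. The
sketch below illustrates why a low sampling frequency raises that figure. The sampling frequencies
(2 and 26 per year) and the 5-year duration are from the text; the dollar totals and the function name
are hypothetical and are not values read from Figure 9-4.

```python
def cost_per_site_visit(total_cost, sites, samples_per_year, years=5):
    """Total cost divided by the number of site visits over the project."""
    visits = sites * samples_per_year * years
    return total_cost / visits


# Hypothetical 5-year totals for an above/below design (2 sites).
bio = cost_per_site_visit(100_000, sites=2, samples_per_year=2)    # 20 visits
grab = cost_per_site_visit(250_000, sites=2, samples_per_year=26)  # 260 visits

print(f"Biological: ${bio:,.0f} per visit; grab sampling: ${grab:,.0f} per visit")
```

Even though the assumed biological total is less than half the assumed grab sampling total, spreading
it over 20 visits rather than 260 gives a much higher cost per visit.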
In all cases examined here, labor accounted for the largest share of costs, ranging from 63 percent to 84 percent
of total cost (66 percent to 85 percent if labor for analysis of biological samples is included). Competitive
contractor rates were assumed for labor, but the importance of labor costs can vary greatly because
monitoring efforts may use far less expensive staff (e.g., volunteers) or assume that labor is not an
additional cost because in-house staff are used.
Labor generally accounted for a larger share of total costs for scenarios that required less equipment,
ranging from 63 percent to 74 percent for biological (74-85 percent including analysis of biological
samples) and 74 percent to 84 percent for nutrient/sediment concentration monitoring. A slightly lesser
share of total cost was devoted to labor in cases where sondes were assumed (66-82 percent) or loads
were estimated with continuous flow measurement and automatic sampling (67-81 percent). Despite the
greater importance of labor in costs for biological monitoring, Figure 9-5 illustrates that the dollar amount
is still far less than for other monitoring options, whether labor for analysis of biological samples is
included (BioK-A) or not (BioK).
² 0, 25, 50, 75, and 100 miles
[Figure 9-4 graphic not reproduced. Two box-plot panels grouped by variable set (BioK, NSC, NSL, SNT)
and design (Above/Below, Trend). Top panel: total cost. Bottom panel: cost per sampling trip to each
monitoring site.]
BioK=Biological monitoring with kick net; NSC=Nutrient and sediment concentration; NSL=Nutrient and sediment load;
SNT=sondes for nutrients and turbidity
Figure 9-4. Box plots summarizing cost estimates for five-year monitoring efforts
[Figure 9-5 graphic not reproduced. Box plots of 5-year labor costs by variable set (BioK, BioK-A, NSC,
NSL, SNT).]
BioK=biological monitoring w/out analysis labor; BioK-A=BioK plus sample analysis labor cost;
NSC=nutrient & sediment conc.; NSL=nutrient & sediment load; SNT=sondes for nutrients & turbidity
Figure 9-5. Box plots summarizing five-year labor costs
Equipment and supplies accounted for 6-27 percent of total costs for BioK, NSL, and SNT, but only a
maximum of 2 percent for NSC. The large difference in importance of this cost category for biological
monitoring versus grab sampling for nutrients and sediments (NSC) hinges largely on the vast differences
in sampling frequencies (2x/yr vs. 26x/yr) and thus labor costs. The difference between NSC and NSL
and SNT is due to the far larger reliance on purchased equipment for monitoring with sondes and
measurement of loads. Sample analysis generally accounted for 2-25 percent of total costs for all
scenarios.
Vehicle costs were typically well under 10 percent of total costs for these scenarios, and per diem costs were
zero except in cases where watersheds were very large (10,117 or 20,234 ha) and monitoring teams were
remote (121 or 161 km from the watershed). Overnight stays were associated with watershed
characterization and land use/treatment tracking, not water quality monitoring. Each cost scenario
assumes that the watershed will be characterized in the first year of monitoring, and that land
use/treatment will be tracked twice per year every year.
Assumptions regarding salary and equipment costs have a substantial impact on total cost estimates as
illustrated in Table 9-5. If pay rates are reduced to 70 percent of the default values, the total cost is
reduced by 23-25 percent for all 64 scenarios³ included in this analysis versus the baseline scenario of full
pay rates (see Table 9-4) and purchase of all equipment. A reduction to 50 percent of default pay rates
reduces the total cost by 38-42 percent. If labor costs are zeroed out, total costs are reduced by
68-84 percent. If pay rates are maintained at the default values and all equipment is assumed to be in hand
with no purchases required, costs are reduced by 1-20 percent versus the baseline scenario. If equipment
purchases are assumed to not be needed and labor costs are reduced to 70 percent of the default values,
total costs are reduced by 25-43 percent. If neither labor nor equipment costs are included, total cost is
reduced by 81-98 percent of baseline cost, clearly illustrating the importance of assumptions on pay rates
and equipment needs when estimating total monitoring program costs.
³ Two designs, 4 variable sets, 4 salary levels, 2 equipment cost levels (2 x 4 x 4 x 2 = 64).
Table 9-5. Cost reductions due to lowering of labor and equipment costs

                                               Cost Reduction vs. Base Scenario¹ (percent)
Salary Assumption    Equipment and Supplies    Range      Median
Full Cost            Purchase All              0²         0²
Reduced to 70%       Purchase All              23-25      24
Reduced to 50%       Purchase All              38-42      39
No cost for Labor    Purchase All              68-84      77
Full Cost            Zero cost                 1-20       12
Reduced to 70%       Zero cost                 25-43      35
Reduced to 50%       Zero cost                 41-59      51
No cost for Labor    Zero cost                 81-98      88
¹ Base scenario assumes full contractor salary levels and purchase of all equipment and supplies. All
scenarios assume 5-yr monitoring in a 2,023-ha watershed 80 km from monitoring team.
² Base scenario of full pay rates (Table 9-4) and purchase of all equipment.
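The pattern in Table 9-5 follows from how strongly labor dominates total cost. The sketch below is a
simplified model, assuming total cost is the sum of labor (scaled by a salary adjustment factor),
equipment (included only when purchased), and all other costs. The dollar breakdown is hypothetical
and the functions are not the spreadsheet's actual formulas.

```python
def total_cost(labor, equipment, other, salary_factor=1.0, purchase_equipment=True):
    """Total cost under an assumed salary factor and equipment assumption."""
    equipment_cost = equipment if purchase_equipment else 0.0
    return salary_factor * labor + equipment_cost + other


def percent_reduction(base, adjusted):
    """Percent reduction of an adjusted cost relative to the base cost."""
    return 100.0 * (base - adjusted) / base


# Hypothetical 5-year breakdown (dollars): labor is 80 percent of the base total.
LABOR, EQUIPMENT, OTHER = 80_000, 12_000, 8_000
base = total_cost(LABOR, EQUIPMENT, OTHER)

for purchase in (True, False):
    for factor in (1.0, 0.7, 0.5, 0.0):
        adjusted = total_cost(LABOR, EQUIPMENT, OTHER, factor, purchase)
        label = "purchase all" if purchase else "zero equipment cost"
        print(f"salary x{factor}, {label}: {percent_reduction(base, adjusted):.0f}% reduction")
```

With labor at 80 percent of the base total, cutting pay rates to 70 percent lowers total cost by
0.3 x 80 = 24 percent in this hypothetical case.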
9.3.4 Major Conclusions from Cost Estimation Scenarios
The cost estimates provided in this section are intended to illustrate the importance of estimating the costs
for all elements of monitoring for both the short- and long-term as part of establishing an effective and
sustainable monitoring program that will meet watershed project monitoring objectives. Those who use
either spreadsheet will find that they can tailor assumptions and add localized cost information to improve
their estimation capabilities. With increasing experience, including making adjustments based on
comparison of estimated versus actual costs, users should be able to improve the accuracy of their cost
estimates over time. In all cases, but especially where budgets for monitoring are limited, accurate cost
estimation is essential to assessing the potential for conducting a monitoring effort that will satisfy project
objectives. Anything short of that is likely to be a waste of resources.
Because labor is such an important cost factor for all monitoring designs considered here, it provides the
greatest opportunity for cost savings. These savings can be generated in a number of ways, including:
• Using volunteers whenever possible. (Training costs may be incurred, however, and practical and
legal limitations apply.)
• Using in-house labor. (This is not free and may involve diversion of labor from other projects or
programs.)
• Negotiating contracts to ensure greater use of lower cost staff wherever appropriate.
• Using labor sources based within or near the watershed. (This will also reduce vehicle and lodging
costs, but may limit options.)
• Piggybacking sampling trips with other duties to maximize benefits of travel time.
• Strategic use of in-house, volunteer, and contractor/grant labor to more efficiently match functions
with capabilities and needs.
• Substituting higher initial cost equipment for some labor in long-term projects (e.g.,
telecommunications/data logging to reduce data collection trips).
In many cases the addition of non-instrumented monitoring sites to a watershed project can be relatively
inexpensive because of the labor already invested in getting to the watershed for sampling events, as well
as the labor needed to characterize the watershed, track land use and land treatment, analyze data, and
develop reports. The incremental cost of adding monitoring stations should always be assessed in light of
how they could contribute to achieving project monitoring objectives. For example, paired-watershed and
above/below monitoring designs are inherently more powerful than single-station trend designs for
evaluating the effectiveness of BMP implementation on a localized or watershed scale. The incremental
cost of sampling two stations instead of one may support a stronger monitoring design that could yield
results in a shorter time period, perhaps reducing overall costs in the end. In addition, findings may be
more conclusive and the risk of failure reduced.
Equipment is never cheap, but the relatively low cost for equipment in most cost estimates developed here
suggests that it may be cost-effective to use sophisticated equipment and instruments if they can offset
higher personnel costs. Conversely, substituting labor for equipment (e.g., sending staff out to collect
frequent observations vs. using a data logger) is not likely to be cost-effective. Finally, it is very important
that equipment is maintained and operated in accordance with manufacturer recommendations to both
obtain good data and to ensure that equipment is operable over its expected lifespan.
While this chapter did not focus on how total cost is affected by the selection of monitoring variables, it is
clear that analysis of constituents such as pesticides and metals, as well as advanced methods such as
microbial source tracking will cost more than in situ measurement of temperature or laboratory analysis
of basic variables such as suspended sediment. Planners can use the spreadsheets to assess tradeoffs
between adding more or different variables versus increasing sampling frequency or duration, or adding
monitoring sites. Careful consideration of these and other design options should lead to better decisions
regarding the makeup of a monitoring plan while both achieving monitoring objectives and staying within
the budget.
9.4 Using Minimum Detectable Change to Guide Monitoring Decisions
As noted earlier, cost should not be the defining factor in the design of monitoring programs. Program
designers must seek a balance that provides the ability to achieve monitoring objectives that are
supportive of watershed project goals at an affordable cost. Monitoring design, for example, should be
guided by the results of MDC analysis (see section 3.4.2) whenever possible. To illustrate this approach,
cost estimates were developed for options considered in Example 1 (A linear trend with autocorrelation
and covariates or explanatory variables; Y values log-transformed) of a technical note on MDC (Spooner
et al. 2011). In the first scenario, weekly samples are collected for five years, resulting in an MDC of
15 percent, or an average of 3 percent change per year. By extending the monitoring period to 10 years,
the MDC is increased to 20 percent, but with a lower average change of 2 percent per year required.
Assuming that total P is the monitoring parameter of interest ($20 per sample analysis) the total cost
(including a QAPP, reports, travel, etc.) for five years is estimated at $190,000, with 83 percent devoted
to labor. A 10-year effort would cost $377,000. So, an additional $187,000 is needed to reduce the
average annual change needed from 3 percent to 2 percent. This type of analysis would provide project
managers with the cost information needed to determine whether they would prefer to enhance
implementation of BMPs to achieve a faster rate of change or commit to a longer monitoring period to
measure a slower rate of change.
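The arithmetic behind this comparison is shown below as a short sketch. The MDC values and total
costs are those quoted above from Example 1; the dictionary layout and function name are chosen only
for illustration.

```python
# Monitoring options from the text: duration (years) -> MDC (%) and total cost ($).
options = {
    5: {"mdc_percent": 15, "total_cost": 190_000},
    10: {"mdc_percent": 20, "total_cost": 377_000},
}


def avg_change_per_year(years):
    """Average annual change (%) that must occur to be detected."""
    return options[years]["mdc_percent"] / years


extra_cost = options[10]["total_cost"] - options[5]["total_cost"]

print(f"5 years:  {avg_change_per_year(5):.0f}% per year")
print(f"10 years: {avg_change_per_year(10):.0f}% per year")
print(f"Added cost of the longer effort: ${extra_cost:,}")
```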
The cost-benefit of adding explanatory variables can also be assessed through a combination of MDC
analysis and cost estimation. For example, Spooner et al. (1987) demonstrated that adding salinity as a
covariate in the Tillamook Bay, Oregon watershed study decreases the MDC for fecal coliform (yearly
geometric concentration means) over an 11-year period of time (20 samples/yr; 14 sites) from 42 percent
to 36 percent. For this same study, the MDC for fecal coliform decreases from 55 percent to 42 percent
when doubling sampling frequency from 10 to 20 times per year over an 11-year study.
To estimate costs for the Tillamook Bay scenarios, it is assumed that there are 14 monitoring sites and
fecal coliform is measured from one grab sample per site ($20/sample). Salinity is measured using a
hand-held meter ($765). Sample size is increased by 10 percent for QA/QC. Sampling trips are assumed
to involve 2 people for 8 hours each, including a 322-km (200 mi) round-trip to cover all 14 sites. The
cost for a QAPP is assumed to be $1,400 and data analysis and reporting costs are $2,268 for the first and
last years and $622 for the other nine years. The costs for watershed characterization, site establishment,
and land use/treatment tracking are assumed to be zero.
These scenarios are summarized in Table 9-6. Adding salinity to the base scenario increases the 11-year
cost by only $800 ($75/year) while improving the MDC by 8 percent from 55 percent (5 percent per year)
to 47 percent (4.3 percent per year). Increasing sampling frequency nearly doubles the total 11-year cost while
improving the MDC by 13 percent, from 55 percent to 42 percent (3.8 percent per year). Adding salinity
measurement to the increased sampling frequency adds just $800 to the total 11-year cost, but reduces the
overall MDC by an additional 6 percent to 36 percent (3.3 percent per year). Clearly, with or without an
increase in sampling frequency, the additional $800 cost for salinity, while almost negligible, buys
substantial additional sensitivity to detect a change in fecal coliform counts.
Table 9-6. Illustration of costs and MDC in response to changes in sampling program in Tillamook
Bay, Oregon (Spooner et al. 1987)

Scenario                         Sampling Program       Cost (11 years)   Cost Change¹   MDC   MDC Change¹
Base                             10x/yr, FC             $182,600          -              55%   -
Add salinity                     10x/yr, FC, salinity   $183,400          $800           47%   8%
Double frequency                 20x/yr, FC             $347,400          $164,800       42%   13%
Double frequency, add salinity   20x/yr, FC, salinity   $348,200          $165,600       36%   19%
¹ Change versus Base scenario.
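The trade-offs in Table 9-6 can be summarized as the added cost for each percentage point of MDC
improvement. The sketch below uses only the costs and MDC values from the table; the "cost per
point" measure and the function name are assumptions introduced here for illustration and do not
appear in Spooner et al. (1987).

```python
YEARS = 11

# Scenario -> (11-year cost in dollars, MDC in percent), from Table 9-6.
scenarios = {
    "Base": (182_600, 55),
    "Add salinity": (183_400, 47),
    "Double frequency": (347_400, 42),
    "Double frequency, add salinity": (348_200, 36),
}
base_cost, base_mdc = scenarios["Base"]


def summarize(name):
    """Cost change, MDC gain, and derived measures versus the Base scenario."""
    cost, mdc = scenarios[name]
    cost_change = cost - base_cost
    mdc_gain = base_mdc - mdc
    return {
        "cost_change": cost_change,
        "mdc_gain": mdc_gain,
        "mdc_per_year": mdc / YEARS,
        # Dollars spent per percentage point of MDC improvement (None for Base).
        "cost_per_point": cost_change / mdc_gain if mdc_gain else None,
    }


for name in scenarios:
    print(name, summarize(name))
```

By this measure, adding salinity alone costs about $100 per percentage point of improvement, while
doubling the sampling frequency alone costs more than $12,000 per point.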
9.5 References
Barbour, M.T., J. Gerritsen, B.D. Snyder, and J.B. Stribling. 1999. Rapid Bioassessment Protocols for
    Streams and Wadeable Rivers: Periphyton, Benthic Macroinvertebrates and Fish. 2nd ed.
    EPA/841-D-99-002. U.S. Environmental Protection Agency, Office of Water, Washington, DC.
    Accessed February 9, 2016. http://water.epa.gov/scitech/monitoring/rsl/bioassessment/index.cfm.
Dressing, S.A. 2012. Monitoring Cost Estimation Spreadsheet - Master Version. Tetra Tech, Inc.,
    Fairfax, VA. Accessed April 29, 2016.
    https://www.epa.gov/polluted-runoff-nonpoint-source-pollution/monitoring-and-evaluating-nonpoint-source-watershed.
Dressing, S.A. 2014. Monitoring Cost Estimation Spreadsheet - Simplified Version. Tetra Tech, Inc.,
    Fairfax, VA. Accessed April 29, 2016.
    https://www.epa.gov/polluted-runoff-nonpoint-source-pollution/monitoring-and-evaluating-nonpoint-source-watershed.
MDOE (Maryland Department of the Environment) and DDOE (District of Columbia Department of the
    Environment). 2010. Total Maximum Daily Loads of Trash for the Anacostia River Watershed,
    Montgomery and Prince George's Counties, Maryland and the District of Columbia - Final.
    Maryland Department of the Environment and District Department of the Environment-Natural
    Resources Administration. Accessed April 6, 2016.
NEMI (National Environmental Methods Index). 2006. National Environmental Methods Index. National
    Water Quality Monitoring Council. Accessed February 5, 2016. www.nemi.gov.
Spooner, J., R.P. Maas, M.D. Smolen, and C.A. Jamieson. 1987. Increasing the Sensitivity of Nonpoint
    Source Control Monitoring Programs. In Symposium On Monitoring, Modeling, and Mediating
    Water Quality, American Water Resources Association, Bethesda, Maryland, pp. 243-257.
Spooner, J., S.A. Dressing, and D.W. Meals. 2011. Minimum detectable change analysis. Tech Notes #7.
    Prepared for U.S. Environmental Protection Agency, by Tetra Tech, Inc., Fairfax, VA. Accessed
    March 24, 2016.
    https://www.epa.gov/polluted-runoff-nonpoint-source-pollution/nonpoint-source-monitoring-technical-notes.
Appendix 9-1. Overview of Cost Estimation Spreadsheets
Both the master and simplified spreadsheets support cost estimation covering all items shown in Table
9A1-1 and Table 9A1-2. Various options exist for users to change spreadsheet costs (e.g., for labor or
laboratory analysis) based on local information or experience, as well as assumptions regarding labor,
equipment, and other requirements for monitoring designs of interest to the user. The master spreadsheet
provides total flexibility in changing cost assumptions, whereas the simplified spreadsheet is designed to
provide a set of default assumptions that facilitates development of cost estimates with minimal data
entry. The master spreadsheet supports costing of virtually any monitoring design, while the simplified
spreadsheet supports cost estimation for only above/below, paired, and trend monitoring designs.
Data entry requirements for the simplified worksheet are:
• Beginning year for monitoring (for inflation estimates).
• Monitoring design (above/below, paired, or trend - results in 1 or 2 sites).
• Watershed size and size of second watershed for paired design.
• Distance monitoring team is from watershed.
• Extra distance to drop samples off at laboratory.
• Average speed limit for drive to watershed.
• Average speed limit within watershed.
• Mileage rate paid for vehicles.
• Per diem rate (food and non-lodging expenses).
• Lodging rate (including taxes).
• Type of sampling (biological/habitat, grab, sondes, loads).
• Variable set (2 or 3 options per sampling type).
• Sampling frequency (same at each site).
• Duration of monitoring effort.
The simplified spreadsheet provides the sample type and variable set options shown in Table 9A1-1
(Note: codes are used in Appendix 9-4). Variable sets for these options are shown in Table 9A1-2 through
Table 9A1-5. The number of units needed is calculated for each cost item based on the number of
monitoring sites, sampling frequency, and monitoring plan duration. Note that inclusion of specific
vendor products does not indicate EPA endorsement.
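The unit calculation described above can be sketched as follows for items that are consumed once per
sample, such as a laboratory analysis. This is an illustration only: the text does not give the
spreadsheet's formulas, so the function, the per-sample assumption, and the mapping of designs to site
counts (1 for trend, 2 for above/below and paired, per the data entry list) are assumptions.

```python
# Number of monitoring sites implied by each design in the simplified spreadsheet.
SITES_BY_DESIGN = {"trend": 1, "above/below": 2, "paired": 2}


def sample_units(design, samples_per_year, years):
    """Units of a per-sample cost item: sites x sampling frequency x duration."""
    return SITES_BY_DESIGN[design] * samples_per_year * years


# Example: grab sampling 26 times per year for 5 years.
print(sample_units("trend", 26, 5))        # one site
print(sample_units("above/below", 26, 5))  # two sites
```

One-time items such as a staff gage would scale with the number of sites only, not with sampling
frequency or duration.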
Table 9A1-1. Sample type and variable set options for simplified spreadsheet

Sample Type          Variable Set Options (Variables)              Code
Grab                 Nutrients, Sediment                           NSC
Grab                 Bacteria, Nutrients, Sediment                 BNSC
Grab                 Metals, Sediment                              MSC
Load                 Nutrients, Sediment                           NSL
Load                 Bacteria, Nutrients, Sediment                 BNSL
Load                 Metals, Sediment                              MSL
Biological/Habitat   Biological Monitoring with Kick Net           BioK
Biological/Habitat   Biological Monitoring with D-Frame Dip Net    BioD
Sondes               Nutrients, Turbidity                          SNT
Sondes               Nutrients, Turbidity, Metals                  SNTM
Table 9A1-2. Grab sampling variable sets

Variable Set: Nutrients and Sediment (NSC)
  Cost Items - Equipment and Supplies:
    • Style A Staff Gage (13.5 ft), T-style post, and post driver
    • Rain Gage (plastic)
    • Cooler (54-quart) and ice for cooler
    • Bottles-1000 ml wide mouth (HDPE, Box of 24)
    • Sulfuric Acid (10N) Liter
  Cost Items - Laboratory Analysis:
    • Total N using EPA Method 351.4
    • Total P using EPA Method 365.4
    • Suspended Sediment Concentration (USGS Method)

Variable Set: Bacteria, Nutrients, and Sediment (BNSC)
  Cost Items - Equipment and Supplies:
    • Same as above
  Cost Items - Laboratory Analysis:
    • Total N using EPA Method 351.4
    • Total P using EPA Method 365.4
    • Suspended Sediment Concentration (USGS Method)
    • E. coli and total coliform via Micrology Labs Coliscan Easygel

Variable Set: Metals (Total and Dissolved) and Sediment (MSC)
  Cost Items - Equipment and Supplies:
    Above items, minus sulfuric acid and plus the items below:
    • Geopump Series 1 Peristaltic Pump AC/DC
    • Silicone Tubing, Size 24, 25'L (for use with peristaltic pumps)
    • 12V Battery and Charger (for peristaltic pumps)
    • Solinst Model 860 Disposable Filters (0.45 µm) 1 filter
    • 1:1 Nitric acid 500 ml
  Cost Items - Laboratory Analysis:
    • Suspended Sediment Concentration (USGS Method)
    • Hardness EPA Method 130.2 - Titrimetry using EDTA
    • Metals Scan (5 metals) using EPA Method 200.7 ($12/metal)

As shown below, the simplified spreadsheet allows users to apply labor adjustment factors (0 to 1.5 times
default assumptions) to better simulate local labor costs. Inflation can also be factored into cost estimates.
The base year assumed for inflation is 2012 because most costs in the spreadsheet are from that year.
Users can also change default assumptions in the simplified spreadsheet to tailor them to local costs, but
this requires a level of effort that mimics what is required for the master spreadsheet.
Salary Adjustment Factor: 1
Inflation Rate (vs. 2012): 0.0 %
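The two adjustments above can be sketched as simple multipliers. This is a minimal illustration under
stated assumptions: the text does not say how the spreadsheet applies inflation, so annual compounding
from the 2012 base year is assumed here, and the function names and dollar amounts are hypothetical.

```python
BASE_YEAR = 2012  # base year for costs, per the text


def inflate(cost_base_year, year, annual_rate):
    """Cost in a given year, assuming annual compounding from the base year."""
    return cost_base_year * (1.0 + annual_rate) ** (year - BASE_YEAR)


def adjust_labor(labor_cost, salary_factor=1.0):
    """Scale default labor cost by a salary adjustment factor (0 to 1.5)."""
    if not 0.0 <= salary_factor <= 1.5:
        raise ValueError("salary adjustment factor must be between 0 and 1.5")
    return labor_cost * salary_factor


# Hypothetical annual cost of $10,000 in 2012 dollars at 2 percent inflation.
for year in range(2017, 2022):
    print(year, round(inflate(10_000, year, 0.02), 2))
```

With an inflation rate of zero the annual costs are identical in every year, and a salary adjustment
factor of 1 leaves the default labor costs unchanged.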
The simplified spreadsheet generates simple pie charts to show costs by category (see Figure 9A1-1).
Total cost is also broken down as in Table 9A1-6. Total costs are given with and without inflation
estimates. Annual costs are also generated by the simplified spreadsheet as shown in Table 9A1-7. The
effect of inflation is illustrated by the change in costs from 2017 through 2021, which would
all be the same without inflation.
Table 9A1-3. Load monitoring variable sets

Variable Set: Nutrients and Sediment (NSL)
Cost Items, Equipment and Supplies:
• USGS portable steel gage house (2'x3'x5' tall), connection to power grid, and surge protector
• Style A Staff Gage (13.5 ft), T-style post, and post driver
• Isco Model 6712FR Fiberglass Refrigerated Sampler, 2-bottle kit (7.5-liter polyethylene), 2 extra 7.5-liter polyethylene bottles for each site, intake line with strainer, battery-backed power pack, and Flowlink Software
• Isco 730 Bubbler Flow Module
• Isco 581 RTD (rapid transfer device) for field retrieval of Model 6712FR data
• Pygmy-type Current Meter w/ AquaCount data logger
• Tipping Bucket Rain Gauge
• HOBO Event Rainfall Logger (for tipping bucket rain gauge) and Boxcar Software
• Cooler (54-quart) and ice for cooler
• Sulfuric Acid (10N) Liter
Cost Items, Laboratory Analysis:
• Total N using EPA Method 351.4
• Total P using EPA Method 365.4
• Suspended Sediment Concentration (USGS Method)

Variable Set: Bacteria, Nutrients, and Sediment (BNSL)
Cost Items, Equipment and Supplies:
• Same as above
Cost Items, Laboratory Analysis:
• Total N using EPA Method 351.4
• Total P using EPA Method 365.4
• Suspended Sediment Concentration (USGS Method)
• E. coli and total coliform via Micrology Labs Coliscan Easygel ($18.50 for 10 tests)

Variable Set: Metals (Total and Dissolved) and Sediment (MSL)
Cost Items, Equipment and Supplies:
• Above items, minus sulfuric acid and plus the items below:
• Bottles-1000 ml wide mouth (HDPE, Box of 24)
• Geopump Series 1 Peristaltic Pump AC/DC
• Silicone Tubing, Size 24, 25'L (for use with peristaltic pumps)
• 12V Battery and Charger (for peristaltic pumps)
• Solinst Model 860 Disposable Filters (0.45 µm) 1 filter
• 1:1 Nitric acid 500ml
Cost Items, Laboratory Analysis:
• Suspended Sediment Concentration (USGS Method)
• Hardness EPA Method 130.2 - Titrimetry using EDTA
• Metals Scan (5 metals) using EPA Method 200.7 ($12/metal)
Table 9A1-4. Biological monitoring variable sets

Variable Set: Kick Net Option (BioK)
Cost Items:
• Style A Staff Gage (13.5 ft), T-style post, and post driver
• YSI 556 D.O., pH, conductivity, temperature meter with pH kit
• pH buffer, conductivity, and ORP calibration solutions for YSI 556
• Hach Model 2100Q Portable Turbidimeter with USB+Power Module for 2100Q (for data transfer to PC) and Gelex Secondary Standards Kit
• Silicone oil and portable turbidimeter sample cells for Hach Turbidimeter
• Pentax Optio W30 waterproof digital camera
• Garmin eTrex 30 GPS
• Current meter outfit (Pygmy-type). Meter, headphones, and rod.
• Bottom kick net (500 µm mesh)
• Forceps (straight fine point)
• Sieve bucket
• First aid kit, 119-piece, economy
• STEARNS neoprene chest waders and fluorescent orange PVC gloves
• Bottles-1000 ml wide mouth (HDPE, Box of 24)
• Low plastic specimen jars and black molded caps
• Ice (cooler full)
• 95% Ethanol (3.8 L)

Variable Set: D-Frame Dip Net Option (BioD)
Cost Items:
• Above items, minus bottom kick net and plus item below
• D-Frame dip net (500 µm mesh)
Table 9A1-5. Sondes monitoring variable sets

Variable Set: Nutrients and Turbidity Set (SNT)
Cost Items, Equipment and Supplies:
• Style A Staff Gage (13.5 ft), T-style post, and post driver
• Rain Gage (plastic)
• Hydrolab DataSonde 5 - DS5 w/ built-in data logger, temperature sensor, and connecting cable (takes 10 sensors, measures up to 15 parameters simultaneously)
• pH, polarographic DO, temperature (comes with unit), nitrate, self-cleaning turbidity, ammonia, chlorophyll a, and conductivity sensors for DS5
• 5-meter communication cable and battery pack for DS5
• Bottles-1000 ml wide mouth (HDPE, Box of 24)
• Cooler (54-quart) and ice for cooler
• 1:1 Nitric acid 500ml
• Sulfuric Acid (10N) Liter
Cost Items, Laboratory Analysis:
• Total P using EPA Method 365.4

Variable Set: Nutrients, Turbidity, and Metals (Total and Dissolved) Set (SNTM)
Cost Items, Equipment and Supplies:
• Above items plus the items below:
• Geopump Series 1 Peristaltic Pump AC/DC
• Silicone Tubing, Size 24, 25'L (for use with peristaltic pumps)
• 12V Battery and Charger (for peristaltic pumps)
• Solinst Model 860 Disposable Filters (0.45 µm) 1 filter
Cost Items, Laboratory Analysis:
• Total P using EPA Method 365.4
• Hardness EPA Method 130.2 - Titrimetry using EDTA
• Metals Scan (5 metals) using EPA Method 200.7 ($12/metal)
[Pie chart of cost shares by category. Legend: Total Labor Cost; Total Equipment and Supplies Cost; Total Lab Chemical Analysis Cost; Total Vehicle Cost; Total Per Diem Cost]
Figure 9A1-1. Pie chart from simplified spreadsheet
Table 9A1-6. Tabular output from simplified spreadsheet

Cost Category                        Total Cost   % of Total
Labor                                $205,167     72
Equipment and Supplies               $2,158       1
Sampling Analysis                    $53,654      19
Vehicles                             $22,921      8
Per Diem                             $0           0
TOTAL COST                           $283,900     100
Average Annual Cost                  $40,557
Total Cost with Inflation            $325,887
Average Annual Cost with Inflation   $46,555
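
The percentages and averages in Table 9A1-6 can be reproduced from the category totals. The following is a minimal sketch, not the spreadsheet's actual logic: the seven-year project length is taken from Table 9A1-7 (2016 through 2022), and the function name `summarize` is hypothetical.

```python
# Category totals as reported in Table 9A1-6 (US dollars).
costs = {
    "Labor": 205_167,
    "Equipment and Supplies": 2_158,
    "Sampling Analysis": 53_654,
    "Vehicles": 22_921,
    "Per Diem": 0,
}
PROJECT_YEARS = 7  # assumption: 2016 through 2022, as in Table 9A1-7


def summarize(costs, years):
    """Return total cost, whole-percent share per category, and average annual cost."""
    total = sum(costs.values())
    shares = {name: round(100 * value / total) for name, value in costs.items()}
    return total, shares, total / years


total, shares, average_annual = summarize(costs, PROJECT_YEARS)
for name, percent in shares.items():
    print(f"{name}: {percent}% of total")
print(f"Total: ${total:,}; average annual: ${average_annual:,.0f}")
```

Rounding each share to a whole percent happens to sum to 100 here; with other inputs the rounded shares may not add up exactly.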
Table 9A1-7. Annual costs from simplified spreadsheet

Inflation Rate: 2%
Begin: 2016
End: 2022

Year    Inflation Factor Applied   Annual Cost without Inflation   Annual Inflation-Adjusted Cost
2016    1.08                       $47,349                         $51,253
2017    1.10                       $39,279                         $43,367
2018    1.13                       $39,279                         $44,234
2019    1.15                       $39,279                         $45,119
2020    1.17                       $39,279                         $46,021
2021    1.20                       $39,279                         $46,942
2022    1.22                       $40,157                         $48,951
TOTAL                              $283,900                        $325,887
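
The inflation factors in Table 9A1-7 are consistent with compounding the 2% rate annually from the 2012 base year (for example, 1.02 raised to the fourth power is about 1.08 for 2016). The sketch below assumes that relationship; it is inferred from the table values rather than taken from the spreadsheet itself, and `inflation_factor` is a hypothetical name.

```python
BASE_YEAR = 2012  # base year for inflation stated in the text


def inflation_factor(year, rate, base_year=BASE_YEAR):
    """Compound inflation factor for a cost expressed in base-year dollars (assumed form)."""
    return (1.0 + rate) ** (year - base_year)


# Annual costs without inflation, copied from Table 9A1-7 (US dollars).
annual_costs = {
    2016: 47_349,
    2017: 39_279,
    2018: 39_279,
    2019: 39_279,
    2020: 39_279,
    2021: 39_279,
    2022: 40_157,
}
RATE = 0.02

adjusted = {year: cost * inflation_factor(year, RATE) for year, cost in annual_costs.items()}
for year, cost in adjusted.items():
    print(f"{year}: factor {inflation_factor(year, RATE):.2f}, adjusted cost ${cost:,.0f}")
print(f"Total with inflation: ${sum(adjusted.values()):,.0f}")
```

Because the table shows costs rounded to whole dollars, results computed this way can differ from the published figures by a dollar or so.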
Appendix 9-2. Cost Estimates for a Diverse Range
of Monitoring Options
As described in Appendix 9-1, a large number of assumptions must be made to estimate costs for various
monitoring scenarios. Thus, while these cost estimates are intended to be informative, their relevance to
any particular monitoring effort depends on how well the assumptions match the realities of that specific
situation. Cost estimates given here are more likely to be high than low because it is always assumed that
contractors perform the monitoring (i.e., no in-house staff hired to do monitoring are used) and that all
monitoring equipment must be either leased or purchased.
Cost Scenarios and Assumptions
Cost estimates for the following eight monitoring scenarios are presented in this section.
1. Synoptic Survey
2. TMDL - Water Quality Standards
3. TMDL-Loads
4. Paired-Watershed - Loads
5. Long-term Single Station - Biomonitoring
6. Above/Below BMP Effectiveness - Biomonitoring
7. Input/Output Urban LID Effectiveness
8. Photo-Point Monitoring
These eight scenarios address both problem assessment and project evaluation, using chemical, physical,
and biological (Barbour et al. 1999) monitoring methods. Five-year total costs are used for comparing
Scenarios 2-8, but costs are also provided for 1, 2, and 8 years. The synoptic survey is considered a one-
year effort.
The Watershed
The setting assumed for the cost scenarios is a 3,035 ha (7,500 ac) watershed, primarily in agricultural use
with some urban influence. Monitoring is performed in perennial streams.
For the synoptic survey (Scenario 1) it is assumed that the nature and extent of water quality problems in
the watershed are totally unknown. Thus, water chemistry sampling includes a wide range of variables.
For Scenarios 2-7, the problems are assumed to be associated with sediment, nutrients, aquatic life use
support, and cadmium toxicity. Stream channel restoration is the focus of Scenario 8.
Labor Costs
All monitoring is assumed to be performed by contractors; different pay rates would apply to government
and university employees, and volunteers would work for free. Pay rates assumed (including fringe and
overhead) and basic job functions are summarized in Table 9A2-1.
Table 9A2-1. Labor costs assumed for scenarios

Pay Level   Rate ($/hr)¹   Job Functions
4           130            Monitoring design, statistical analysis, oversight, etc.
3           80             Lead field person for monitoring, data collection, bulk of writing
2           56             Field technician, lab tech, etc.
1           34             Secretarial and support staff

¹ Includes fringe and overhead.
Other Cost Assumptions
Monitoring proposals are assumed to be QAPPs (Quality Assurance Project Plans) prepared in 16 hours
by a team that includes an expert and support staff at a cost of $1,400 for each scenario.
Transportation costs (vehicle and labor) include driving to and from the watershed, driving to monitoring
sites within the watershed, and delivering samples to a laboratory for analysis. It is assumed that the
watershed is 160 km (100 mi) from the base of those performing the monitoring. The sample analysis
laboratory is assumed to be "on the way," so no additional mileage is added for delivering samples to the
laboratory.
Watershed characterization (windshield survey) costs are included only in Scenario 1. Monitoring site
selection and establishment (as needed) costs are included in all scenarios. While it is a very important
part of most NPS monitoring designs and is addressed by the spreadsheet, costs for meteorological
monitoring were not included in these scenarios.
Analytical methods for water quality variables were obtained from various sources such as NEMI
(http://www.nemi.gov/). Constraints associated with these methods (e.g., cooling samples to 4°C for
suspended sediment, and pre-acidification for hardness) are reflected in the cost estimates through, for
example, the purchase of refrigerated samplers or the use of both pre-acidified and non-acidified sample
containers.
For safety reasons, all sampling is assumed to be performed by teams of at least two people. In some
cases, one or two additional people are added for a limited number of sampling trips. Larger teams are
assumed necessary for QA/QC checks, stage-discharge calibration during a regularly scheduled sampling
event, and scenarios where both intensive water chemistry and biological monitoring are performed. In all
cases where continuous flow is measured, additional labor is assumed for stage-discharge calibration.
Scenario Description and Results
Scenario 1: Synoptic Survey
Under this scenario, a windshield survey is performed to characterize the watershed and select monitoring
sites. It is assumed that the survey covers 512 km (320 mi) within an 8-hour day. Water quality
monitoring at six sites (1.6 km [1 mi] apart from each other) is performed on two separate sampling dates
to cover both high-flow and low-flow conditions. Each sampling run is assumed to require 400 km
(250 mi) and a 12-hour day (1 hour per site, plus driving time) for a team of three in a single vehicle.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and jars, water quality sonde with 6 sensors (D.O., pH, temperature,
conductivity, turbidity, and chlorophyll a), pygmy-type meter with data logger, kick net
• Sampling for all 6 sites: B.O.D., hardness, SSC, TP, TKN, NO2+NO3-N, E. coli and total
coliforms, biological monitoring, flow
• Sampling for 3 sites (targeted locations to keep costs down): grab sample for pesticides scan and
metals scan (5 metals)
As shown in Table 9A2-2, the total cost for this one-year effort is estimated at $30,000. Equipment and
labor each account for 45% of the total cost. Assuming that the contractor already has the basic
monitoring equipment, however, the one-year total cost is reduced to just over $17,000.
Scenario 2: TMDL - Water Quality Standards
Scenario 2 envisions a TMDL under which water quality monitoring is performed at a single site to both
track dissolved cadmium concentration (weekly grab samples) and assess aquatic life use support through
biological monitoring.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles, multi-probe water quality meter for in situ D.O., pH, conductivity, and
temperature measurements, kick net
• Sampling: cadmium and hardness, biological monitoring (2x/yr)
As shown in Table 9A2-2, the total cost for five years is about $214,900. Costs for one year, 2 years, and
8 years are estimated at $47,100, $90,300, and $339,400, respectively. Nearly 83% of the total cost is
associated with sampling trips, with another 7% for analysis of samples for cadmium and hardness. Labor
accounts for 81% of the total budget, and equipment accounts for only 2% of the total 5-year budget.
Scenario 3: TMDL - Pollutant Load
Under this scenario, weekly flow-weighted composite samples are taken for suspended sediment load
estimation at a single site. Continuous discharge is measured with a bubbler water level sensor and a
pygmy-type current meter is used for calibration.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample shed, refrigerated automatic sampler (with bubble flow module, battery backup,
2-bottle kit), data transfer device and software, surge protector, pygmy-type current meter and data
logger
• Sampling: discharge and suspended sediment concentration
As shown in Table 9A2-2, the total five-year cost for this scenario is estimated at $237,500. Total costs
for 1, 2, and 8 years are $61,800, $106,500, and $368,400, respectively. Sampling trips and labor account
for 87% and 84% of the total cost, respectively.
Scenario 4: Paired-Watershed Loads
This scenario is in many ways a doubling of Scenario 3, but shared equipment (e.g., pygmy-type current
meter) is not duplicated and incremental costs for analyzing and reporting on data from the second
monitoring station are assumed to be half the cost for the first monitoring station. The watersheds are
assumed to be 12.8 km (8 mi) apart. Weekly flow-weighted composite samples are taken for suspended
sediment load estimation at each site using an automatic sampler. Continuous discharge is measured with
a bubbler water level sensor and a pygmy-type current meter is used for calibration. Unlike for Scenario
3, tracking of land use and land treatment is included in the analysis, with the cost essentially twice that
for Scenario 5.
Equipment and sampling assumptions for this scenario are:
• Equipment: 2 sample sheds, 2 refrigerated automatic samplers (with bubble flow module, battery
backup, 2-bottle kit), data transfer device and software, 2 surge protectors, pygmy-type current
meter and data logger
• Sampling: discharge and suspended sediment concentration
As shown in Table 9A2-2, this is the most expensive scenario considered here with a total five-year cost
estimated at $347,800. Total costs for 1, 2, and 8 years are $93,400, $158,100, and $537,400,
respectively. Sampling trips account for about three-quarters of the total cost. Site establishment cost is
significant under this scenario, accounting for nearly 7% of the total cost, while sample analysis
represents about 2% of the total cost. Labor is the largest cost category at 84% of the total cost.
Scenario 5: Long-Term Trend Monitoring-Biological
This scenario assumes long-term biological monitoring (2x/yr) at a single site. Stage is measured as a
covariate, but discharge is not estimated. Land use and BMP implementation are tracked via two whole-
watershed surveys per year.
Equipment and sampling assumptions for this scenario are:
• Equipment: multi-probe water quality meter for in situ D.O., pH, conductivity, and temperature
measurements, staff gage, kick net, sample bags
• Sampling: biological monitoring (2x/yr)
The total cost for five years is estimated at $52,800, while the total costs for 1, 2, and 8 years are
estimated at $16,100, $25,800, and $79,800, respectively. As shown in Table 9A2-2, land use tracking
accounts for about 28% of the total five-year cost, while annual sampling trips consume 26% of the five-
year budget. An additional 20% is used for data analysis and reporting. The largest cost category is labor
at 85% of the total cost.
Scenario 6: Above/Below BMP Effectiveness Monitoring-Biological
This scenario assumes long-term biological monitoring (2x/yr) at two monitoring sites in an above/below
design to evaluate individual BMP effectiveness. Stage at the time of sampling is measured as a covariate,
but discharge is not estimated. Land use and BMP implementation are tracked via two partial-watershed
surveys per year.
Equipment and sampling assumptions for this scenario are:
• Equipment: multi-probe water quality meter for in situ D.O., pH, conductivity, and temperature
measurements, 2 staff gages, kick net, sample bags
• Sampling: biological monitoring (2x/yr)
As shown in Table 9A2-2, the five-year total cost is estimated at $58,000. One-year, two-year, and eight-
year total costs are estimated at $17,200, $28,000, and $88,000, respectively. The total cost for this
scenario nearly matches that for Scenario 5. Despite having two sites instead of one, annual sampling trips
for Scenario 6 ($15,720) cost only slightly more than for Scenario 5 ($13,860). The time spent tracking
land use/land treatment is substantially greater for Scenario 5 because the entire watershed is tracked
versus only a portion of the watershed under the Scenario 6 above/below study. This difference explains
the greater amount and percentage of the Scenario 5 budget devoted to land use tracking ($14,640, 28%)
versus that for Scenario 6 ($11,800, 20%). Labor accounts for 85% of the five-year budget.
Scenario 7: Input/Output Urban LID Effectiveness
The analysis of inflow-outflow monitoring of urban LID practices assumes two monitoring stations, one
storm event sampled per week at each station, discharge measurement, and analysis of both suspended
sediment and five metals.
Equipment and sampling assumptions for this scenario are:
• Equipment: 2 small sample sheds, 2 refrigerated automatic samplers (with 2-bottle kit), data
transfer device and software, 2 submersible pressure transducers with data logger, 2 V-notch weir
boxes, 2 surge protectors
• Sampling: discharge, suspended sediment concentration, metals scan (5 metals)
As shown in Table 9A2-2, the five-year total cost for this scenario is estimated at $251,400, while
estimated total costs for 1, 2, and 8 years are $68,000, $114,900, and $387,800. Costs for monitoring site
establishment and equipment contribute to the high first-year cost of this study design. After five and
eight years, however, the average annual costs drop to about $50,300 and $48,500, respectively. Annual
sampling trips account for nearly 71% of the total five-year budget, while annual sample analysis
accounts for 16%, and equipment and site establishment combine for just over 8%. Labor is the largest
cost category at 68% of the total five-year budget.
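
The average annual costs quoted for Scenario 7 follow directly from the total costs reported for each project length. A minimal sketch of that arithmetic, using only the rounded totals given in the text above:

```python
# Scenario 7 total costs by project length in years, as quoted in the text (US dollars).
scenario7_totals = {1: 68_000, 2: 114_900, 5: 251_400, 8: 387_800}

# Average annual cost is the total divided by the number of years monitored.
scenario7_average_annual = {
    years: total / years for years, total in scenario7_totals.items()
}

for years, average in scenario7_average_annual.items():
    print(f"{years}-year project: about ${average:,.0f} per year")
```

The decline from the one-year figure to the five- and eight-year figures reflects one-time site establishment and equipment costs being spread over more years.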
Scenario 8: Photo-Point Monitoring
This scenario assumes repeat photography of a riparian zone restoration project using two photo points
(see chapter 5). Each photo point has a single camera point. Cost estimates were developed for both
qualitative and quantitative approaches, with digital image analysis assumed for the quantitative
approach.
Equipment assumptions for qualitative photo-point monitoring are:
• 2 meter boards, digital camera with tripod, GPS unit, field computer, compass, level, sledge
hammer, measuring tape, rebar, shovel, whiteboard, metric staff gage
As shown in Table 9A2-2, the five-year cost for qualitative photo-point monitoring is estimated to be
about $18,600, with 81% of the cost devoted to labor. If it is assumed that the contractor already has the
major equipment, the total cost for five years is reduced to about $16,300. Total costs for 1, 2, and 8 years
are estimated at $8,100, $11,100, and $26,000, respectively. Annual sampling trips account for about 48%
of the total five-year budget, while site establishment, portable sampling equipment, and startup supplies
consume a combined 22% of the budget. Labor is the largest cost category at 81% of the total.
When considering photo-point as an add-on monitoring activity (e.g., the same individuals who perform
biological monitoring or collect water chemistry samples also take the photos), the five-year cost is
reduced to $8,500 due primarily to savings in labor and vehicle costs. Coupled with the assumption that
the contractor already has the major equipment, the 5-year cost drops to about $6,200.
Quantitative photo-point analysis requires image processing software, and labor requirements for data
analysis are increased substantially. Because quantitative photo-point analysis has not been used to any
measurable extent in watershed projects, the cost estimates provided here are highly uncertain. The total
cost for five years is estimated at $74,900 with 90% of the cost for labor. Assuming the contractor has all
major equipment and software, the 5-year cost is reduced to about $68,700. If quantitative photo-point
monitoring is added to a water chemistry or biological monitoring program, the cost is estimated at just
over $53,000 for five years. Coupled with the assumption that the contractor already has the major
equipment and software, the 5-year cost drops to $46,800.
Table 9A2-2. Total costs for eight diverse scenarios

Cost Phase or Element                                   Scenario
                                                        1         2         3         4         5         6         7         8 Qualitative  8 Quantitative
                                                        1 Year    5 Years   5 Years   5 Years   5 Years   5 Years   5 Years   5 Years        5 Years
Proposal and QAPP                                       $1,400    $1,400    $1,400    $1,400    $1,400    $1,400    $1,400    $1,400         $2,440
Watershed Characterization                              $1,858    $0        $0        $0        $0        $0        $0        $0             $0
Site Establishment                                      $0        $0        $11,409   $22,829   $2,110    $2,234    $17,598   $1,860         $1,860
Portable Sampling Equipment and Startup Supplies Costs  $13,332   $4,210    $4,803    $4,803    $3,595    $3,595    $3,056    $2,260         $6,241
One-Time Office Equipment and Startup Supplies Costs    $0        $0        $0        $0        $0        $0        $0        $0             $0
Station Demolition and Site Restoration                 $0        $0        $808      $1,616    $0        $0        $136      $114           $114
First-Year Report                                       $3,610    $1,952    $1,692    $2,448    $1,952    $1,952    $2,250    $720           $11,676
Final Report                                            $0        $3,608    $2,976    $4,422    $2,656    $2,656    $3,336    $1,224         $10,980
Annual Access Fees                                      $0        $0        $0        $0        $0        $0        $0        $0             $0
Annual Sampling Trips to Sites                          $6,008    $177,660  $205,660  $266,539  $13,860   $15,720   $177,840  $8,820         $13,960
Annual Volunteer Training                               $0        $0        $0        $0        $0        $0        $0        $0             $0
Annual Sample Analysis                                  $3,774    $15,880   $2,860    $5,720    $4,960    $9,920    $40,040   $0             $0
Annual Data Analysis                                    $0        $2,268    $2,268    $3,402    $2,268    $2,268    $2,412    $342           $17,280
Annual Reports                                          $0        $3,588    $3,588    $5,316    $3,588    $3,588    $3,288    $1,818         $10,368
Annual Site Operation and Maintenance                   $0        $0        $0        $0        $0        $0        $0        $0             $0
Annual Supplies and Rental Equipment                    $59       $4,313    $0        $0        $1,795    $2,855    $0        $0             $0
Annual Land Use Tracking                                $0        $0        $0        $29,280   $14,640   $11,800   $0        $0             $0
TOTAL                                                   $30,040   $214,879  $237,464  $347,775  $52,824   $57,988   $251,356  $18,558        $74,919
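
The budget shares cited in the scenario descriptions can be checked against the line items in Table 9A2-2. The sketch below does this for Scenarios 5 and 6 only; the helper names `total` and `share_percent` are hypothetical, and the figures are copied from the table.

```python
# Cost phases/elements in the row order of Table 9A2-2 (TOTAL row excluded).
ELEMENTS = [
    "Proposal and QAPP",
    "Watershed Characterization",
    "Site Establishment",
    "Portable Sampling Equipment and Startup Supplies Costs",
    "One-Time Office Equipment and Startup Supplies Costs",
    "Station Demolition and Site Restoration",
    "First-Year Report",
    "Final Report",
    "Annual Access Fees",
    "Annual Sampling Trips to Sites",
    "Annual Volunteer Training",
    "Annual Sample Analysis",
    "Annual Data Analysis",
    "Annual Reports",
    "Annual Site Operation and Maintenance",
    "Annual Supplies and Rental Equipment",
    "Annual Land Use Tracking",
]

# Five-year costs (US dollars) for Scenarios 5 and 6, in the same row order.
COSTS = {
    "5": [1400, 0, 2110, 3595, 0, 0, 1952, 2656, 0, 13860, 0, 4960, 2268, 3588, 0, 1795, 14640],
    "6": [1400, 0, 2234, 3595, 0, 0, 1952, 2656, 0, 15720, 0, 9920, 2268, 3588, 0, 2855, 11800],
}


def total(scenario):
    """Sum of all line items for one scenario."""
    return sum(COSTS[scenario])


def share_percent(scenario, element):
    """Percentage of a scenario's total cost attributable to one element."""
    return 100.0 * COSTS[scenario][ELEMENTS.index(element)] / total(scenario)


for scenario in COSTS:
    land_use = share_percent(scenario, "Annual Land Use Tracking")
    trips = share_percent(scenario, "Annual Sampling Trips to Sites")
    print(f"Scenario {scenario}: total ${total(scenario):,}, "
          f"land use tracking {land_use:.0f}%, sampling trips {trips:.0f}%")
```

The same pattern extends to the other columns of the table if their line items are added to `COSTS`.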
Appendix 9-3. Cost Estimates for Watershed-Scale
Evaluation of Agricultural BMP Implementation
As described in Appendix 9-1, a large number of assumptions must be made to estimate costs for various
monitoring scenarios. Thus, while these cost estimates are intended to be informative, their relevance to
any particular monitoring effort depends on how well the assumptions match the realities of that specific
situation. Cost estimates given here are more likely to be high than low because it is always assumed that
contractors perform the monitoring (i.e., no in-house staff hired to do monitoring are used) and that all
monitoring equipment must be either leased or purchased.
Cost Scenarios
Cost estimates for the following seven monitoring scenarios are described in this section. Results of the
cost analysis are summarized in Figure 9-2. One year is assumed for the synoptic survey, and costs for
other scenarios are estimated for 3, 5, and 7 years.
1. Preliminary Synoptic Survey
2. Compliance Monitoring
3. Above/Below Monitoring (sub-scenarios for concentration and load: 3C, 3L)
4. Multiple-Watershed Monitoring
5. Trend Monitoring (sub-scenarios for concentration and load: 5C, 5L)
6. Paired-Watershed Monitoring (sub-scenarios for perennial and intermittent flows: 6P, 6I)
7. Soil Testing
The Watershed
The setting assumed for these cost scenarios is a 12-digit HUC watershed covering 10,117 ha (25,000 ac),
primarily in agricultural use. Monitoring is performed in perennial streams with the exception of Scenario
6I, which assumes intermittent flow. Scenario 1 assumes that the nature and extent of water quality
problems in the watershed are totally unknown, so a wider range of monitoring variables is included. For
Scenarios 2-7, the problems are assumed to be associated with nutrients from agricultural sources.
Labor Costs
Labor cost assumptions are the same as described in Appendix 9-2 (Table 9A2-1).
Driving Distances and Sampling Times
Transportation costs include driving to and from the watershed, driving to monitoring sites within the
watershed, and delivering samples to a laboratory for analysis. To bracket a wide range of possibilities for
transportation costs and sampling times, three one-way distances and associated drive times are assumed:
• "In" - Monitoring staff are within the watershed: distance and travel time are zero.
• "Near" - Monitoring staff are based 240 km (150 mi) from the watershed, with a one-way drive
time of 2 hours and 45 minutes.
• "Far" - Monitoring staff are based 480 km (300 mi) from the watershed, with a one-way drive time
of 5.5 hours.
Drive distances and times for sampling runs within the watershed (i.e., in addition to travel distance
and time to the watershed) are assumed to be:
• Zero miles and time for trend monitoring (1 station)
• 25 km (16 mi) and 0.5 hours R/T for compliance and above/below monitoring (2 stations)
• 48 km (30 mi) and 0.75 hours R/T for paired-watershed monitoring (2 stations, 1 in nearby
watershed 24 km [15 mi] away)
• 96 km (60 mi) and 2.5 hours R/T for multiple-watershed study (20 sub-watershed stations all within
same watershed)
• 80 km (50 mi) and 2 hours R/T for a soil testing study (20 fields within the same watershed)
For all scenarios in which driving to the watershed is required, it is assumed that collected samples are
dropped off at the laboratory in transit with no additional driving mileage. For scenarios in which the
contractor is based in the watershed, 80 km (50 mi) is added for delivery of the samples to the nearest
laboratory, except for Scenario 7 for which soil samples are assumed mailed to the laboratory.
It is assumed that contractors within the watershed will not incur lodging fees, while lodging is
(generally) assumed for others when work days exceed 12 hours. Efforts were made to combine activities
(e.g., site establishment and discharge observation) to reduce the need for overnight stays.
For safety reasons, all sampling is assumed to be performed by teams of at least two people. Two-person
teams are assumed for grab sampling and 3 people are assumed necessary for runs including discharge
observations. Periodic trips for QA/QC (e.g., 4 times per year for weekly sampling) by a QA/QC expert
are also included.
The time required for grab sampling is assumed to be 0.5 hours per site, whereas sampling at sites with
automatic sampling and discharge measurements is assumed to require 1.5 hours per site. Scenario 7
incorporates an assumption that 45 minutes is required to collect a composite soil sample for each 4-ha
(10-acre) field that is monitored.
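
The drive-time, site-time, and lodging assumptions above can be combined into a rough estimate of the length of a sampling day. The sketch below is illustrative only: the additive formula and the function names are assumptions rather than the spreadsheet's method, extra time for QA/QC or discharge observations is ignored, and so is the 80 km sample delivery leg for contractors based in the watershed.

```python
# One-way drive times (hours) for the three base locations described in the text.
ONE_WAY_HOURS = {"In": 0.0, "Near": 2.75, "Far": 5.5}
LODGING_THRESHOLD_HOURS = 12.0  # lodging is (generally) assumed when work days exceed 12 hours


def sampling_day_hours(base, sites, hours_per_site, within_watershed_hours):
    """Assumed day length: round trip to the watershed + driving between sites + time at sites."""
    return 2 * ONE_WAY_HOURS[base] + within_watershed_hours + sites * hours_per_site


def lodging_assumed(base, day_hours):
    """Contractors based in the watershed never incur lodging; others do for long days."""
    return base != "In" and day_hours > LODGING_THRESHOLD_HOURS


# Example: compliance monitoring (2 grab-sample sites, 0.5 h each, 0.5 h driving within the watershed).
compliance_days = {
    base: sampling_day_hours(base, sites=2, hours_per_site=0.5, within_watershed_hours=0.5)
    for base in ONE_WAY_HOURS
}
for base, hours in compliance_days.items():
    print(f"{base}: {hours} h, lodging assumed: {lodging_assumed(base, hours)}")
```

Under these assumptions only the "Far" case exceeds the 12-hour threshold for a two-site grab sampling run.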
The cost of establishing a stage-discharge relationship is included for Scenarios 3L, 5L, 6P, and 6I. It is
assumed that all monitoring is performed on wadeable streams, so time assumed for a discharge
observation is set at 1.5 hours. Requirements for discharge observations on larger streams would be more
expensive. Costs assume eight discharge observations per year, with 6 of these as separate trips and 2 as
additional time during normal sampling runs. The driving distances and hours assumed necessary for
discharge observations made within each study area as separate trips are summarized in Table 9A3-1.
Table 9A3-1. Driving and labor assumptions for discharge observations as stand-alone trips

                        Discharge Observation
Scenario                Number of   Total Drive Distance   Total Hours
                        Stations    Within Watershed¹      Within Watershed¹
3L. Above/Below         2           25 km (16 mi)          3.50
5L. Trend (Load only)   1           0 km (0 mi)            1.50
6P. Paired              2           48 km (30 mi)          3.75
6I. Paired              2           48 km (30 mi)          3.75

¹ Does not include driving distance and time to arrive at watershed.
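
The hours in Table 9A3-1 match 1.5 hours per station plus the round-trip drive time within the watershed. The sketch below assumes that relationship, which is inferred from the table values and the text rather than stated as a formula in the source; the annual figure simply multiplies by the six stand-alone trips per year.

```python
HOURS_PER_OBSERVATION = 1.5   # wadeable-stream discharge observation, from the text
SEPARATE_TRIPS_PER_YEAR = 6   # 6 of the 8 annual observations are stand-alone trips


def standalone_trip_hours(stations, within_watershed_drive_hours):
    """Assumed hours in the watershed for one stand-alone discharge observation trip."""
    return stations * HOURS_PER_OBSERVATION + within_watershed_drive_hours


# (number of stations, round-trip drive hours within or between watersheds)
TRIP_INPUTS = {"3L": (2, 0.5), "5L": (1, 0.0), "6P": (2, 0.75), "6I": (2, 0.75)}

trip_hours = {name: standalone_trip_hours(*inputs) for name, inputs in TRIP_INPUTS.items()}
annual_trip_hours = {name: SEPARATE_TRIPS_PER_YEAR * hours for name, hours in trip_hours.items()}

for name in TRIP_INPUTS:
    print(f"{name}: {trip_hours[name]} h per trip, {annual_trip_hours[name]} h per year")
```

Travel to and from the watershed is excluded, as in the table footnote.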
Table 9A3-2 summarizes assumptions regarding driving distances and time spent within (and between for
paired-watershed design) each watershed for sampling runs. This does not include round-trip (R/T) travel
to or from the watershed, nor does it include add-ons such as discharge observations.
Table 9A3-2. Sampling distances and times within watersheds

                            No. of   Travel Within     Travel Between    Time at      Total per Site
                            Sites    Watershed         Watersheds        Each Site
Scenario                             km      Hours     km      Hours     Hours        km      Hours
1. Synoptic                 8        32      0.75      0       0         0.5          4       0.6
2. Compliance               2        25      0.5       0       0         0.5          12.5    0.75
3C. Above/Below (Conc.)     2        25      0.5       0       0         0.5          12.5    0.75
3L. Above/Below (Load)      2        25      0.5       0       0         1.5          12.5    1.75
4. Multiple                 20       96      2.5       0       0         0.5          4.8     0.625
5C. Trend (Conc.)           1        0       0         0       0         0.5          0       0.5
5L. Trend (Load)            1        0       0         0       0         1.5          0       1.5
6P. Paired (Perennial)      2        0       0         48      1         1.5          24      2
6I. Paired (Intermittent)   2        0       0         48      1         1.5          24      2
7. Soil Test                20       80      2         0       0         0.75         4       0.85
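
The "Total per Site" columns of Table 9A3-2 appear to spread travel evenly across sites and then add the time spent at each site. A minimal sketch under that assumption follows; the function name `per_site_totals` is hypothetical and the formula is inferred from the table values.

```python
def per_site_totals(sites, within_km, within_hours, between_km, between_hours, hours_at_site):
    """Return (km per site, hours per site), sharing all travel equally among the sites."""
    km_per_site = (within_km + between_km) / sites
    hours_per_site = hours_at_site + (within_hours + between_hours) / sites
    return km_per_site, hours_per_site


# Inputs copied from Table 9A3-2:
# (sites, within km, within hours, between km, between hours, hours at each site)
SCENARIOS = {
    "1. Synoptic": (8, 32, 0.75, 0, 0, 0.5),
    "3L. Above/Below (Load)": (2, 25, 0.5, 0, 0, 1.5),
    "4. Multiple": (20, 96, 2.5, 0, 0, 0.5),
    "6P. Paired (Perennial)": (2, 0, 0, 48, 1, 1.5),
    "7. Soil Test": (20, 80, 2, 0, 0, 0.75),
}

per_site = {name: per_site_totals(*inputs) for name, inputs in SCENARIOS.items()}
for name, (km, hours) in per_site.items():
    print(f"{name}: {km:.1f} km and {hours:.3f} h per site")
```

For the synoptic survey this gives about 0.59 hours per site, which the table reports rounded to 0.6.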
Quality Assurance Project Plans (QAPPs)
Monitoring proposals are assumed to be QAPPs prepared in 16 hours by a team that includes an expert
and support staff at a cost of $1,400 for each scenario.
Watershed characterization
Watershed characterization costs apply only to Scenario 1, including a windshield survey (240 km,
8 hours) and a review of available data and maps. For all other scenarios it is assumed that the watershed
has been suitably characterized for development of the monitoring program.
9.5.1.1 Site selection and establishment
For Scenarios 1 and 2 (synoptic and compliance) site selection is assumed to be a desktop exercise,
requiring two staff for four hours each. It is assumed that site selection for Scenario 7 involves more time
because information must be gathered to find 20 fields via a random selection process. Two staff for
20 hours each is assumed for this effort, with any additional labor provided by cooperators within the
watershed. For Scenarios 3-6 it is assumed that three staff each devote 2 hours of paper investigation to
each monitoring site prior to traveling to the watersheds for field investigation.
Field costs for site selection include travel (to and within the watersheds) and labor. Costs assumed for
field work under Scenarios 3-6 are summarized in Table 9A3-3. It is assumed that an additional person is
needed for site selection that involves installation of a sampling shed and for Scenario 4 because
20 subwatersheds must be selected.
Monitoring site establishment (as needed) costs are included in Scenarios 3 through 6, with greater cost
for sites with continuous discharge measurement and automated samplers. Major materials and equipment
assumed for stations at which continuous flow is measured are summarized in Table 9A3-4. A tipping
rain gauge, data logger, and software are purchased for Scenarios 3L, 5L, 6P, and 6I. Plastic rain gauges
are purchased for Scenarios 3C and 5C, while available local precipitation records are used for all other
scenarios.
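Table 9A3-3. Field work costs for site selection
| Scenario [# stations] | 1-Way Distance from Base (km/vehicle) | Travel and Site Investigation and Selection (km/vehicle) | Travel and Site Investigation and Selection (hours/person) | Number of Staff / Number of Vehicles | Number of Overnight Stays¹ |
|---|---|---|---|---|---|
| 3C. Above/Below (conc.) [2] | 0 | 50 | 5 | 2/1 | 0 |
| 3C. Above/Below (conc.) [2] | 240 | 530 | 10.5 | 2/1 | 0 |
| 3C. Above/Below (conc.) [2] | 480 | 1,010 | 16 | 2/1 | 1 |
| 3L. Above/Below (load) [2] | 0 | 50 | 5 | 3/1 | 0 |
| 3L. Above/Below (load) [2] | 240 | 530 | 10.5 | 3/1 | 0 |
| 3L. Above/Below (load) [2] | 480 | 1,010 | 16 | 3/1 | 1 |
| 4. Multiple Watershed [20] | 0 | 322 | 48 | 3/1 | 3 |
| 4. Multiple Watershed [20] | 241 | 804 | 53.5 | 3/1 | 4 |
| 4. Multiple Watershed [20] | 483 | 1,287 | 59 | 3/1 | 4 |
| 5C. Trend (conc.) [1] | 0 | 10 | 2.5 | 2/1 | 0 |
| 5C. Trend (conc.) [1] | 240 | 490 | 8 | 2/1 | 0 |
| 5C. Trend (conc.) [1] | 480 | 970 | 13.5 | 2/1 | 1 |
| 5L. Trend (load) [1] | 0 | 10 | 2.5 | 3/1 | 0 |
| 5L. Trend (load) [1] | 240 | 490 | 8 | 3/1 | 0 |
| 5L. Trend (load) [1] | 480 | 970 | 13.5 | 3/1 | 1 |
| 6P. Paired (perennial) [2] | 0 | 40 | 5 | 3/1 | 0 |
| 6P. Paired (perennial) [2] | 240 | 520 | 10.5 | 3/1 | 0 |
| 6P. Paired (perennial) [2] | 480 | 1,000 | 16 | 3/1 | 1 |
| 6I. Paired (intermittent) [2] | 0 | 40 | 5 | 3/1 | 0 |
| 6I. Paired (intermittent) [2] | 240 | 520 | 10.5 | 3/1 | 0 |
| 6I. Paired (intermittent) [2] | 480 | 1,000 | 16 | 3/1 | 1 |
¹ Except where the contractor is based within the watershed, overnight lodging was assumed as needed to keep the length of work days
reasonable (generally 12 hours or less).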
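The overnight-stay footnote can be illustrated with a simple rule of thumb: split the hours per person into work days of at most 12 hours and count the nights between them. The sketch below is an assumption made for illustration, not the rule documented for the spreadsheet. It reproduces the overnight-stay values listed in Table 9A3-3, but the footnote's "generally" signals that exceptions exist, so it should not be assumed to hold for other tables.

```python
import math


def overnight_stays(hours_per_person: float, max_day_hours: float = 12.0) -> int:
    """Nights needed if no work day exceeds max_day_hours (illustrative rule)."""
    days = max(1, math.ceil(hours_per_person / max_day_hours))
    return days - 1


# (hours/person, overnight stays) pairs taken from Table 9A3-3
TABLE_9A3_3_ROWS = [
    (5, 0), (10.5, 0), (16, 1),      # Scenarios 3C, 3L, 6P, 6I
    (48, 3), (53.5, 4), (59, 4),     # Scenario 4
    (2.5, 0), (8, 0), (13.5, 1),     # Scenarios 5C, 5L
]

for hours, nights in TABLE_9A3_3_ROWS:
    print(hours, nights, overnight_stays(hours))
```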
Table 9A3-4. Major equipment and materials costs for stations measuring continuous discharge
| Cost Item | Unit Cost |
|---|---|
| Build sampling shed (labor and materials) | $2,000 |
| Connection to power grid | $800 |
| Staff gage, post, and post driver | $154 |
| Automatic sampler with bubble flow module, battery backup, 2-bottle kit, data transfer device, and software | $10,530 |
| Pygmy-type current meter w/data logger | $2,015 |
Six hours are added per station (18 person-hours for a three-person crew) in cases where a monitoring shed
is installed for automatic sampling equipment. Table 9A3-5 summarizes travel and labor assumptions for site
establishment field work for Scenarios 3L, 5L, 6P, and 6I.
Table 9A3-5. Site establishment costs for sites designed for load estimation
| Scenario [# stations] | 2-Way Travel to Site¹ (km/vehicle) | 2-Way Travel to Site¹ (hours/person) | Shed Construction and Setup (hours/person) | # Staff / # Vehicles | Total Without Discharge Observation (hours/person) | Hours Added for Discharge Observation² (hours/person) | Total (hours/person) | # Nights |
|---|---|---|---|---|---|---|---|---|
| 3L. Above/Below [2] | 0 | 0 | 12 | 3/1 | 12 | 0 | 12 | 0 |
| 3L. Above/Below [2] | 480 | 5.5 | 12 | 3/1 | 17.5 | 3 | 20.5 | 1 |
| 3L. Above/Below [2] | 960 | 11 | 12 | 3/1 | 23 | 3 | 26 | 2 |
| 5L. Trend [1] | 0 | 0 | 6 | 3/1 | 6 | 0 | 6 | 0 |
| 5L. Trend [1] | 480 | 5.5 | 6 | 3/1 | 11.5 | 1.5 | 13 | 0 |
| 5L. Trend [1] | 960 | 11 | 6 | 3/1 | 17 | 1.5 | 18.5 | 1 |
| 6P. Paired [2] | 48 | 1 | 12 | 3/1 | 13 | 0 | 13 | 0 |
| 6P. Paired [2] | 528 | 6.5 | 12 | 3/1 | 18.5 | 3 | 21.5 | 1 |
| 6P. Paired [2] | 1,008 | 12 | 12 | 3/1 | 24 | 3 | 27 | 2 |
| 6I. Paired [2] | 48 | 1 | 12 | 3/1 | 13 | 0 | 13 | 0 |
| 6I. Paired [2] | 528 | 6.5 | 12 | 3/1 | 18.5 | 3 | 21.5 | 1 |
| 6I. Paired [2] | 1,008 | 12 | 12 | 3/1 | 24 | 3 | 27 | 2 |
¹ Paired watersheds are assumed to be 24 km apart. Above/below sites are assumed to be less than 1 km apart.
² Hours were added to perform a discharge observation at each site where long-distance travel was involved and pollutant load estimation is
planned.
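The totals in Table 9A3-5 can be rebuilt from their components. The sketch below is a minimal illustration: the 6 hours per station for shed installation comes from the text, while the 1.5 hours per station for a discharge observation is inferred from the table values (3 hours for two stations, 1.5 hours for one) and should be treated as an assumption. The function name and arguments are hypothetical.

```python
SHED_HOURS_PER_STATION = 6.0       # stated in the text
DISCHARGE_HOURS_PER_STATION = 1.5  # inferred from Table 9A3-5 (assumption)
CREW_SIZE = 3                      # staff per vehicle in Table 9A3-5


def establishment_hours(travel_hours: float, stations: int, long_distance: bool) -> float:
    """Hours per person for site establishment at load-estimation sites."""
    shed = SHED_HOURS_PER_STATION * stations
    # Discharge observations are added only when long-distance travel is involved.
    discharge = DISCHARGE_HOURS_PER_STATION * stations if long_distance else 0.0
    return travel_hours + shed + discharge


# Scenario 3L at 480 km round trip: 5.5 h travel, 2 stations, long-distance travel
per_person = establishment_hours(5.5, stations=2, long_distance=True)
print(per_person, per_person * CREW_SIZE)  # hours per person, total person-hours
```

For the 3L example this gives 20.5 hours per person, matching the table, or 61.5 person-hours for the three-person crew.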
Site Demolition and Restoration
Site demolition and restoration is only required for sites with sampling sheds. It is assumed that 3 people
are needed for this activity, each working 3 hours at each monitoring station. Assumptions are
summarized in Table 9A3-6.
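Table 9A3-6. Site demolition and restoration costs
| Scenario [# stations] | 2-Way Travel to Site¹ (km/vehicle) | 2-Way Travel to Site¹ (hours/person) | Site Demolition and Restoration (hours/person) | # Staff / # Vehicles | Total (hours/person) | # Nights |
|---|---|---|---|---|---|---|
| 3L. Above/Below [2] | 0 | 0 | 6 | 3/1 | 6 | 0 |
| 3L. Above/Below [2] | 480 | 5.5 | 6 | 3/1 | 11.5 | 0 |
| 3L. Above/Below [2] | 960 | 11 | 6 | 3/1 | 17 | 1 |
| 5L. Trend [1] | 0 | 0 | 3 | 3/1 | 3 | 0 |
| 5L. Trend [1] | 480 | 5.5 | 3 | 3/1 | 8.5 | 0 |
| 5L. Trend [1] | 960 | 11 | 3 | 3/1 | 14 | 1 |
| 6P. Paired [2] | 48 | 1 | 6 | 3/1 | 7 | 0 |
| 6P. Paired [2] | 528 | 6 | 6 | 3/1 | 12 | 0 |
| 6P. Paired [2] | 1,008 | 11.5 | 6 | 3/1 | 17.5 | 1 |
| 6I. Paired [2] | 48 | 1 | 6 | 3/1 | 7 | 0 |
| 6I. Paired [2] | 528 | 6 | 6 | 3/1 | 12 | 0 |
| 6I. Paired [2] | 1,008 | 11.5 | 6 | 3/1 | 17.5 | 1 |
¹ Paired watersheds are assumed to be 24 km apart. Above/below sites are assumed to be less than 1 km apart.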
Sample Analysis
Analytical methods for water quality variables included in the spreadsheet were obtained from various
sources such as NEMI (2006). Constraints associated with these methods (e.g., cooling samples to 4°C for
suspended sediment, and pre-acidification for hardness) are reflected in the cost estimates through, for
example, the purchase of refrigerated samplers and sample preservatives.
Sample analysis for total P assumes EPA Method 365.4 (NEMI 2006) at a cost of $21 per sample. Soil
samples under Scenario 7 are analyzed for soil P (Mehlich 3), textural class, and organic matter, at a total
cost of $26 per sample. Soil samples are assumed to be sent by ground shipment to the laboratory.
The number of samples analyzed is increased by 10% for QA/QC.
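As a rough illustration of how the unit costs and the 10% QA/QC allowance combine, the sketch below estimates annual laboratory cost. Rounding the QA/QC-adjusted sample count up to a whole sample is an assumption made here, as are the function names; the spreadsheet may handle fractional samples differently.

```python
TP_COST = 21    # dollars per total P sample (EPA Method 365.4)
SOIL_COST = 26  # dollars per soil sample (soil P, textural class, organic matter)


def samples_with_qc(n_samples: int) -> int:
    """Increase the sample count by 10% for QA/QC, rounding up (assumed)."""
    return (n_samples * 11 + 9) // 10  # integer form of ceil(1.1 * n)


def lab_cost(n_samples: int, unit_cost: int) -> int:
    """Annual laboratory cost in dollars, including QA/QC samples."""
    return samples_with_qc(n_samples) * unit_cost


# Weekly TP grab samples at 2 sites (52 weeks assumed) and 20 annual soil samples
print(lab_cost(52 * 2, TP_COST))
print(lab_cost(20, SOIL_COST))
```

With these assumptions, 104 TP samples become 115 analyses ($2,415 per year) and 20 soil samples become 22 analyses ($572 per year).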
Land Use/Treatment Tracking
Tracking of BMP implementation is assumed to occur twice per year under Scenarios 3, 5, and 6. The
baseline assumption for tracking effort within a 12-digit HUC is 240 km (150 mi) driving and 8 hours
R/T each time, with variations across scenarios due to differing monitoring scales and specifics. Travel
distances and times to the watershed are added as appropriate.
For Scenario 4 it is assumed that a cooperator (e.g., NRCS) provides the data for the 20 subwatersheds on
an annual basis; additional observations can be made during the 30-minute visits for grab sampling in
each of the 405-ha (1,000-acre) subwatersheds. Under Scenario 7, annual data on organic and inorganic
nutrient application rates and crop yields per field are assumed to be provided by a cooperator. The
resulting assumptions are summarized in Table 9A3-7.
Table 9A3-7. Driving and labor assumptions for land use/treatment tracking
| Scenario | Drive Distance Within Watershed¹ | Hours Including Drive Time | Frequency (#/Yr) | Comments |
|---|---|---|---|---|
| 1. Synoptic | n/a | n/a | n/a | Not done. |
| 2. Compliance | n/a | n/a | n/a | Not done. |
| 3. Above/Below | 128 km (80 mi) | 6 | 2 | Only part of watershed is tracked. |
| 4. Multiple | n/a | n/a | n/a | Not done. Data provided by a cooperating agency. |
| 5. Trend | 240 km (150 mi) | 8 | 2 | Baseline assumption. |
| 6. Paired | 290 km (180 mi) | 9 | 2 | Tracking intensity varies by source and location. |
| 7. Soil Test | n/a | n/a | n/a | Not done. Data provided by a cooperating agency. |
¹ Does not include driving distance and time to arrive at watershed.
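The annual within-watershed tracking effort follows directly from Table 9A3-7 by multiplying the per-trip distance and hours by the tracking frequency. The sketch below is illustrative only: it excludes travel to the watershed, as the table footnote notes, and the variable names are hypothetical.

```python
# (km per trip, hours per trip, trips per year) from Table 9A3-7
TRACKING = {
    "3. Above/Below": (128, 6, 2),
    "5. Trend": (240, 8, 2),
    "6. Paired": (290, 9, 2),
}


def annual_tracking(scenario: str) -> tuple[int, int]:
    """Return (km per year, hours per year) of within-watershed tracking."""
    km, hours, trips = TRACKING[scenario]
    return km * trips, hours * trips


for name in TRACKING:
    print(name, annual_tracking(name))
```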
Supplies
Cost estimates include the purchase of ice for each sampling event and annual purchases of 1-liter HDPE
bottles and sample preservative.
Data Analysis and Reports
Data analysis and reporting costs are set higher for the first and last years compared to the "middle" years.
For example, involvement of higher paid staff is greater in the first and final years because of the
challenges faced in developing data management and analysis procedures and rules. It is assumed that
lower level staff can play a greater role in the middle years with oversight from senior staff.
The cost for analysis and reporting is greater for projects estimating pollutant loads versus those simply
collecting concentration data. Synoptic surveys (Scenario 1) and compliance (Scenario 2) monitoring
efforts are assumed to require less time than other scenarios because of greater simplicity. Data analysis
and reporting for multiple-watershed studies (Scenario 4) is assumed to be the most time consuming
despite less frequent sampling than found in Scenarios 3 and 6 because information is obtained from
20 subwatersheds. For Scenario 7, more hours are assumed for data analysis than for reporting in all cases
because reports are assumed to be short and more straightforward. Table 9A3-8 summarizes assumptions
for data analysis and reporting.
Table 9A3-8. Labor assumptions for data analysis and reporting
| Scenario | First-Year Report: Data Analysis (Hours) | First-Year Report: Report Preparation (Hours) | Middle-Year Reports: Data Analysis (Hours) | Middle-Year Reports: Report Preparation (Hours) | Final-Year Report: Data Analysis (Hours) | Final-Year Report: Report Preparation (Hours) |
|---|---|---|---|---|---|---|
| 1. Synoptic | 12 | 18 | n/a | n/a | n/a | n/a |
| 2. Compliance | 10 | 12 | 7 | 8 | 10 | 12 |
| 3C. Above/Below | 15 | 22 | 12 | 12 | 17 | 22 |
| 3L. Above/Below | 20 | 28 | 14 | 14 | 22 | 26 |
| 4. Multiple | 36 | 34 | 26 | 24 | 38 | 32 |
| 5C. Trend | 14 | 18 | 11 | 9 | 19 | 18 |
| 5L. Trend | 16 | 26 | 13 | 12 | 20 | 23 |
| 6P. Paired | 20 | 28 | 14 | 14 | 22 | 26 |
| 6I. Paired | 20 | 28 | 14 | 14 | 22 | 26 |
| 7. Soil Test | 16 | 12 | 13 | 10 | 20 | 11 |
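To see how the first-, middle-, and final-year assumptions in Table 9A3-8 add up over a project, the sketch below totals analysis and reporting hours for a multi-year effort. The project length is a parameter chosen by the user, not a value fixed by this appendix, and the names used are illustrative. Scenario 1 is omitted because it has only a first-year report.

```python
# (data analysis hours, report preparation hours) for the first, middle, and
# final years, from Table 9A3-8
REPORT_HOURS = {
    "2":  ((10, 12), (7, 8), (10, 12)),
    "3C": ((15, 22), (12, 12), (17, 22)),
    "3L": ((20, 28), (14, 14), (22, 26)),
    "4":  ((36, 34), (26, 24), (38, 32)),
    "5C": ((14, 18), (11, 9), (19, 18)),
    "5L": ((16, 26), (13, 12), (20, 23)),
    "6P": ((20, 28), (14, 14), (22, 26)),
    "6I": ((20, 28), (14, 14), (22, 26)),
    "7":  ((16, 12), (13, 10), (20, 11)),
}


def total_hours(scenario: str, years: int) -> int:
    """Total analysis plus reporting hours over a project of `years` years."""
    if years < 2:
        raise ValueError("a first and a final year are required")
    first, middle, final = REPORT_HOURS[scenario]
    return sum(first) + (years - 2) * sum(middle) + sum(final)


print(total_hours("3L", 5))  # first year + three middle years + final year
```

For a five-year project under Scenario 3L this gives 48 + 3 × 28 + 48 = 180 hours.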
Scenario Summaries
Scenario 1: Preliminary Synoptic Survey
Under this scenario, grab sampling is performed at 8 sites on two trips (low and high flow conditions). A
team of 3 people conducts a windshield survey to characterize the watershed, but subsequent land
use/land treatment tracking is not performed. Meteorological and flow data are assumed to be obtained as
part of the desktop analysis of the watershed.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and cooler
• Sampling: TP, SSC, B.O.D., E. coli, total coliform, discharge, and suspended sediment
concentration
Scenario 2: Compliance Monitoring
Under this scenario, grab sampling (4x/yr) is performed at 2 sites. Land use/land treatment tracking is not
performed.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and cooler
• Sampling: TP
Scenario 3: Above/Below Monitoring
This scenario has two options. Land use/land treatment tracking (e.g., type and number of practices, acres
treated) is performed 2x/yr for both options via windshield survey and collection of data from cooperators
(e.g., USDA, Soil and Water Conservation District); emphasis is placed on the area between the above
and below stations.
3C. Concentration Option: Weekly grab samples are collected at 2 sites.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and cooler, 2 plastic rain gages, 2 staff gages
• Sampling: TP, stage
3L. Load Option: Weekly flow-proportional composite samples are collected at 2 sites.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and cooler, 2 sample sheds, 2 tipping bucket rain gages and data logger,
2 staff gages, 2 refrigerated automatic samplers (with bubble flow module, battery backup, 2-bottle
kit), data transfer device and software, 2 surge protectors, pygmy-type current meter and data
logger
• Sampling: TP, continuous flow
Scenario 4: Multiple-Watershed Monitoring
Under this scenario there are 20 small watersheds (n=20), 10 with BMPs and 10 without. Water quality
sampling occurs six times per year. It is assumed that land use/land treatment tracking is performed 2x/yr
by a cooperator, with additional observations made during water quality sampling runs.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and cooler, 20 staff gages
• Sampling: TP, stage
Scenario 5: Trend Monitoring
This scenario has one monitoring site and two options. Land use/land treatment tracking is performed
2x/yr for both options via windshield survey and collection of data from cooperators (e.g., USDA, Soil
and Water Conservation District). Data are collected on the nature, extent, and timing of BMP
implementation - as well as operation and maintenance after implementation.
5C. Concentration Option: Twice-monthly grab samples.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and cooler, plastic rain gage, staff gage
• Sampling: TP, stage, precipitation
5L. Load Option: Weekly flow-proportional composite samples.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and cooler, sample shed, tipping bucket rain gage and data logger, staff
gage, refrigerated automatic sampler (with bubble flow module, battery backup, 2-bottle kit), data
transfer device and software, surge protector, pygmy-type current meter and data logger
• Sampling: TP, continuous flow, precipitation
Scenario 6: Paired-Watershed Monitoring
This scenario has two monitoring sites (treated and untreated) and two options that address load
estimation for a perennial and intermittent stream setting. For continuously flowing streams, a single
weekly composite sample is collected for analysis. For intermittent streams, a flow-proportional
composite sample is collected during each of 20 runoff events each year. Land use/land treatment tracking
is performed 2x/yr in both watersheds for both options via windshield survey and collection of data from
cooperators (e.g., USDA, Soil and Water Conservation District).
6I. Intermittent Stream Option: Twenty runoff events sampled per year at each site.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and cooler, 2 sample sheds, 2 tipping bucket rain gages and data logger,
2 staff gages, 2 refrigerated automatic samplers (with bubble flow module, battery backup, 2-bottle
kit), data transfer device and software, 2 surge protectors, pygmy-type current meter and data
logger
• Sampling: TP, continuous flow, precipitation
6P. Perennial Stream Option: Weekly flow-proportional composite samples at each site.
Equipment and sampling assumptions for this scenario are:
• Equipment: sample bottles and cooler, 2 sample sheds, 2 tipping bucket rain gages and data logger,
2 staff gages, 2 refrigerated automatic samplers (with bubble flow module, battery backup, 2-bottle
kit), data transfer device and software, 2 surge protectors, pygmy-type current meter and data
logger
• Sampling: TP, continuous flow, precipitation
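The water quality sampling frequencies described for Scenarios 1 through 6 translate into sample counts as sketched below. Treating "weekly" as 52 and "twice-monthly" as 24 samples per year is an assumption, and the dictionary layout is illustrative. Counts exclude QA/QC samples, and Scenario 1 is a one-time survey rather than an annual effort.

```python
# (sampling events per site, number of sites) for each scenario
SAMPLING = {
    "1":  (2, 8),    # two trips (low and high flow), 8 sites; one-time survey
    "2":  (4, 2),    # grab samples 4x/yr at 2 sites
    "3C": (52, 2),   # weekly grab samples (52/yr assumed)
    "3L": (52, 2),   # weekly composite samples
    "4":  (6, 20),   # six times per year in 20 watersheds
    "5C": (24, 1),   # twice-monthly grab samples (24/yr assumed)
    "5L": (52, 1),   # weekly composite samples
    "6P": (52, 2),   # weekly composite samples at each site
    "6I": (20, 2),   # 20 runoff events per year at each site
}

sample_counts = {name: events * sites for name, (events, sites) in SAMPLING.items()}

for name, count in sample_counts.items():
    print(name, count)
```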
Scenario 7: Soil Testing
This scenario involves random selection of 20 agricultural fields for annual soil sampling. Ten fields are
beginning to adopt nutrient management, and the other ten are conventionally managed. Local
precipitation records are used in lieu of on-site collection of precipitation data. Annual data on nutrient
application and crop yields are provided by a cooperator (e.g., the landowner, USDA).
Equipment and sampling assumptions for this scenario are:
• Equipment: 2 soil probes, 2 buckets, and a supply of bags and ties for soil samples
• Sampling: soil P, textural class (covariate), and organic matter (covariate)
Appendix 9-4. Cost Estimates for Five-Year Trend and Above/Below Monitoring
As described in Appendix 9-1, a large number of assumptions must be made to estimate costs for various
monitoring scenarios. Thus, while these cost estimates are intended to be informative, they are more or
less relevant to any particular monitoring effort based on how well the assumptions match the realities of
that specific situation. Cost estimates given here are more likely to be high than low because it is always
assumed that contractors perform the monitoring (i.e., no use of in-house labor that was hired to do
monitoring) and all monitoring equipment must either be leased or purchased.
The basic scenarios (n=160) assumed for this analysis are summarized in Table 9A4-1. The trend design
assumes one monitoring site and the above/below design assumes two monitoring sites. All monitoring is
assumed to continue for five years. Tracking of land use and land treatment is assumed to occur twice per
year, with costs identical for all scenarios. The five-year costs for this tracking range from about $100 to
$16,500 for all scenarios. Costs vary considerably based on the size of and distance to the watershed.
Table 9A4-1. Factors used in creating cost estimation scenarios
| Scenario | Monitoring Variables Set (and source) | Sampling Frequency (times/year) |
|---|---|---|
| Biological | BioK (Table 9A1-4) | 2 |
| Nutrient and Sediment Concentration | NSC (Table 9A1-2) | 26 |
| Nutrient and Sediment Load | NSL (Table 9A1-3) | 26 |
| Sondes for Nutrients and Turbidity | SNT (Table 9A1-5) | 26 |
Monitoring Designs: Trend and Above/Below
Watershed Sizes (ha): 202; 2,023; 10,117; 20,234
Distances to Watershed¹ (km): 0, 40, 80, 121, 161
¹ Distance sampling team must travel to reach the watershed or nearest watershed being monitored.
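The count of 160 basic scenarios is the product of the factors in Table 9A4-1: 4 variable sets × 2 designs × 4 watershed sizes × 5 distances. The minimal sketch below enumerates them. Sampling frequency is tied to the variable set (2 per year for BioK, 26 per year for the others, as in Table 9A4-3), so it is not an independent factor. Variable names are illustrative.

```python
from itertools import product

# Sampling frequency (times/year) is determined by the variable set.
VARIABLE_SETS = {"BioK": 2, "NSC": 26, "NSL": 26, "SNT": 26}
DESIGNS = ["Trend", "Above/Below"]
WATERSHED_SIZES_HA = [202, 2023, 10117, 20234]
DISTANCES_KM = [0, 40, 80, 121, 161]

scenarios = [
    {
        "variable_set": vset,
        "samples_per_year": VARIABLE_SETS[vset],
        "design": design,
        "size_ha": size,
        "distance_km": distance,
    }
    for vset, design, size, distance in product(
        VARIABLE_SETS, DESIGNS, WATERSHED_SIZES_HA, DISTANCES_KM
    )
]

print(len(scenarios))
```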
Labor costs for these estimates use the same rates shown in Table 9A2-1. All scenarios include a mix of
fixed labor assumptions (e.g., QAPP development cost is $1,400 for all¹ scenarios) and variable labor
assumptions that are based on the monitoring design and watershed size. For example, watershed
characterization costs vary depending on design and watershed size as illustrated in Table 9A4-2. A
simple algorithm in the simplified spreadsheet estimates travel distances and drive times based on
watershed size, affecting both labor and vehicle costs for watershed characterization.
¹ "All" scenarios refers to the base scenarios for which pay rates are those found in Table 9A2-1.
Table 9A4-2. Watershed characterization costs as function of design and watershed size
| Design | Watershed Size: 202 ha | Watershed Size: 2,023 ha | Watershed Size: 10,117 ha | Watershed Size: 20,234 ha |
|---|---|---|---|---|
| Trend | $1,516 | $1,780 | $2,952 | $4,790 |
| Above/Below | $1,888 | $2,152 | $3,324 | $5,162 |
Distance to watershed is assumed to be 80 km.
Labor and vehicle requirements for sampling vary depending upon design, watershed size, and
monitoring variables set. The variability of labor costs for data analysis and report development is
illustrated in Table 9A4-3. These costs reflect the assumption that biological data require more time for
analysis (at species level) than chemical/physical data collected using the other variable sets. Estimation
and analysis of pollutant loads, likewise, is assumed to be more time-consuming than for either sonde or
concentration data. Spreadsheet users, of course, can change these assumptions.
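Table 9A4-3. Variability of costs for data analysis and reporting
| Design | Variable Set | Samples/Year | 5-Year Labor Cost for Data Analysis and Reporting |
|---|---|---|---|
| Trend | BioK | 2 | $15,889 |
| Trend | NSC | 26 | $10,051 |
| Trend | NSL | 26 | $16,271 |
| Trend | SNT | 26 | $11,899 |
| Above/Below | BioK | 2 | $27,068 |
| Above/Below | NSC | 26 | $16,177 |
| Above/Below | NSL | 26 | $27,047 |
| Above/Below | SNT | 26 | $19,873 |
Assumes 2,023-ha watershed and 50-mile (80-km) distance.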
QA/QC is addressed in a number of ways. For sample analysis, sample size is increased by 10% to
account for replicates. In addition, a QA/QC officer is assumed to join the sampling team once per year,
and stage-discharge relationships are checked 8 times per year.
The results of running 160 scenarios for these above/below and trend monitoring designs are discussed in
section 9.3.3 and summarized in Figure 9-4. Paired designs would have costs similar to those for the
above/below design.
Additional cost estimates were run using a salary adjustment factor to see how this would affect total
costs. Salaries were adjusted across the board by reducing them to 70%, 50%, and 0% of those in Table
9A2-1. Similarly, a rough assessment of the effects of equipment costs on total costs was performed by
estimating costs where all or no equipment was purchased. These two equipment scenarios were also
combined with the four salary options (0%, 50%, 70%, and 100% of the rates in Table 9A2-1) to explore
the impacts of both adjustments on total costs. The results of these analyses are presented in section 9.3.3
and summarized in Table 9A3-3.
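The salary and equipment adjustments described above form a small grid of 4 salary options × 2 equipment options. The sketch below enumerates that grid and shows how a labor cost would scale if it is taken to be directly proportional to the salary percentage. That proportionality is an assumption made for illustration, and the example base cost is the Trend/NSL figure from Table 9A4-3. Names are hypothetical.

```python
from itertools import product

SALARY_PERCENTS = (100, 70, 50, 0)  # percent of the rates in Table 9A2-1
EQUIPMENT_OPTIONS = ("all equipment purchased", "no equipment purchased")

# Every salary option combined with every equipment option
combinations = list(product(SALARY_PERCENTS, EQUIPMENT_OPTIONS))


def adjusted_labor_cost(base_cost: float, salary_percent: int) -> float:
    """Scale a labor cost by the salary percentage (assumes proportionality)."""
    return base_cost * salary_percent / 100


print(len(combinations))
for percent in SALARY_PERCENTS:
    print(percent, adjusted_labor_cost(16271, percent))
```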