University of Washington

   Seattle, Washington
                          December 1997

                          EPA 235-R97-001
   BIOLOGICAL MONITORING

   AND ASSESSMENT:

   USING MULTIMETRIC INDEXES

   EFFECTIVELY

   James R. Karr and Ellen W. Chu
    What to measure? •  How to decide?
    [Cover figure: plot with an axis labeled "Human influence"; other labels not recoverable]

-------
                                      NOTICE

The views expressed in this document are the authors' and do not necessarily reflect those of EPA
or the institutions with which the authors are affiliated. The official endorsement of the agency
should not be inferred. The purpose of this document is the objective facilitation of information
exchange among state and federal agencies, university scientists and students, and citizen groups.

The information in this document has been funded in part by the United States Environmental
Protection Agency under cooperative agreement CX-824131-01. It has been subjected to the
agency's peer review and has been approved for publication. Mention of trade names or commer-
cial products does not constitute endorsement or recommendation for use.
This document should be cited as:

Karr, J. R., and E. W. Chu. 1997. Biological Monitoring and Assessment: Using Multimetric Indexes
Effectively. EPA 235-R97-001. University of Washington, Seattle.

-------
BIOLOGICAL MONITORING AND ASSESSMENT:
      USING  MULTIMETRIC  INDEXES  EFFECTIVELY

                                     James R. Karr and Ellen W. Chu
          James R. Karr is a professor of fisheries and zoology and an adjunct professor
                      of civil engineering, environmental health, and public affairs
                                         at the University of Washington
                                        104 Fisheries Center, Box 357980
                                               Seattle, WA 98195-7980
                                          e-mail: jrkarr@u.washington.edu
                                     Ellen W. Chu is a biologist and editor
                                in the Department of Environmental Health
                                             University of Washington
                                          4225 Roosevelt Way NE #100
                                              Seattle, WA 98105-6099
                                           e-mail: ewc@u.washington.edu
                                                    Funded in part by
                               the United States Environmental Protection Agency
                                    under cooperative agreement CX-824131-01
                                                     December 1997
                                                    EPA 235-R97-001

-------
ACKNOWLEDGMENTS
           This report grew out of 25 years' research by James Karr and dozens of students
           and colleagues to develop and test multimetric indexes of biological integrity
           (IBIs). We, the authors, can thus take credit only for what this text says, not for all
           the excellent hard work on which it is based. Indeed, when we say we in this report,
           more often than not we mean those who have worked with Jim over the years and
           use his IBI approach, not just we, the authors. Sometimes, of course, we means all
           of us—we, the people of our nation, who depend on water resources.

           We, the authors, wish first to thank Leska Fore, who, along with Billie Kerans,
           advanced the definition of IBI's statistical properties. Leska also prepared a first
           draft, including figures, of a number of the report's sections. We particularly want
           to thank the following other colleagues who have worked with Jim over the years:
           J. Adams, P. Angermeier, C. Doberstein, D. Dudley, K. Fausch, O. Gorman,
           M. A. Hawke, E. Helmer, M. Jennings, D. Kimberling, B. Kleindl, S. Morley,
           A. Patterson, D. Ratcliffe, E. Rossano, I. Schlosser, L. Toth, and P. Yant.

           We appreciate the comments, criticisms, and lively discussion from Wayne Davis,
           Phil Larsen, Bob Hughes, Paul Angermeier, Rich Sumner, Eriko Rossano, Kurt
           Fausch, Billie Kerans, and several anonymous reviewers, which helped make this a
           better manuscript. We are grateful to Cathy Schwartz for redrawing all the figures
           and for designing and producing the book, and to Sherri Shultz for excellent
           proofreading.

           Finally, we must recognize those dedicated scientists and managers in federal and
           state agencies, especially Chris Yoder, Dan Dudley, Ed Rankin, Roger Thoma, and
           Jeff DeShon of Ohio EPA, whose work to bring multimetric biological assessment
           into the real world offers inspiration to all concerned about the continuing loss of
           biological integrity in the nation's waters.

           This report was requested by Wayne Davis (Project Officer) under US Environmen-
           tal Protection Agency Cooperative Agreement CX-824131-01 and further sup-
           ported by US Environmental Protection Agency Cooperative Agreement
           X-000878-01-6 (Marsha Lagerloef and Richard Sumner, Project Officers) and
           Department of Energy Cooperative Agreement DE-FC01-95-EW55084.S to the
           Consortium for Risk Evaluation with Stakeholder Participation (CRESP).

-------
ACKNOWLEDGMENTS                                                         ii


CONTENTS                                                                   iii

LIST OF FIGURES, TABLES, AND BOXES                                        vi


INTRODUCTION                                                              1

SECTION  I
AQUATIC RESOURCES ARE STILL DECLINING                                             5
    Premise  1
    Water resources are losing their biological components                      6

    Premise  2
    "Clean water" is not enough                                              8

    Premise  3
    Biological monitoring is essential to protect biological resources            10

SECTION  II
CHANGING WATERS AND CHANGING VIEWS  LED  TO BIOLOGICAL  MONITORING                   15
    Premise  4
    Changing waters and a changing society call for better assessment           16

    Premise  5
    Biological monitoring detects biological changes caused by humans         21

    Premise  6
    Ecological risk assessment and risk management depend on
    biological monitoring                                                   26

SECTION  III
MULTIMETRIC INDEXES  CONVEY BIOLOGICAL  INFORMATION                                  29
    Premise  7
    Understanding biological responses requires measuring across
    degrees of human influence                                              30

    Premise  8
    Only a few biological attributes provide reliable signals about
    biological condition                                                     35

    Premise  9
    Simple graphs reveal biological responses to human influences             38

    Premise  10
    Similar biological  attributes are reliable indicators in diverse
    circumstances                                                           44

    Premise  11
    Tracking complex  systems requires a measure integrating multiple factors    45

-------
                  Premise 12
                  Multimetric biological indexes incorporate levels from individuals
                  to landscapes                                                           47

                  Premise 13
                  Metrics are selected to yield relevant biological information at
                  reasonable cost                                                         51

                  Premise 14
                  Multimetric indexes are built from proven metrics and a scoring system     56

                  Premise 15
                  The statistical properties of multimetric indexes are known                 63

                  Premise 16
                  Multimetric indexes reflect biological responses to human activities         66

                  Premise 17
                  How biology and statistics are used is more important than taxon           71

                  Premise 18
                  Sampling protocols are well defined for fishes and invertebrates             73

                  Premise 19
                  The precision of sampling protocols can be estimated by evaluating
                  the components of variance                                              80

                  Premise 20
                  Multimetric indexes are biologically meaningful                           83

                  Premise 21
                  Multimetric protocols can work in environments other than streams         84

               SECTION  IV
               FOR A ROBUST MULTIMETRIC INDEX, AVOID COMMON PITFALLS                            89
                  Premise 22
                  Properly classifying sites  is key                                            90

                  Premise 23
                  Avoid focusing primarily on species                                       93

                  Premise 24
                  Measuring the wrong things sidetracks biological monitoring                95

                  Premise 25
                  Field work is more valuable than geographic information systems           97

                  Premise 26
                  Sampling everything is not the goal                                       98

                  Premise 27
                  Avoid probability-based sampling until metrics are defined                 99

                  Premise 28
                  Counting 100-individual subsamples yields too few data for
                  multimetric assessment                                                 101

-------
   Premise 29
   Avoid thinking in regulatory dichotomies                                107

   Premise 30
   Reference condition must be defined properly                            108

   Premise 31
   Statistical decision rules are no substitute for biological judgment          110

   Premise 32
   Multivariate statistical analyses often overlook biological knowledge        112

   Premise 33
   Assessing habitat cannot replace assessing the biota                       115

SECTION V
MANY CRITICISMS  OF MULTIMETRIC  INDEXES ARE MYTHS                                 117
   Myth 1
   "Biology is too variable to monitor"                                      118

   Myth 2
   "Biological assessment is circular"                                        120

   Myth 3
   "We can't prove that humans degrade living systems without
   knowing the mechanism"                                               122

   Myth 4
    "Indexes combine and thus lose information"                            124

   Myth 5
   "Multimetric indexes aren't effective because their statistical properties
   are uncertain"                                                          126

   Myth 6
   "A nontrivial effort is required to calibrate the index regionally"            127

   Myth 7
    "The sensitivity of multimetric indexes is unknown"                      129

SECTION VI
THE FUTURE IS NOW                                                          131
   Premise 34
   We can and must translate biological condition into legal standards        132

   Premise 35
   Citizen groups are changing their thinking faster than bureaucracies are     135

   Premise 36
   Can we afford healthy waters? We can afford nothing less                  138


SECTION VII
LITERATURE CITED                                                             139

-------
LIST OF FIGURES
 1.   Fish IBI plotted against chlorine concentration in east-central Illinois streams             13
 2.   Fish IBI for three treatment phases in Copper Slough, Illinois                           13
 3.   Relationships among kinds of variables in biological monitoring                         23
 4.   Classification system for ranking Japanese streams according to human influence          31
 5.   Benthic IBI for 115 Japanese streams in two groups and combined                       32
 6.   Benthic IBI plotted against impervious area for Puget Sound lowland streams             33
 7.   Benthic IBI for streams in or near Grand Teton National Park, Wyoming                 33
 8.   What to measure?                                                                  36
 9.   Taxa richness for Plecoptera and for sediment-intolerant taxa in the
    John Day Basin, Oregon                                                            37
10.  Two hypothetical metrics plotted against gradient of human influence                    39
11.  Hypothetical relationships between human influence and candidate
     biological metrics                                                                   39
12.  Taxa richness for Trichoptera plotted against percentage of watershed logged              41
13.  Relationship between human influence and hypothetical metric A                       41
14.  Relative abundance of tolerant taxa plotted against gradient of human influence           42
15.  Number of fish species plotted against stream order for Illinois streams                  42
16.  Mayfly taxa richness plotted against impervious area in Puget Sound lowland streams      43
17.   Taxa richness of mayflies, stoneflies, and caddisflies for the North Fork
     Holston River, Tennessee                                                            43
18.  Trichoptera presence expressed as individuals, relative abundance, and richness            53
19.  Number of invertebrates plotted against impervious area for Puget Sound streams          54
20.  Range and numeric values for six B-IBI metrics for two southwestern
     Oregon streams                                                                     60
21.  Plots of two metrics showing contrasting ways to establish scoring criteria                 61
22.  Cumulative distribution functions for two B-IBI metrics used in southwestern Oregon     62
23.  Distribution of fish IBI values from bootstrapping analysis for four Ohio streams         64
24.  Power curves for the fish IBI estimated from nine Ohio streams                          65
25.  Benthic IBI plotted against area logged in southwestern Oregon                         68
26.  Fish IBI values for Jordan Creek, Illinois                                              68
27.   Benthic IBI values in the North Fork Holston River                                    68
28.  Distribution of sites in six midwestern areas according to biological condition             69
29.  Fish IBI values along the Scioto River,  Ohio, 1979 and 1991                             70
30.  Changes in fish IBI values over time in Wertz Drain at Wertz Woods,
    Allen County, Indiana                                                              70
31. Influence of number of sample replicates on estimate of predator relative abundance      75
32.  "Dose-response curves" for family-, genus-, and species-level identifications               77
33.  Sources of variance in samples of herbivorous zooplankton from northeastern lakes        81

-------
34. Components of variance for the B-IBI for Puget Sound lowland and
    Grand Teton streams                                                                82
35. Percentage of individuals in several avian trophic groups in forest fragments               85
36. Hanford Nuclear Reservation study sites for a terrestrial IBI                             86
37.  Changes in plant assemblages among 13 Hanford study sites, 1997                       87
38. Changes in arthropod assemblages among four Hanford study sites, 1997                 87
39. Maximum species richness lines for woodland and grassland streams                      92
40. Hypothetical species composition in two streams before and after human disturbance      94
41. Confidence interval for a fish IBI plotted against number of individuals                  102
42. Number of classes detectable by metrics and index for 10-metric B-IBI                   105
43. Number of clinger taxa plotted against human influence for Japanese streams             123

LIST OF TABLES
 1.  Examples from United States rivers of degradation in aquatic biota                        7
 2.  Elements, processes, and potential indicators of biological condition                       9
 3.  Key terms used in defining biological condition                                        36
 4.  Types of metrics, suggested numbers, and represented levels in the biological hierarchy     48
 5.  Sample biological attributes in five broad categories and their potential as metrics          52
 6.  Metrics that respond predictably to human  influence for various taxa and habitats          57
 7.  Potential metrics for benthic stream invertebrates                                       58
 8.  Fish IBI metrics                                                                    59
 9.  Five water resource altered by the cumulative effects  of human activity                    67
10. Metrics that respond predictably to human  influence across the Pacific Northwest          92
11. Ten-metric B-IBI based on study in six geographic regions                              103
12. Comparative costs of methods for evaluating water resource quality                     128

LIST OF BOXES
 1.  Narrow use of chemical criteria can damage water resources and waste money              12
 2.  How to sample benthic invertebrates                                                  78

-------
                                                            INTRODUCTION
    Can we afford clean water? Can we afford rivers and lakes and streams and oceans, which continue
    to make life possible on this planet? Can we afford life itself?... These questions answer themselves.

                                                            —Senator Edmund Muskie (1972)
The story of a continent is reflected in the biology of its rivers. And what a biologist
sees in North America's rivers is a history of damaged landscapes and undervalued
fresh waters. As a century of dramatic cultural and ecological change in the
United States draws to a close, outdated legal doctrines and weak implementation
of good laws dominate water resource policy throughout the nation. Will they
continue to do so in the twenty-first century?

Water resources are not simply water; their value to a society comes from more
than the quality and quantity of liquid water. Humans depend on living waters for
many essential goods and services, from drink and food  to cleansing of our wastes
to aesthetic and recreational renewal. One explicit, visionary statement in the 1972
Water Pollution Control Act Amendments (PL 92-500, now called the Clean Water
Act) acknowledged  the overarching importance of whole water resources: "The
objective of this Act is to restore and maintain the chemical, physical, and biologi-
cal integrity of the Nation's waters" [Clean Water Act (CWA) § 101(a)].

Although some progress has been made under this law in controlling point-source
pollution, especially organic effluent, other harmful and pervasive forms of degra-
dation—nonpoint pollution, altered hydrological regimes, habitat destruction, and
invasions by alien species—continue to degrade aquatic ecosystems. In short,
despite the clarity of the legal mandate, the condition of America's waters says
unequivocally that we have failed to achieve the Clean Water Act's objectives. How
can we reverse this trend?

The most direct and effective measure of the integrity of a water body is the status
of its living systems. Life depends on water. Do we expect waters that cannot
support healthy biological communities to provide us with the goods and services

-------
we need? Choosing and monitoring biological endpoints is thus fundamental for
assessing water resource quality and for charting a course for federal and state
programs to protect society's most basic interests.

Biological monitoring tracks the health of biological systems in much the same
way that investors track the health of the US economy. Biological monitoring aims
to detect change in living systems—specifically, change caused by humans. To
detect the effects of human activities on biological systems, biological monitoring
must study human disturbance apart from disturbances that occur naturally—a
crucial distinction that biological monitoring programs have too often lost sight of.
Tracking, evaluating, and communicating the condition of biological systems, and
the consequences of human activities for those systems, lie at the heart of biologi-
cal monitoring.

To put it another way, biological monitoring identifies ecological risks that are as
important to human health and well-being as the more obvious threats of toxic
pollution or vector-borne disease. Indeed, EPA's Science Advisory Board (SAB
1990) stipulated, "Attach as much importance to reducing ecological risk as is
attached to reducing human health risk." Halting the deterioration of the nation's
waters cannot be done if we continue to behave as if our actions had no ecological
risks (Karr 1995a).

Assessing ecological risks accurately depends on effective biological monitoring.
Included by EPA in its framework for ecological risk assessment (USEPA 1992b,
1994a,b, 1996d), biological monitoring aims to identify problems by assessing
biological condition (what EPA calls "characterization of ecological effects") and to
define the nature and magnitude of any problem. The results of these analyses
must then be communicated to citizens and decision makers, who will determine
what to do. Like human-health risk assessors, ecological risk assessors need reliable,
conceptually sound tools for each of these steps.

During a century of evolution, through changing human impacts on water and its
associated resources, biological monitoring programs have taken a variety of
approaches (Davis and Simon 1995; Karr 1998). The approach in this report,
the development of multimetric indexes of biological condition, began in 1981 with
the index of biological integrity, or IBI (Karr 1981). Now well documented as
effective for assessing ecological condition in a variety of management settings,
with many taxa, and in diverse geographic regions, multimetric biological indexes
are a logical next step in biological monitoring's evolution. Why? Principally
because these indexes evaluate ecological condition in terms of a system's ability to
support unimpaired living systems—in terms of the biota's ability to sustain itself—
ultimately the most relevant endpoint for sustaining human society.

In much the way economic indexes such as the Dow Jones industrial average and
the index of leading economic indicators combine many financial measures to
assess the state of the national economy, the index of biological integrity integrates
measurements of many biological attributes (metrics) to assess the condition of a
place. Metrics are chosen on the basis of whether they reflect specific and predict-
able responses of organisms to human activities. Ideal metrics should be relatively

-------
easy to measure and interpret. They should increase or decrease as human influ-
ence increases. They should be sensitive to a range of biological stresses, not
narrowly indicative of commodity production or threatened or endangered status.
Most important, biological attributes chosen as metrics must be able to discrimi-
nate human-caused changes from the background "noise" of natural variability.
Human impact is the focus of biological monitoring.
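
The way an index integrates several metrics can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the two metric names, the cut points, and the 1/3/5 scoring scale are assumptions chosen for the example, and a real index would calibrate them against sites spanning a range of human influence.

```python
# Hypothetical scoring criteria: (low_cut, high_cut, direction).
# direction = -1: metric decreases as human influence increases (e.g., taxa richness).
# direction = +1: metric increases as human influence increases (e.g., % tolerant individuals).
CRITERIA = {
    "taxa_richness": (10.0, 20.0, -1),
    "percent_tolerant": (20.0, 50.0, +1),
}


def score_metric(value, low_cut, high_cut, direction):
    """Score one metric as 5 (least altered), 3, or 1 (most altered)."""
    if direction < 0:
        # Higher values indicate better biological condition.
        if value > high_cut:
            return 5
        return 3 if value >= low_cut else 1
    # Lower values indicate better biological condition.
    if value < low_cut:
        return 5
    return 3 if value <= high_cut else 1


def multimetric_index(site, criteria=CRITERIA):
    """Sum the metric scores for one site into a single index value."""
    return sum(score_metric(site[name], *criteria[name]) for name in criteria)


# Two made-up sites at opposite ends of a human-influence gradient.
minimally_disturbed = {"taxa_richness": 28, "percent_tolerant": 8}
heavily_disturbed = {"taxa_richness": 6, "percent_tolerant": 70}

print(multimetric_index(minimally_disturbed))  # 10
print(multimetric_index(heavily_disturbed))    # 2
```

Each metric is scored in the direction it is expected to respond, so a metric that rises with degradation and one that falls with it both contribute to the same scale before they are summed.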

Numerous studies have documented the responses of biological attributes to
human disturbance. Across diverse taxa and regions, similar biological attributes
(e.g., taxa richness and the relative abundance of tolerant organisms) work consis-
tently and reliably as indicators of resource condition. Across regions and agencies,
consensus is emerging about the appropriate level of sampling needed to assess the
condition of biological systems accurately.

Successful multimetric efforts combine biological insight with appropriate sam-
pling design and statistical analyses. Knowledge of regional biology and natural
history—not solely a search for statistical relationships and significance—should
drive both sampling design and analytical protocol. Rigorously done, multimetric
biological monitoring and assessment offer a systematic approach that measures
many dimensions of complex ecological systems—dimensions that have too long
been ignored.

Of course, challenges remain. Biologists must extend what they have learned about
monitoring in fresh water to other environments and other taxonomic groups. On
the other hand, they must avoid gathering more data than are necessary for better
management decisions. Like any scientific method, biological monitoring generates
many new and interesting questions, methods, and refinements. But scientists and
managers need to realize that they already know enough about how biological
systems  respond to human influence—enough to make decisions that will stop the
decline of water resources. Managers and policymakers must use what they already
know.

Most important,  however, biologists must communicate ecological condition more
effectively outside biological circles. In  a society that does not value the integrity
of aquatic or other natural systems, no amount of scientific nagging will improve
resource policy. Biologists and all who understand both the value and the declin-
ing health of natural life-supporting systems must share their knowledge widely. In
the end, only an informed public can put adequate pressure on decision makers to
change business as usual. The precision and clarity of information gathered
through multimetric biological monitoring and assessment can help this process.

This report discusses the state of US running waters and the value of multimetric
biological indexes in assessing and communicating their condition. The extent to
which better decisions are made—decisions that maintain or restore aquatic systems
as opposed to the status quo—will be a measure of these indexes' success.

-------
The report is built around numbered statements, each representing a step in the
logical development of multimetric biological indexes or a bone of contention in
the assessment literature. The table of contents offers a document map, from
trends in aquatic resource condition (Section I), to changing scientific and societal
views of water resources (Section II), to how and why multimetric indexes work
(Section III), through the most common pitfalls associated with use of multimetric
indexes (Section IV). In Section V, we quote others' objections to multimetric
indexes and try to show that those assertions are at best misleading and often false.
Section VI is a call to arms.

Who will find this document useful? Several audiences, we hope: an agency
scientist trying to decide whether and how to use fish or invertebrates in monitor-
ing work; a researcher designing a study to detect human effects; and a state agency
responding to EPA's mandate to develop biocriteria. This is a handbook for those
working to protect the nation's waters; we hope it will become dog-eared and dirty.

-------
                                        SECTION I
AQUATIC RESOURCES ARE  STILL DECLINING

                  This first section sets forth the condition of aquatic ecosystems,
                         to inform those unfamiliar with them of the damage
                   that has already occurred and to arm those already concerned
                           with specific details on the extent of degradation.

-------
          PREMISE  1
 WATER RESOURCES  ARE  LOSING  THEIR
 BIOLOGICAL  COMPONENTS
As recently as a century ago, a commercial freshwater fishery second only to the one in the Columbia River flourished in the Illinois River; now it is gone.

Despite strong legal mandates and massive expenditures, signs of continuing
degradation in biological systems are pervasive—in individual rivers (Karr et al.
1985b), US states (Moyle and Williams 1990; Jenkins and Burkhead 1994), North
America (Williams et al. 1989; Frissell  1993; Wilcove and Bean 1994), and around
the globe (Hughes and Noss  1992; Moyle and Leidy 1992; Williams and Neves
1992; Allan and Flecker 1993; Zakaria-Ismail 1994; McAllister et al. 1997). Aquatic
systems have been impaired,  and they  continue to deteriorate as a result of human
society's actions (Table 1).

Devastation is obvious, even  to the untrained eye. River channels have been
destroyed by dams; straightening and dredging; and  water withdrawal for irriga-
tion, industrial, and domestic uses. Degradation of living systems inevitably
follows. Biological diversity in aquatic  habitats is threatened; aquatic biotas  have
become homogenized through local extinction, the introduction of alien species,
and declining genetic diversity (Moyle and Williams 1990; Whittier et al. 1997a).
Who remembers that a freshwater fishery existed in the Illinois River in the  early
1900s that was  second only to the Columbia's?  Now that fishery is gone, and the
one in the Columbia is nearly gone. Since the turn of the twentieth century,
commercial fish harvests in US rivers have fallen by more than 80%.

Even where commercial and sport catches of fish and shellfish are permitted, one
can no longer assume that  those harvests are safe to eat (USEPA 1996a). In 1996,
fish consumption advisories were imposed on 5% of the river kilometers in the US
(www.epa.gov/OST/fishadvice/index.html). The number of fish advisories is rising.
The 2193 advisories reported  for US water bodies in  1996 represent an increase of
26% over 1995 and a 72% increase over 1993. For millennia, humans have de-
pended on the  harvest from terrestrial  (including agricultural), marine, and fresh-
water systems for food. But the supply of freshwater  foods has collapsed. How
would society respond if agricultural productivity declined by more than 80% or if
eating "farm-fresh" products threatened our health? Why then do we continue to
ignore such changes in  "wild-caught" aquatic resources?
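
The percentage increases quoted above for fish advisories imply approximate earlier counts, which the short calculation below recovers. The back-calculated values are not reported in the text; they are estimates derived from the 1996 count and the rounded percentages, and the function name is introduced only for this illustration.

```python
def implied_baseline(current_count, percent_increase):
    """Return the earlier count implied by a later count and a percentage increase."""
    return current_count / (1 + percent_increase / 100)


advisories_1996 = 2193  # count stated in the text

# Approximate advisory counts implied for the earlier years.
approx_1995 = round(implied_baseline(advisories_1996, 26))
approx_1993 = round(implied_baseline(advisories_1996, 72))

print(approx_1995)  # 1740
print(approx_1993)  # 1275
```

Because the published percentages are rounded, the true 1995 and 1993 counts may differ from these estimates by a few advisories.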

Current programs are not protecting rivers or their biological resources because the
Clean Water Act has been implemented as if crystal-clear distilled water running
down concrete  conduits were the act's  ultimate goal  (Karr 1995b). For example, at
least $473 billion was spent to build, operate, and administer water-pollution
control facilities between 1970 and 1989 (Water Quality 2000  1991). Still, the

-------
                decline continues while money is wasted on inadequate or inappropriate treatment
                facilities (Karr et al. 1985a; see Box 1, page 12).

                In many respects, society has been lulled into believing that our individual and
                collective interests in water resources have been protected by national, state, and
                local laws and regulations. We have had faith in the outdated "prior appropriation
                doctrine" of American frontier water law, the implementation of the Clean Water
                Act, or "wild and scenic river" designation when, in fact, our habits as a society
                and the way we have implemented our laws have progressively compromised our
                fresh waters.
TABLE 1. Examples from United States rivers of degradation in aquatic biota (from Karr 1995a).


Proportionately more aquatic organisms are classed as rare to extinct (34% of fish, 75% of unionid mussels, and
65% of crayfish) than terrestrial organisms (from 11% to 14% of birds, mammals, and reptiles; Master 1990).

Twenty percent of native fishes of the western United States are extinct or endangered (Miller et al. 1989; Williams
and Miller 1990).

Thirty-two percent of fish native to the Colorado River are extinct, endangered,  or threatened (Carlson and Muth
1989).

In the Pacific Northwest, 214 native, naturally spawning Pacific salmon and steelhead stocks face "a high or mod-
erate risk of extinction, or are of special concern" (Nehlsen et al. 1991).

Since 1910, naturally spawning salmon runs in the Columbia River have declined by more than 95% (Ebel et al.
1989).

During the twentieth century, the commercial fish harvests of major US rivers have declined by more  than 80%
(Missouri and Delaware Rivers), more than 95% (Columbia River), and 100% (Illinois River) (Karr et al. 1985b; Ebel
et al. 1989; Hesse et al. 1989; Patrick 1992).

Since 1933, 20% of molluscs in the Tennessee River system have been lost (Williams et al. 1993); 46% of the
remaining molluscs are endangered or seriously depleted throughout their range.

In 1910, more than 2600 commercial mussel fishers operated on the Illinois River; virtually none remain today.

Since 1850, many fish species have declined or disappeared from rivers in the United States (Maumee River, Ohio:
45% [Karr et al. 1985b]; Illinois River, Illinois: 67% [Karr et al. 1985b]; California rivers: 67% [Moyle and Williams
1990]). This decline, combined with the introduction of alien species, has homogenized the aquatic biota of many
regions (an average of 28% of the fish species in major drainages of Virginia are introduced; Jenkins and Burkhead
1994).

Thirty-eight states reported fish consumption closures, restrictions, or advisories in 1985; 47 states did in  1991. The
2193 advisories reported for US water bodies in 1996 represent a 26% increase over 1995  and a 72% increase
over 1993 (USEPA 1996a). Contaminated fish pose health threats to wildlife and people (Colborn et al. 1990, 1996),
including intergenerational consequences  such as impaired cognitive functioning in infants  born to women who
consume contaminated fish (Jacobson et al. 1990; Jacobson and Jacobson 1996).

Riparian corridors have been decimated (Swift 1984).

Native minnows have declined while alien minnows have spread throughout northeastern US lakes (Whittier et al.
1997a).
                                                                                                7

-------
PREMISE 2
"CLEAN WATER" IS NOT ENOUGH

Pollution is anything that alters the physical, chemical, biological, or radiological integrity of water

Society relies on freshwater systems for drinking water, food, commerce, and
recreation as well as waste removal, decomposition, and aesthetics. Yet in the
Pacific Northwest alone, recent declines in salmon runs and closures of sport and
commercial fisheries have led to economic losses of nearly $1 billion and 60,000
jobs per year (Pacific Rivers Council 1995). Retaining the biological elements of
freshwater systems (populations, species, genes), as well as the processes (mutation,
selection, fish migration,  biogeochemical cycles) sustaining these elements, is
crucial to retaining the goods and services fresh waters provide (Table 2).

Waters and fish travel over vast distances in space and time. The integrity of water
resources thus depends on processes spanning many spatial and temporal scales:
from cellular mechanisms producing local and regional adaptations to a massive
transfer of energy and materials as fish migrate between the open ocean and
mountain streams. Protecting the elements and processes society values therefore
demands a broad, all-encompassing view—one  not yet encouraged by conventional
management strategies and terminology.

In particular, the word pollution must take on broader connotations. In conventional
usage and agency jargon, pollution refers to chemical contamination. A more
appropriate, yet little-used, definition that more accurately represents what is at
stake as water resources decline is the definition given by the 1987 reauthorization
of the Clean Water Act: pollution is any "manmade or man-induced  alteration of
the physical, chemical, biological, or radiological integrity of water." Under this
definition, humans degrade or "pollute" by many actions, from irrigation with-
drawals to overharvesting, not merely by releasing chemical contaminants.
8

-------
TABLE 2. Elements, processes, and potential indicators of biological condition for six levels of organization
within three biological categories. Indicators from multiple levels are needed to assess the condition of a site
comprehensively. (Modified from Angermeier and Karr 1994.)

Biological   Elements
category     (levels)     Processes                        Indicators

Taxonomic    Species      Range expansion or contraction   Range size
                          Extinction                       Number of populations
                          Evolution                        Population size
                                                           Isolating mechanisms

Genetic      Gene         Mutation                         Number of alleles
                          Recombination                    Degree of linkage
                          Selection                        Inbreeding or outbreeding depression

Ecological   Individual   Health                           Disease
                                                           Deformities
                                                           Individual size and condition index
                                                           Growth rates

             Population   Changes in abundance             Age or size structure
                          Colonization or extinction       Dispersal behavior
                          Evolution                        Presence of particular taxa (e.g., intolerants)
                          Migration                        Gene flow

             Assemblage   Competitive exclusion            Number of species
                          Predation or parasitism          Dominance
                          Energy flow                      Number of trophic links
                          Nutrient cycling                 Spiraling length

             Landscape    Disturbance                      Fragmentation
                          Succession                       Percentage of disturbed land
                          Soil formation                   Number of communities
                          Metapopulation dynamics          Sources and sinks
                                                           Number and character of metapopulations

-------
PREMISE 3
BIOLOGICAL MONITORING IS ESSENTIAL TO PROTECT BIOLOGICAL RESOURCES

The status of living systems provides the most direct and most effective measure of the
"integrity of water," the resource on which all life depends

Despite their faith in and reliance on technology, humans are part of the biologi-
cal world. Human life depends on biological systems for food, air, water, climate
control, waste assimilation, and many other essential goods and services (Costanza
et al. 1997; Daily 1997; Pimentel et al. 1997). Biological endpoints are therefore
fundamental. Furthermore, the status of living systems provides the most direct
and most effective measure of the "integrity of water," the resource on which all
life depends.

Degradation of water resources begins in upland areas of a watershed, or catchment,
as human activity alters plant cover. These changes, combined with alteration of
stream corridors, in turn modify the  quality of water flowing in the stream channel
as well as the  structure and  dynamics of those channels and their adjacent riparian
environments. Biological evaluations  focus on living systems, not on narrow chemi-
cal criteria, as integrators of such riverine change. In contrast, exclusive reliance on
chemical criteria assumes that water resource declines have been caused only by
chemical contamination. Yet physical habitat loss and fragmentation, invasion by
alien species,  excessive water withdrawals, and overharvest by sport and commer-
cial fishers do as much if not more harm than chemicals in many waters.

Even measured according to chemical criteria, water resources throughout the
United States are significantly degraded (USEPA 1992a, 1995; see Table 1, page 7).
In 1990 the states reported that 998 water bodies had fish advisories in effect, and
50 water bodies  had fishing bans imposed. More than one-third of river miles
assessed by chemical criteria did not  fully support the "designated uses" defined
under the Clean Water Act. More than half of assessed lakes, 98% of assessed Great
Lakes shore miles, and 44% of assessed estuary area did not fully support desig-
nated uses (USEPA 1992a).

By September 1994, the number of fish consumption advisories had grown to
1531 (USEPA 1995). Seven states (Maine, Massachusetts, Michigan, Missouri, New
Jersey, New York, and Florida) issued advisories against eating fish from state waters
in 1994. Fish consumption  advisories increased again in 1995, by 12%; the adviso-
ries covered 46 chemical pollutants (including mercury, PCBs, chlordane, dioxin,
and DDT) and multiple fish species.  Forty-seven states had advisories, representing
15% of the nation's total lake acres and 4% of total river miles. All the Great Lakes
were under advisories. For the first time, EPA reported that 10 million Americans
were at risk of exposure to microbial  contaminants such as  Cryptosporidium because
  10

-------
their drinking water was not adequately filtered (USEPA 1996c). For the same year,
the Washington State Department of Ecology reported that "80 percent of the
hundreds of river and stream segments and half of the lakes tested by the state
don't measure up to water quality standards" (Seattle Times 1996). Outbreaks of
Pfiesteria piscicida, the "cell from hell," have killed millions of fish and were also
implicated in human illnesses from Maryland to North Carolina in 1997 (Hager
and Reibstein 1997).

Alarming as they are, these assessments still underestimate the magnitude of real
damage to our waters because they generally do not incorporate biological criteria
or indicators. When compared with strictly chemical assessments, those using
biological criteria typically double the proportion of stream miles that violate state
or federal water quality standards or designated uses (Yoder 1991b; Yoder and
Rankin 1995a). The reasons for this result are simple. Although humans degrade
aquatic systems in numerous ways, chemical measures focus on only one way.
Some states  rely on chemical surrogates to infer whether a water body supports the
"designated  use" of aquatic life; others measure biological condition directly (Davis
et al. 1996).  Only 25% of 392,353 evaluated river miles were judged impaired
according to chemical standards intended to assess aquatic life. But when biologi-
cal condition was assessed directly, 50% of the 64,790 miles evaluated in the US
showed impairment.
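
The arithmetic behind these percentages can be made explicit. The short sketch below uses only the
figures quoted in the paragraph above; the variable names are illustrative and are not taken from
any agency data set.

```python
# Figures quoted in the text; variable names are illustrative only.
chemical_miles_evaluated = 392_353
chemical_fraction_impaired = 0.25
biological_miles_evaluated = 64_790
biological_fraction_impaired = 0.50

# Miles judged impaired under each kind of assessment.
chemical_miles_impaired = chemical_miles_evaluated * chemical_fraction_impaired
biological_miles_impaired = biological_miles_evaluated * biological_fraction_impaired

# How much higher the impairment rate is when biology is assessed directly.
rate_ratio = biological_fraction_impaired / chemical_fraction_impaired

# How much of the chemically evaluated mileage was also covered biologically.
coverage_ratio = biological_miles_evaluated / chemical_miles_evaluated

print(f"Chemical standards: {chemical_miles_impaired:,.0f} of {chemical_miles_evaluated:,} miles impaired")
print(f"Biological assessment: {biological_miles_impaired:,.0f} of {biological_miles_evaluated:,} miles impaired")
print(f"Impairment rate ratio (biological / chemical): {rate_ratio:.1f}")
print(f"Biological mileage as a share of chemical mileage: {coverage_ratio:.2f}")
```

The rate doubles, but the two percentages rest on different sets of river miles, and far fewer miles
were assessed biologically, so the comparison is indicative rather than exact.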

Perhaps more important, these numbers suggest that we know more about the
condition of water resources than we actually do. Sadly, despite massive expenditures
and numerous efforts to report water resource trends, "Congress and the current
administration are short on information about the true state of the nation's water
quality and the factors affecting it" (Knopman and Smith 1993). Because assess-
ments emphasize chemical contamination rather than biological endpoints, state
and federal administrators are not well equipped to communicate to the public
either the status of or trends in resource condition. Further, because few miles of
rivers are actually assessed, and because those that are assessed are not sampled
appropriately (e.g., using probability-based surveys; Larsen 1995; Olsen et al., in
press), percentages of impaired river miles are extremely rough at best.

In short, despite explicit mandates to collect data to evaluate the condition of the
nation's water resources, and the existence of a program intended to provide an
inventory under section 305(b) of the Clean Water Act, no program has yet been
designed or carried out  to accomplish that goal (Karr 1991; Knopman and Smith
1993).

The strength of these observations is clearly an important force driving recent state
actions; 42 states now use multimetric assessments of biological condition, and 6
states are developing them. Only 3 states were using multimetric biological ap-
proaches in 1989 (Davis et al.  1996), and none had them in 1981 when the first
multimetric  IBI paper was published. Indeed, hardly any effective biological
monitoring programs were in place before 1981. Most  states still have a long way
to go toward collecting  and using biological data to improve the management of
their waters.

                                                                        11

-------
                Because they focus on living organisms—whose very existence represents the
                integration of conditions around them—biological evaluations can diagnose
                chemical, physical, and biological impacts as well as their cumulative effects. They
                can serve many kinds of environmental and regulatory programs when coupled
                with single-chemical toxicity testing in the laboratory. Furthermore, they are cost
                effective. Chemical evaluations, in contrast, often underestimate overall degrada-
                tion, and overreliance on chemical criteria can misdirect cleanup efforts, wasting
                both money and natural resources (Box 1). Because they focus on what is at risk
                (biological systems), biological monitoring and assessment are less likely to
                underprotect aquatic systems or waste resources.

                Biological evaluations and criteria can redirect management programs toward
                restoring and maintaining "the chemical, physical, and biological integrity of the
                nation's waters." Assessments of species richness, species composition, relative
                abundances of species or groups of species, and feeding relationships among
                resident organisms are the most direct measure of whether a water body meets the
                Clean Water Act's  biological standards for aquatic life (Karr 1993). To protect water
                resources, many states should track the biological condition of water bodies the
                way society tracks local and  national economies,  personal health, and the chemical
                quality of drinking water.
BOX 1. Narrow use of chemical criteria can damage water resources and waste money.

    Chlorine is added to effluent from secondary sewage treatment because it kills microorganisms that cause
    human disease. But the effects of this chlorine continue after effluent is released into streams or other
    water bodies (Colborn and Clement 1992; Jacobson and Jacobson 1996). In three Illinois streams receiving
    water from a secondary treatment plant, an IBI based on fish declined significantly as residual chlorine
    concentration increased (Karr et al. 1985a; Figure 1); the biological effects of chlorine appeared in fish
    assemblages downstream of the effluent inflow (Figure 2). With chlorination (treatment phase I), IBIs were
    much lower downstream than upstream. In contrast, when chlorine was removed from secondary effluent
    (phase II), downstream and upstream IBIs did not differ significantly. In other words, chlorine added to
    wastewater effluent continues to kill organisms after the chlorinated water is released. Furthermore,
    biological condition did not improve when expensive tertiary denitrification was added (phase III), even though
    this treatment brought the plant into compliance with chemical water quality standards for nitrates.
    This example illustrates three important points. First, biological integrity may be damaged by too narrow a
    focus on chemical criteria. Second, such a narrow focus can waste money. Third, many current manage-
    ment approaches and policies are, in essence, untested hypotheses. Managers do not always make the
    effort to look for broader effects or to test beyond their initial criteria.
    Had managers looked for biological effects or reconsidered the levels of chlorine in the effluent instead of
    assuming that their chlorine criteria worked, the biota of these Illinois streams might have suffered less.
12

-------
FIGURE 1. In three streams in east-central Illinois, the fish indexes of biological integrity (IBIs)
declined significantly in response to wastewater inflow from secondary treatment with chlorination.
Fish IBIs declined as residual chlorine concentration increased (from Karr et al. 1985a).

[Figure: fish IBI (y-axis, ticks at 20 to 40, with condition classes fair, poor, and very poor) plotted
against chlorine in mg/l (x-axis, 0.5 to 2.0) for Saline Branch, Copper Slough, and Kaskaskia Branch.]
FIGURE 2. Fish IBIs for stations upstream and downstream of wastewater treatment effluent in
Copper Slough, east-central Illinois. Phase I: standard secondary treatment; phase II: secondary
treatment without chlorination; phase III: secondary treatment without chlorination but with
tertiary denitrification. With chlorination (phase I), IBIs were much lower downstream than
upstream of effluent inflow. Upstream and downstream sites did not differ statistically after
removal of chlorine from secondary effluent (phase II). The addition of expensive tertiary
denitrification (phase III) did not increase IBIs (from Karr et al. 1985a).

[Figure: fish IBI (y-axis, ticks at 28 to 44, with condition classes fair and poor) at upstream and
downstream stations for each treatment phase (x-axis). Significance labels for the
upstream-downstream comparisons: p < 0.001, n.s., n.s.]
                                                                                            13

-------
                                         SECTION
CHANGING WATERS AND CHANGING VIEWS LED TO BIOLOGICAL MONITORING

             Biological monitoring is evolving as societal and scientific thinking changes.
                   Growth in knowledge about aquatic systems—and humans' effects
          on them—has provided a substantial body of theory as well as empirical evidence
                 about how to measure their condition. Multimetric biological indexes
                    synthesize and integrate that expanding knowledge. The goals of
                biomonitoring include improving risk assessment and risk management.
                                                        15

-------
PREMISE 4
CHANGING WATERS AND A CHANGING SOCIETY CALL FOR BETTER ASSESSMENT

Chemical criteria based on dose-response curves for single toxicants cannot account for
interactions of multiple chemicals or for other human impacts

At the end of the nineteenth century, discharge of raw sewage was a major cause
of water resource degradation in the United States. Concern about the effects of
excessive organic effluent on the potability of water, the spread of disease, prob-
lems with navigation, and the status of fish populations led Congress to pass the
1899 Rivers and Harbors Act, also called the Refuse Act. The act's goal was to
clean up human wastes and oil pollution in navigable waterways. Protection of the
nation's waters thus came  under the jurisdiction of the US Army Corps of Engineers.

During the World War years and afterward, legal, regulatory, and management
programs concentrated  on controlling organic effluent and a growing array of toxic
chemicals; declining populations of sport and commercial fishes and shellfish were
also targeted. Technology  to clean water and to make more fish became the watch-
word. Point sources of pollution were dealt with by wastewater treatment using
"best available" or "best practical" technologies (Ward and Loftis 1989). Although
the dust bowl of the 1930s prompted an early effort to protect water resources
from nonpoint pollution  due to soil erosion, soil and water conservation contin-
ued to take a back seat to  augmenting agricultural production (Thompson 1995).
From the mid-1800s, hatcheries were built and operated because, like agriculture,
they promised control over production and, thus, unlimited numbers of fish
through technology. Technological arrogance fostered a proliferation of hatcheries
(Meffe 1992),  masking the degradation of river environments that was happening
at the same time; yet some of that very degradation was caused by the hatcheries
themselves (White et al. 1995; Bottom 1997). It was not until the 1970s, encouraged
by growing public environmental awareness and passage of the 1972 Water Pollution
Control Act Amendments (PL 92-500), that management strategies began to
recognize waters as a whole and the need to protect "the integrity of water"
(Ballentine and Guarraia 1977).

The past 30 years have brought important  gains in the science of water resources.
Societal values, too, have  been changing as human-imposed stresses have become
more complex and pervasive. In addition to sewage and toxic chemicals, the
nation's freshwater environments have suffered from physical destruction, increas-
ing water withdrawals, the spread of alien species, and overharvest by sport and
commercial fishers. The names and language of water laws—Refuse Act, Soil and
Water Conservation Act, Water Pollution Control Act, Clean Water Act—reflect
  16

-------
both society's changing values and attempts to cope with widening problems. Field
monitoring and assessment programs have been evolving as well (Karr 1998).

Early water quality specialists developed biotic indexes sensitive to organic effluent
and sedimentation (Kolkwitz and Marsson 1908); this focus continues in modern
biotic indexes (Chutter 1972; Hilsenhoff 1982; Armitage et al. 1983; Lenat 1988,
1993). The most common approach involves ranking taxa (typically genus or
species) on a scale from 1 (pollution intolerant) to 10 (pollution tolerant). For each
sample site, an average pollution tolerance level (the biotic index value) is ex-
pressed as an abundance-weighted mean to facilitate comparisons among sites.
Some classifications use only three levels; others (Armitage et al. 1983) classify to
family, calculate an average score per taxon, and reverse the scale (1 is pollution
tolerant, and 10 is pollution intolerant).
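
The abundance-weighted calculation described above can be sketched in a few lines. This is a minimal
illustration, not any agency's protocol: the function name, the taxon names, the tolerance values, and
the counts are all hypothetical, and the scale follows the 1 (intolerant) to 10 (tolerant) convention.

```python
def biotic_index(counts, tolerance):
    """Abundance-weighted mean tolerance for one sample site.

    counts: mapping of taxon -> number of individuals collected
    tolerance: mapping of taxon -> tolerance value, 1 (intolerant) to 10 (tolerant)
    Taxa without an assigned tolerance value are left out of the average.
    """
    scored = {taxon: n for taxon, n in counts.items() if taxon in tolerance and n > 0}
    total = sum(scored.values())
    if total == 0:
        raise ValueError("no individuals with an assigned tolerance value")
    return sum(n * tolerance[taxon] for taxon, n in scored.items()) / total


# Hypothetical tolerance values and counts, invented for illustration only.
tolerance = {"stonefly": 1, "mayfly": 3, "caddisfly": 4, "midge": 8, "worm": 10}
upstream = {"stonefly": 30, "mayfly": 40, "caddisfly": 20, "midge": 10}
downstream = {"mayfly": 5, "caddisfly": 10, "midge": 45, "worm": 40}

print(round(biotic_index(upstream, tolerance), 2))
print(round(biotic_index(downstream, tolerance), 2))
```

A higher value means the sample is dominated by tolerant taxa. Because the index is a single average
of tolerance to organic pollution, it says little about other kinds of human influence.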

As toxic chemicals became more widespread, water managers recognized the
limitations of early biotic indexes and began to screen for the biological effects of
synthetic as well as "natural" chemicals. Biologists experimentally exposed fish or
invertebrates (typically fathead minnow, Pimephales promelas, or Daphnia spp.) to
contaminants and documented the responses, creating dose-response curves for
individual chemical toxicants. For a given body size, they observed, very low doses
of a contaminant might lead to little or no response (e.g., few or no deaths among
a group of individuals). As dose increased, response increased. The goal was to
establish quantitative chemical criteria to use in water quality standards. These
criteria were presumed to protect human health or populations of desirable aquatic
species by keeping toxic compounds below harmful concentrations—the dilution
solution to pollution.
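
The shape of such a curve can be sketched as follows. The log-logistic form used here is an
assumption chosen for illustration; the text does not specify a functional form, and the LC50 and
slope values are hypothetical rather than measured for any toxicant or species.

```python
def fraction_responding(dose, lc50, slope):
    """Fraction of test organisms responding at a given dose.

    Uses a log-logistic curve (an assumed form): the response is near zero at
    very low doses, reaches 0.5 at dose == lc50, and approaches 1 at high doses.
    """
    if dose <= 0:
        return 0.0
    return 1.0 / (1.0 + (lc50 / dose) ** slope)


lc50 = 2.0   # hypothetical concentration (mg/l) at which half the animals respond
slope = 4.0  # hypothetical steepness of the curve

for dose in (0.25, 0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"{dose:5.2f} mg/l -> {fraction_responding(dose, lc50, slope):.3f}")
```

A criterion read from a curve like this one applies to a single toxicant and a single test species
under laboratory conditions.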

But just as biotic indexes measure primarily the effects of organic pollution,
chemical criteria based on toxicology apply only to chemical contamination and a
small number of contaminants. Toxicological studies, the foundation for chemical
criteria, typically examine the tolerances of only a few species, usually the most
tolerant taxa, leading to underestimates  of the effect of a contaminant  in the field.
Chemical criteria based on dose-response  curves for single toxicants cannot ac-
count for synergistic or other interactions  of multiple chemicals in the environ-
ment. And criteria for one species (e.g., fathead minnow) do not ensure protection
for others not tested. Moreover, an exclusive focus on toxicology ignores other
human impacts on aquatic biota, such as altered physical habitat or flow.

Much early work to detect the influence of human actions on biological systems
emphasized abundance (or population size or density) of indicator taxa or guilds,
often species with commodity value or thought to be keystone species. But popula-
tion size is notoriously variable even under natural conditions, especially in
comparison with physical or chemical water quality criteria. Data from long-term
studies of marine invertebrates, for example (Osenberg et al. 1994),  show that
temporal variability for population attributes (e.g., densities of organisms) is about
three times as high as for individual attributes (e.g., individual size or body condi-
tion), and nearly four times as high as chemical-physical attributes (e.g., water
                                                                        17

-------
              temperature, sediment quality, water-column characteristics). Such high variances
              make analyses of population size problematic for general monitoring studies.

              Efforts to overcome that problem have led to increasingly sophisticated sampling
              designs. Early field assessment protocols commonly used "control-impact" (CI) or
              "before-after" (BA) sampling designs. In CI designs, abundance is measured at
              unaffected control sites and at sites affected by an impact; in BA designs, abun-
              dance is measured before and then again after the event of interest. Despite the
              strength of these designs, the high variance of population size makes it difficult to
              distinguish between changes caused by the event and variation that would occur
              naturally in time or space.

              Population size changes  in complex ways in response to changes in multiple
              natural factors such as food abundance, disease, predators, rainfall, temperature,
              and demographic lags. Increasingly complex designs (e.g., before-after-control-impact, or BACI) were developed
              (Green 1979) to separate the effect of human activity from other sources of
              variability in space or time. But BACI confounds interactions between time and
              location; knowing the magnitude of the interaction and whether the effects are
              additive is critical to interpreting biological patterns—for example, understanding
              whether different streams respond in different ways to the same human activity.
              Still other statistical approaches were proposed to deal with such challenges:
              "before-after-control-impact paired series" (BACIPS; Stewart-Oaten et al. 1986)
              and "beyond BACI" (Underwood 1991, 1994). [See Schmitt and Osenberg (1996)
              for an excellent review of these sampling designs and their use.]
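
The logic of the simplest BACI comparison can be sketched as a difference of differences. The
abundances below are hypothetical, and the function is an illustration of the idea rather than the
analysis used in any of the studies cited above.

```python
from statistics import mean


def baci_contrast(control_before, control_after, impact_before, impact_after):
    """Change at the impact site minus change at the control site."""
    impact_change = mean(impact_after) - mean(impact_before)
    control_change = mean(control_after) - mean(control_before)
    return impact_change - control_change


# Hypothetical abundances (individuals per sample), invented for illustration.
control_before = [52, 47, 55, 50]
control_after = [44, 40, 46, 42]
impact_before = [49, 53, 48, 50]
impact_after = [30, 27, 33, 30]

before_after_only = mean(impact_after) - mean(impact_before)   # BA design
control_impact_only = mean(impact_after) - mean(control_after)  # CI design
baci = baci_contrast(control_before, control_after, impact_before, impact_after)

print(before_after_only, control_impact_only, baci)
```

In this invented example, part of the decline at the impact site also occurs at the control site, so
the BACI contrast is smaller than the before-after difference. With one control and one impact site,
the contrast still cannot tell a human effect apart from two sites that naturally change differently
over time.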

              Use of these designs for biological monitoring raises a number of difficulties.
              First, even though assigning samples to treatment and control groups may ac-
              count for local spatial variability in doses of contaminants, contaminant dispersal
              from a point source may be better detected by a more sensitive "gradient design"
              (Ellis and Schneider 1997)—that is, one that ensures sampling from sites across a
              range of contaminant levels. When many human activities interact, influencing
              biological systems in complex ways across landscapes, sampling across sites
              subject to various degrees of influence will often be more appropriate for discern-
              ing and diagnosing the complex biological consequences of that influence (see
              also Premise 29, page 107).

              A second, and the primary, difficulty posed by these designs is the initial  decision
              to focus narrowly on something as variable in nature as  population size. In studies
              to determine environmental impacts, the interaction between variability and the
              size of the potential impact (effect size) must also be taken into account because
              that interaction affects statistical power (Osenberg et al.  1994). High variation in
              population size, even in  natural environments, interacts in complex ways with
              changes in abundances stimulated by human actions. Thus it can be very difficult
              to detect and interpret the effects of human actions even with these advanced
              designs. The minimum level of sampling effort may often exceed the planning,
              sampling, and analytical capability of many monitoring situations. By shifting the
              focus to better-behaved indicators, such as those used in a proper multimetric
18

-------
index (changes in taxa richness, loss of sensitive taxa, or changes in trophic organi-
zation), it is possible to use these designs, often in their less complex versions.
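
The link between variability, effect size, and sampling effort can be made concrete with a standard
textbook approximation for comparing two means. The formula is not taken from the studies cited
above, and the sketch assumes that variability is expressed as a standard deviation in the same units
as the effect size; both are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(sd, effect_size, alpha=0.05, power=0.80):
    """Approximate samples needed per group to detect a difference in means.

    Normal approximation for a two-sided, two-sample comparison:
    n = 2 * (z_alpha + z_power)^2 * (sd / effect_size)^2
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * (sd / effect_size) ** 2)


# Hypothetical comparison: the same effect size, with one attribute three
# times as variable as the other.
low_variability = samples_per_group(sd=1.0, effect_size=1.0)
high_variability = samples_per_group(sd=3.0, effect_size=1.0)

print(low_variability, high_variability)
```

Under these assumptions, tripling the standard deviation raises the required sample size roughly
ninefold, which is one way to see why highly variable attributes such as population size can push
sampling effort beyond what a monitoring program can support.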

When ecological research embraced species diversity as a central theme in the
1960s, diversity indexes (e.g., Shannon, Morisita, Simpson) came into vogue for
evaluating biological communities (Pielou 1975; Magurran 1988). Not long after-
ward, however, Hurlbert (1971) raised concerns about the statistical properties of
these indexes; others later questioned their biological properties (Wolda 1981;
Fausch et al. 1990). Diversity indexes are influenced by both number of taxa and
their relative abundances; some are more sensitive to rare taxa, others to abundant
taxa. Different diversity indexes may therefore produce a different rank order for
the same series of sites, making it impossible to compare the sites' biological
condition. Diversity indexes are often inconsistent because they respond erratically
to changes in assemblages; thus they can lead to ambiguous interpretations (Wolda
1981; Boyle et al. 1990).
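
A small numerical example shows how two common indexes can rank the same pair of sites in opposite
order. The counts are hypothetical, and the Simpson index is computed in its Gini-Simpson form
(one minus the sum of squared proportions); other forms of the index exist.

```python
from math import log


def shannon(counts):
    """Shannon index, H = -sum(p_i * ln(p_i))."""
    total = sum(counts)
    return -sum((n / total) * log(n / total) for n in counts if n > 0)


def simpson(counts):
    """Gini-Simpson index, 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1 - sum((n / total) ** 2 for n in counts)


# Hypothetical samples of 100 individuals each, invented for illustration.
site_a = [25, 25, 25, 25]   # four taxa, evenly represented
site_b = [55] + [5] * 9     # one dominant taxon plus nine rare taxa

for name, counts in (("A", site_a), ("B", site_b)):
    print(name, round(shannon(counts), 3), round(simpson(counts), 3))
```

Here the Shannon index, which gives more weight to rare taxa, ranks site B as more diverse, while
the Simpson index, which gives more weight to abundant taxa, ranks site A as more diverse.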

Measures of diversity were nevertheless  advocated for water management (Wilhm
and Dorris 1968). Florida established water quality standards based on a diversity
index, although the state is now moving away from them in favor of
multimetric evaluations (Barbour et al. 1996a). The index of well-being (IwB), a
sum of diversity indexes based on number of individuals and biomass (Gammon
1976; Gammon et al. 1981), has not been widely used, except by the Ohio Envi-
ronmental Protection Agency (Ohio EPA) (Yoder and Rankin 1995a). Few scientists
and managers  recommend these diversity indexes today, largely because ap-
proaches are available that are both biologically more comprehensive and statisti-
cally more reliable. Unfortunately, however, diversity indexes have left a negative
semantic legacy that surfaces whenever the word index appears (e.g., Suter 1993).
Recognizing the  need for approaches better suited to considering the many at-
tributes of biological condition simultaneously, many water resource managers
have turned to two approaches with very different strengths: multivariate statistical
analysis and multimetric indexes.  Combinations of the two are especially useful (e.g.,
Hughes et al.,  in press). Multivariate analysis was developed to facilitate detection
of pattern, not impact assessment. Multimetric indexes were designed specifically
to document which components of biological systems provide strong signals about
the impact of humans and to use those signals to define biological condition and
diagnose the factors likely to have caused degradation when it is detected.

Multivariate statistics "treat multivariate data as a whole, summarizing them and
revealing their structure" (Gauch 1982: 1). Many researchers advocate multivariate
analyses of field assessment data because these approaches are assumed  to be the
most objective. (Premise 32, page 112, discusses some drawbacks and misuses of
multivariate analyses.) Indeed, multivariate statistics are useful when an exploratory
survey  is called for (Karr and James 1975; Larsen et al. 1986; Whittier et al. 1988);
they can help uncover patterns when only a little is known about the underlying
natural history of a place or biota (Gerritsen 1995). But because scientists know a
great deal about streams and landscapes, invertebrates and fish, and the effects of
humans on those places and organisms, we advocate actively and explicitly applying
               that knowledge in choosing which biological attributes to monitor and which
               analytical tools to use—the approach taken in developing multimetric indexes.

               Multimetric indexes build on the strengths of earlier monitoring approaches, and
               they rely on empirical knowledge of how a wide spectrum of biological attributes
               respond to varying degrees of human influence. Multimetric indexes avoid flawed
               or ambiguous indicators, such as diversity indexes or population size, and they are
               wider in scope (Davis 1995; Simon and Lyons 1995).

               The biological attributes ultimately incorporated into a multimetric index (called
               metrics) are chosen because they reflect specific and predictable responses of
               organisms to changes in landscape condition; they are sensitive to a range of
               factors (physical, chemical, and biological) that stress biological systems; and they
               are relatively easy to measure and interpret. Multimetric indexes are generally
               dominated by metrics of taxa richness (number of taxa) because structural changes
               in aquatic systems, such as  shifts among taxa, generally occur at lower levels of
               stress than do changes in ecosystem processes (Karr et al. 1986; Schindler 1987,
               1990; Ford 1989; Howarth  1991; Karr 1991). The best multimetric indexes explic-
               itly embrace several attributes of the sampled assemblage, including taxa richness,
               indicator taxa or guilds (e.g., tolerant and intolerant groups), health of individual
               organisms, and assessment  of processes (e.g., as reflected by trophic structure or
               reproductive biology).

               A multimetric index comprising such metrics integrates information from ecosys-
               tem, community, population, and individual levels (see Premise 12, page 47; Karr
               1991; Barbour et al. 1995; Gerritsen 1995), and it can be expressed in numbers and
               words. Most important, such a multimetric index clearly discriminates  biological
               "signal"—including the effects of human activities—from the "noise" of natural
               variability.

               Standard samples of invertebrates from one of the best streams in rural King
               County, Washington, for example, contained 27 taxa of invertebrates; similar
               samples from an urban stream in Seattle contained only 7 taxa. The rural stream
               had 18 taxa  of mayflies, stoneflies, and caddisflies; the urban stream had no
               stoneflies or caddisflies and only 1 mayfly taxon. The rural stream had 3 long-lived
               taxa and 4 intolerant taxa, but the urban stream had none. The rural stream had 17
               taxa of "clinger" insects; the urban none. No predatory taxa were present in the
               urban creek, but 12% of individuals from the rural creek were predators. When
               these and other metrics were combined in an index based on invertebrates, the
               resulting benthic index of biological integrity (B-IBI) provided a numeric descrip-
               tion of the condition, or health, of the streams. The B-IBI for the rural stream in
               King County was 44 (from  a maximum index of 50); that for the urban stream, 10
               (from a minimum index of 10).
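
How metric values might be combined into a single index can be sketched as follows. This is only an illustration: the scoring rule (each metric scored 1, 3, or 5 and the scores summed), the cutoffs, and the choice of six metrics are assumptions made for the example, not the published B-IBI procedure. Only the metric values come from the two streams described above. With six metrics the possible range is 6 to 30, so the totals are not comparable with the B-IBI values of 44 and 10.

```python
# Hypothetical cutoffs per metric: (minimum value for score 3, minimum value for score 5).
# All six metrics are assumed to increase as biological condition improves.
CUTOFFS = {
    "total_taxa": (14, 24),
    "mayfly_stonefly_caddisfly_taxa": (6, 12),
    "long_lived_taxa": (1, 3),
    "intolerant_taxa": (1, 3),
    "clinger_taxa": (6, 12),
    "percent_predators": (5, 10),
}

def score_metric(value, low, high):
    """Convert a raw metric value to a score of 1, 3, or 5."""
    if value >= high:
        return 5
    if value >= low:
        return 3
    return 1

def index_score(site):
    """Sum the metric scores for one site."""
    return sum(score_metric(site[name], low, high)
               for name, (low, high) in CUTOFFS.items())

# Metric values taken from the rural and urban streams described in the text.
rural = {"total_taxa": 27, "mayfly_stonefly_caddisfly_taxa": 18,
         "long_lived_taxa": 3, "intolerant_taxa": 4,
         "clinger_taxa": 17, "percent_predators": 12}
urban = {"total_taxa": 7, "mayfly_stonefly_caddisfly_taxa": 1,
         "long_lived_taxa": 0, "intolerant_taxa": 0,
         "clinger_taxa": 0, "percent_predators": 0}

print("rural:", index_score(rural))
print("urban:", index_score(urban))
```

Under these assumed cutoffs the rural stream receives the highest possible total and the urban stream the lowest, mirroring the contrast the index is meant to summarize.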
20

-------
PREMISE 5
BIOLOGICAL MONITORING DETECTS BIOLOGICAL CHANGES CAUSED BY HUMANS

The goal of biological monitoring is to measure and evaluate the consequences of human actions on biological systems

The aim of any resource evaluation program is to distinguish relevant biological
signal from noise caused by natural spatial and temporal variation (Osenberg et al.
1994). In ambient biological monitoring of water resources, signals of biological
condition are measured and used to predict impacts of human activity on aquatic
systems. But not all attributes of these systems, or all analytical methods, provide
signals that reveal patterns relevant for managing water resources. In choosing
biological indicators, one should focus on attributes that are sensitive to the
underlying condition of interest (e.g., human influence) but insensitive to extrane-
ous conditions (Patil 1991; Murtaugh 1996). Periodically over the past century,
water managers and researchers have failed to choose from the many variables,
disturbances, endpoints, and processes those attributes that give the clearest signals
of human impact. The nation's waters declined as a result.

This confusion is not difficult to explain. Like all scientists, biologists in the field
are always eager to explore new places, catalogue new habitats and their inhabit-
ants, and apply new principles in the name of "baseline research." Most scientists
want to know more, rarely questioning the desirability of more research or basic
research. But confusing the perspectives and goals of basic and applied ecological
research has been a major reason that biological monitoring programs have  seldom
halted resource degradation. Compounding this problem, water managers have
long sought surrogate measures of human impact or resource condition. The
search for surrogates was often too narrow, and much that humans do to degrade
resources was overlooked.

Basic-research ecologists try to understand natural variation over space and time
within communities of organisms, along with the evolutionary and thermo-
dynamic principles that mediate this variation. For the most part, they work in
natural systems subject to relatively little influence from human activities. They ask
questions such as, Why does the number of species vary from place to place on the
surface of the Earth? What regulates the size of animal and plant populations?
How do global biogeochemical cycles regulate ecosystem structure and function?

Like taxonomists trying to distinguish, identify, and name species,  basic-research
ecologists try to distinguish unique habitat types, communities, or ecosystems, and
to classify them. They have long interpreted differences among environments in
terms of changing species composition or abundances and energy flow or nutrient
               cycling; they focus on differences attributable to natural biogeographic and evolu-
               tionary processes. They identify indicator species—for example, to diagnose a
               particular type of natural community, biome, or environment [e.g., sand or gravel
               heathlands, alluvial grasslands, or tall- or short-grass prairie; see Dufrene and
               Legendre (1997) for a recent example].

               Applied ecologists, too, seek to recognize natural variation but also to study how
               natural systems respond to human activities—in particular, how humans can
               manipulate natural systems to achieve certain ends. For the past several decades,
               most applied ecologists have focused on the "engineering" side of their discipline.
               They have concentrated on producing higher crop yields; increasing the water
               supply or purifying contaminated water; and enhancing fish productivity by
               building hatcheries and removing woody debris from streams or, later, putting it
               back in. They raised waterfowl harvests by building wetlands or engineering mitiga-
               tion for wetland losses. Many applied ecologists back the intentional introduction
               of alien  taxa, as in fish-stocking programs or "natural" pest control programs, often
               with substantial negative effects (Simberloff et al. 1997). Even conservation biolo-
               gists have narrowly aimed to protect endangered species—another rare commod-
               ity—instead of seeking to protect life-support systems more broadly. Today, despite
               public awareness and legislation prompted by visibly degraded biological systems,
               applied ecology generally still pursues its commodity goals.

               Thus for many years, public environmental policy has been driven primarily by
               application of narrow physical and chemical principles. When biological targets
               entered the policy arena, they were narrow (cleaner water, hardier corn, more
               ducks). This problem persists despite clear mandates such as the Clean Water Act's
               call for protecting biological integrity, despite the rhetoric of "ecosystem manage-
               ment" that has surfaced in the past decade. Part of the problem lies squarely with
               ecologists trained to use narrow commodities as their indicators; the solution will
               come from applying ecology to find better, broader indicators of biological condi-
               tion.

               A broader applied ecology should, for example, seek to  discover the consequences
               of activities such as grazing, logging, and urbanization on particular places. Ap-
               plied ecologists should ask, What do we measure to understand responses to
               human activities? What methods and measurements best isolate the signal pro-
               duced by human impact from noise? How do we interpret the results? What are
               the likely consequences of changes we see? How do we tell citizens, policymakers,
               and political leaders what is happening and how to fix it?

               The first step toward effective biological monitoring and assessment, then, is to
               realize that the goal is to measure and evaluate the consequences of human actions
               on biological systems. The relevant measurement endpoint for biological monitor-
               ing is biological condition; detecting change in that endpoint, comparing the
               change with a minimally disturbed baseline condition, identifying the causes of the
               change, and communicating these findings to policymakers and citizens are the
               tasks of biological monitoring programs (Figure 3). Keeping this framework in
               mind can help keep biological monitoring programs on track.
22

-------
Physical, chemical, evolutionary, and biogeographic processes interact to produce

   Physical and Geographic Context
      Location
      Geological substrate
      Climate, Elevation
      Stream size, Gradient

   Biological Integrity
      Taxa richness
      Species composition
      Tolerance, Intolerance
      Adaptive strategies (ecology, behavior, morphology)

The baseline without human disturbance is influenced by

   Human Activities
      Land use (cities, farms, logging, grazing, dams)
      Effluent discharge
      Water withdrawal
      Discharge from reservoirs
      Sport and commercial fisheries
      Introduction of aliens

which alter biogeochemical processes to influence one or more of

   Five Factors
      Flow regime
      Physical habitat structure
      Water quality
      Energy source
      Biological interactions

thereby altering

   Geophysical Condition
      Land cover, Erosion rates
      Slope stability, Evapotranspiration
      Surface permeability
      Runoff amount and timing
      Groundwater recharge

   Biological Condition
      Taxa richness
      Taxonomic composition
      Individual health
      Ecological processes
      Evolutionary processes

Unacceptable divergence of Biological Condition from Biological Integrity stimulates

   Environmental Policies
      Regulations, Incentives
      Management
      Conservation, Restoration

to protect

   Aquatic Life

FIGURE 3. Relationships among kinds of variables to be measured, understood, and
evaluated through biological monitoring. Biological condition is the endpoint of primary
concern.
                                                                                  23

-------
               Both basic-research ecologists and applied ecologists concern themselves with the
               top tier of Figure 3, the baseline condition minimally disturbed by human actions.
               Biogeochemical processes give rise to a geophysical setting and a biota defined as
               possessing biological integrity (Frey 1977; Karr and Dudley 1981; Angermeier and
               Karr 1994).  Natural geophysical settings and biotas unaltered by humans in histori-
               cal times constitute the main focus for basic-research ecologists, but understanding
               and documenting these processes and components also provide the foundation for
               biological monitoring studies.

               In essence, understanding baseline, or reference, conditions in different places is
               analogous to veterinarians' learning what indicates health in different kinds of
               animals. "Healthy" for a lizard is not the same as "healthy" for a dog. Likewise, the
               expected quantitative values for indicators of ecological health in small midwestern
               North American streams are not the same as for Pacific Northwest streams or for
               large South American rivers, even though many of the same biological attributes
               may work as indicators in  those disparate situations (e.g., taxa richness, relative
               abundance of predators). Knowing geophysical setting and undisturbed biological
               condition—in other words, knowing what produces and constitutes biological
               integrity—must underpin any biological monitoring effort.

               Through time, geophysical setting and biological integrity are altered by natural
               events, so that over evolutionary time, biogeochemical processes may change the
               conditions defining regional integrity. But the rapid growth of human populations
               and their technologies during just the past 200 years has been a new, radically
               different force for change.  Regional biological systems are no longer what they
               were 300 years ago, and the change threatens the very supply of goods and services
               humans depend on (Hannah et al. 1994; Costanza et al. 1997; Daily 1997;
               Pimentel et  al. 1997). As a result, the historical dichotomy between basic ecology
               and applied ecology must give way to a seamless "new ecology." Whereas basic
               ecology has tried to understand the natural world and applied ecology has largely
               concentrated on extracting human commodities from that natural world, a new
               ecology must protect local, regional, and global life-support systems.

               This more integrative ecology shares its emphasis on human activities with the
               commodity branches of applied ecology. But whereas commodity ecology sought
               to increase human influence and to use that influence to maximize harvests of wild
               and cultivated species, a better applied ecology seeks to understand the biological
               consequences  of human activity and to minimize the harmful ones. Biological
               monitoring  measures the condition of biological systems in the broadest sense and
               thus lies at the heart of this new ecology. The sampling and analytical tools used in
               monitoring  must focus on detecting and understanding human-caused change.

               Conceptual frameworks, protocols, and procedures designed for basic research on
               near-pristine systems are not necessarily those  that will identify change caused by
               human activity. Among 20 randomly selected  sites sampled for benthic insects in a
               cold-water stream, for example,  some of the variation in the samples will have
               natural causes (e.g., among microhabitats within a stream reach or among  reaches
               of streams of different sizes). Sampling itself—the use of a method, the choice of a
method, or the efficiency of different field teams—also produces variation (see
Premise 19, page 80). But the most important variation comes from differences in
human activity among segments of a watershed. Understanding that variation and
communicating its consequences to all members of the human community is
perhaps the greatest challenge of modern ecology.

In sum, biological monitoring studies must measure present biological condition
and compare that condition with what would be expected in the absence of
humans. Biological monitoring documents any divergences from expected baseline
conditions and associates divergences with knowledge of human activities in the
area; the goal is to find out why conditions have moved away from integrity.  In
biological monitoring, then, managers need to evaluate five kinds of information
all together:  (1) present and (2) expected biology, (3) present and (4) expected
geophysical setting, and (5) the activities of humans likely to alter both the biology
and the geophysical setting. Managers, policymakers, and society at large can use
this information to decide if measured alterations in biological condition are
acceptable and set policies accordingly. In other words, by identifying the biologi-
cal and ecological consequences of human actions, biological monitoring provides
an essential foundation for assessing ecological risks.
                                                                      25

-------
PREMISE 6
ECOLOGICAL RISK ASSESSMENT AND RISK MANAGEMENT DEPEND ON BIOLOGICAL MONITORING

Tracking biological endpoints, rather than pollution-control dollars, will improve our ability to reduce ecological risks

Over the past decade or so, risk assessment has focused on human health effects,
usually the effects of single toxic substances from single sources. As practiced since
a 1983 report of the National Research Council (NRC 1983; see also NRC 1994,
1996; Risk Commission 1997), human health risk assessment asks five questions
(van Belle et al. 1996), each with its own technical jargon:

•   Is there a problem? (hazard identification)
•   What is the nature of the problem? (dose-response assessment)
•   How many people and what environmental areas are affected? (exposure
    assessment)
•   How can we summarize and explain the problem? (risk characterization)
•   What can we do about it? (risk management)

Responding to growing interest in ecological risk assessment specifically, EPA in
1992 issued its own Framework for Ecological Risk Assessment (see also USEPA
1994a,b), which was superseded in  September 1996 by the Proposed Guidelines for
Ecological Risk Assessment (USEPA 1996d). In these documents, EPA modifies the
human health assessment terminology and process to evaluate "the likelihood that
adverse ecological effects may occur or are  occurring as a result of exposure to one
or more stressors" (USEPA 1996d). The agency's framework asks questions very
similar to those asked in human health risk assessment:

•   Is there a problem? (problem formulation)
•   What is the nature of the problem? (characterization of exposure and
    characterization of ecological effects)
•   How can we summarize and explain the problem? (risk characterization)
•   What can we do about it? (risk management)

Unfortunately, most risk assessments still take a single-source-single-effect ap-
proach, ignoring the multiplicity of stressors to which individual humans,  as well
as ecological systems, are subjected. In the most recent attempt to shift govern-
ment thinking in this area, a Presidential/Congressional Commission on Risk
issued its Framework for Environmental Health Risk Management (Risk Commission
1997), which simultaneously  enlarges the context for "risk" to include ecological as
well as public health risks and emphasizes the importance of involving the public
throughout the risk assessment and management processes.
 26

-------
 The commission's report recommends six risk management steps. It explicitly
 broadens the definition of risk management to include ecological risks. It urges
 testing of "real-world mixtures" of pollutants, such as urban smog or pesticides left
 on vegetables. The report recommends looking at whole watersheds and "airsheds,"
 and it makes specific recommendations to Congress and to regulatory agencies
 including EPA. It also builds public involvement into all six steps, especially in
 defining a problem and putting it into public health context. The report advises
 risk managers and citizens to: (1) define the problem and put it in context; (2)
 analyze the risks associated with the problem in context; (3) examine options for
 addressing the risks; (4) make decisions about which options to implement; (5) act
 to implement the decisions; and (6) evaluate the action's results. A primary chal-
 lenge is to translate these goals into assessment and protection of ecological health.

 All these attempts to reinvent risk management allow, even encourage, managers to
 broaden the questions, context, and tools they apply to the nation's environmental
 challenges. And although all seem to agree that risk assessment and risk manage-
 ment must be iterative—that conclusions  must be  revisited and the process re-
 peated so that decisions may be adjusted  on the basis of new information—debate
 still rages over which risks to assess and the "right" way to assess and manage them.
 Still, we argue that, whatever the framework for assessing ecological risks, each step
 must be informed by data from biological monitoring. For accurate, relevant
 ecological risk assessment, the measurement endpoints (what is measured)  and the
 assessment endpoints (the ecological goods and services society seeks to protect)
 must be explicitly biological. Biological monitoring provides better information
 about actual environmental quality than chemical and physical measures alone
 (Keeler and McLemore 1996) because biological attributes are one step closer to
 the factors that constitute environmental quality. Microeconomic models based on
 chemical levels  as surrogates of environmental quality may be useful  for approxi-
 mating the costs of pollution control, for example, but they are limited in  their
 ability to explain the ecological, explicitly biological, damage caused  by that
 pollution (Keeler and McLemore 1996). Economic models incorporating biological
 measures, on the other hand, can potentially contribute more accurately to a
whole-system approach to resource management.

To see the benefits of biological monitoring, consider the waste implicit in deci-
 sions to invest increasing amounts of money in wastewater treatment in North
America while paying little attention to whether water resource condition was
improving or to the influence of other limiting factors. The nonlinear nature of
ecological systems makes conventional wastewater treatment very inefficient
(Statzner et al. 1997). Eventually, environmental improvement per dollar spent
declines because other factors begin to limit overall environmental quality. But
judicious use of biological monitoring can track living components of environmen-
tal quality directly, thereby improving management efficiency. Tracking environ-
mental quality through biological monitoring can guide investment strategies
toward those that would yield the greatest benefit  per dollar spent. In short, the use
of biological endpoints, rather than pollution control dollars or numbers of
               permits issued, will improve decision making, achieve greater environmental
               improvements for each increment of expenditure, and improve our ability to
reduce ecological risks.
               Ecological risk assessment will miss its mark if it simply folds ecological terminol-
               ogy into a new pollution control or human health-focused process. To protect
               biological resources, we must measure, monitor, and interpret biological signals.
               For if we do not understand how biological systems respond—and the conse-
               quences of those responses  for human well-being—we cannot understand what is at
               risk or make wise choices.
28

-------
                                  SECTION
       MULTIMETRIC  INDEXES CONVEY
              BIOLOGICAL INFORMATION
Five activities are central to making multimetric biological indexes effective:
1.  Classifying environments to define homogeneous sets within or across
   ecoregions (e.g., streams, lakes, or wetlands; large or small streams;
   warm-water or cold-water lakes; high- or low-gradient streams).
2.  Selecting measurable attributes that provide reliable and relevant signals
   about the biological effects of human activities.
3.  Developing sampling protocols and designs that ensure that those
   biological attributes are measured accurately and precisely.
4.  Devising analytical procedures to extract and understand relevant
   patterns in those data.
5.  Communicating the results to citizens and policymakers so that all
   concerned communities can contribute to environmental policy.
                                                    29

-------
PREMISE 7
UNDERSTANDING BIOLOGICAL RESPONSES REQUIRES MEASURING ACROSS DEGREES OF HUMAN INFLUENCE

Sampling from sites with different intensities and types of human activity is essential to detect and understand biological responses to human influence

Our ability to protect biological resources depends on our ability to identify and
predict the effects of human actions on biological systems, especially our ability to
distinguish between natural and human-induced variability in biological condition.
Thus, even though measures taken at places with little or no human influence (e.g.,
only from "reference" sites) may tell us something about natural variability from
place to place and through time in undisturbed sites, they cannot tell us anything
about which biological attributes merit watching for signs of human-caused degra-
dation. To find these signs, sampling and analysis should focus on multiple sites
within similar environments across the range from minimal to severe human
disturbance.

One could choose sampling sites that represent different intensities of only one
human activity, such as logging, grazing, or chemical pollution. It would then be
possible to evaluate biological responses to a changing "dose" of a single human
influence. Though rare, such a study opportunity could help identify the biological
response signature characteristic of that activity (Karr et al.  1986; Yoder and Rankin
1995b). Knowledge of such biological response signatures would give researchers a
diagnostic tool for watersheds influenced by unknown or multiple human activi-
ties. In reality, however, it is virtually impossible to find regions influenced by only
a single human activity.

In most circumstances, diverse human activities interact (e.g., during urbanization)
to affect conditions in watersheds, water bodies, or stream reaches. In such cases,
sites can be  grouped and placed on a gradient according to activities and their
effects: industrial effluent is more toxic than domestic effluent, for example, and
both pose more-serious threats than low dams, weirs, or levees (Figure 4). Removal
of natural riparian corridors damages streams; conversion to a partially herbaceous
riparian area is less damaging than conversion to riprap. Streams grouped this way
show striking and systematic differences in biological condition across the gradient
(Figure 5).

In other circumstances, a single variable can capture and integrate multiple sources
of influence: the percentage of impervious area in a watershed summarizes the
multiple effects of paving, building, and other consequences  of urbanization, as in
a recent study of Puget Sound lowland streams (Figure 6). This measure provides a
simple surrogate of human influence that works well across a gradient of impervi-
ous area from near 0% to 60%. Unfortunately, it is less useful in understanding the
  30

-------
FIGURE 4. A priori classification system for ranking Japanese streams according to
intensity of human influence (Rossano 1995). Sites were assigned to one of 21 possible
categories based on amount and type of effluent, proximity of dams and other structural
alterations, and type of riparian vegetation. Even without quantitative measures from
each site, this approach allowed sites to be ranked across a range of human influence.
1.  Classify sites according to the amount of effluent present (little to much).
2.  Within each of these broad classes, rank sites according to the types
   of effluent (agricultural/domestic; raw sewage/industrial).
3.  Within each of these classes, rank sites according to proximity of dams,
   weirs, and levees (far to near).
4.  Within each of these classes, rank sites according to riparian vegetation.
[Figure 4 axis: rank of human influence, from low to high (up to 21).]
               often large variation in biological condition at some percentages of imperviousness
               (e.g., 3% to 8%; see Figure 6, page 33). Finding the differences in human activity
               that can explain these biological differences requires more detailed information
               from the watersheds.
               Alternatively, sites may be grouped into qualitative disturbance categories. In a
               study of recreational influence on stream biology in the northern Rocky Moun-
               tains (Figure 7), Patterson (1996) classed sites into four categories associated with
               different levels of human activity: (1) little or no human influence in the water-
               shed; (2) light recreational use (hiking, backpacking); (3) heavy recreational use
               (major trailheads, camping areas);  and (4) urbanization, grazing, agriculture, or
               wastewater discharge. Patterson demonstrated that light recreational activity did
               not substantially reduce B-IBIs in  comparison with undisturbed watersheds,
               whereas heavy  recreational use did significantly alter the benthic invertebrates but
               not as much as more-intensive uses such as urbanization or agriculture.
               A similar approach was used in a study of biological response to chemical pollu-
               tion on three continents: South America, Africa, and southeastern Asia (Thorne
               and Williams 1997). The authors classified sites according to a pollution gradient
               based on  the integration of six measures of chemical pollution. Biological condi-
               tion, as indicated by metrics such as total taxa richness (families) and mayfly,


-------
FIGURE 5. Benthic indexes of biological integrity (B-IBIs) for 115 Japanese streams
(from Rossano 1995). The top panel shows B-IBIs calculated from half of the 115-stream
data set (circles), which was used to initially select and test metrics for use in the
B-IBI. The middle panel shows B-IBI values calculated from the second half of the data
set (pluses); the metrics and scoring criteria used for these data were the metrics and
criteria developed from the first half. In the bottom panel, all 115 B-IBIs are plotted
together; the indexes from both sets correspond closely, ranking the streams comparably
according to intensity of land use from low to high. The range of human influence
against which the B-IBIs are plotted comes from the classification scheme shown in
Figure 4.
[Figure 5: three scatterplot panels; y-axis, B-IBI; x-axis, human influence (low to high).]
               stonefly, and caddisfly richness, clearly went down as pollution went up. The
               biological responses in the three tropical regions were similar; they parallel
               patterns documented in temperate regions even though the faunas are all very
               different.

               Data collected over a number of years at the same site(s) can also reveal biological
               responses as human activities change during that period. Regardless of how one

-------
FIGURE 6. Benthic index of biological integrity (B-IBI) plotted against the percentage
of impervious area for urban, suburban, and rural stream sites in the Puget Sound
lowlands, Washington (from Kleindl 1995). Though B-IBI clearly decreases with increasing
impervious area, this plot offers no insight into B-IBI differences among sites with
similar percentages of impervious area, especially at low percentages (3% to 17%).
[Figure 6: scatterplot; y-axis, B-IBI; x-axis, impervious area (%).]
FIGURE 7. Benthic indexes of biological integrity (B-IBIs) for stream sites in Grand
Teton National Park, Wyoming (from Patterson 1996). Before B-IBIs were determined, these
sites had been placed into four categories of human influence: little or no human
activity (NHA), light recreational use (LR), heavy recreational use (HR), and other (O).
B-IBIs revealed no significant difference between sites with little or no human activity
and those having low recreational use. But B-IBIs were significantly lower for sites
used heavily for recreation and lower still for sites subjected to other uses:
specifically, urbanization, grazing, agriculture, and wastewater discharge.
[Figure 7: y-axis, B-IBI; x-axis, human influence category (NHA, LR, HR, O).]
               represents a range of human influence among study sites, sampling from sites with
               different intensities and types of human activity is essential to detect and under-
               stand biological responses to human influence. Thus the goal is to compare like
               environments with like environments—to isolate and understand patterns caused
               by human activities at sites within those like environments.

               Too many existing studies confound patterns of human influence with natural
               variation over time at undisturbed sites or across different environment types. In
               other situations, researchers combine measures of human activity, the physical and
               chemical manifestations of those activities, and their biological consequences in a
               heterogeneous analysis with ambiguous results. Those analyses may even include
               measures of physical environment such as stream gradient. When this range of
               factors (different human influences  on different environment types) is lumped in a

-------
              single analysis, it becomes almost impossible to understand causes or consequences
              of natural versus human events.
              Consider the following analogy. Three experiments are designed: one to under-
              stand variation in natural biological systems as a function of stream size; another
              to distinguish the effects of pesticide runoff on streams of first, fourth, and sixth
              order; and a third to define the effects of pesticides on plants and insects. Analyz-
              ing samples from the first series of stream sites would tell one about biological
              responses to changing stream size; samples from the second series, about changing
              human influence as a function of stream size; and samples from the third would
              distinguish responses of different taxa. It would be silly to mix the data from the
              three studies in a single statistical analysis, without regard to which study the
              individual samples came from. Yet by using analytical  procedures that mix the
              effects of natural and human-induced variation (in a single correlation  matrix, for
              example), researchers are essentially doing just that: they are ignoring the context
              of the different components  of their data, making it difficult to distinguish the
              biological signs relevant to resource management or protection. They then con-
              found the sources of the variation they see, even if their initial sampling setup
              would have permitted discrimination among those sources. Univariate and multi-
              variate analyses all too often  suffer from this flaw.
              Sampling only from "reference" sites creates a similar problem because it does not
              provide a way to document which biological attributes vary with human influence
              (see Premise 30, page 108). Careful  thought about which variables best summarize
              human influence and the relationships among those variables should be the
              foundation of monitoring protocols. Creating opportunities to discover biological
              patterns in relation to human activity must be foremost.

-------
          PREMISE 8
  ONLY A FEW BIOLOGICAL ATTRIBUTES PROVIDE RELIABLE
  SIGNALS ABOUT BIOLOGICAL CONDITION
Successful biological monitoring depends on demonstrating that an attribute changes
consistently and quantitatively across a gradient of human influence
The success of biological monitoring programs and their use to define and enforce
biological criteria is tied to identifying biological attributes that provide reliable
signals about resource condition (Table 3). Choosing from the profusion of bio-
logical attributes (Figure 8) that could be measured is a winnowing process, in
which each attribute is essentially a hypothesis to be tested for its merit as a metric.
One accepts or rejects the hypothesis by asking, Does this attribute vary systemati-
cally through a range of human influence? When metrics are selected and orga-
nized  systematically, an effective multimetric index can emerge  from the chaos
displayed in Figure 8.
Knowledge of natural  history and familiarity with ecological principles and theory
guide  the definition of attributes and the prediction of their behavior under
varying human influences. But successful biological monitoring depends most on
demonstrating that an attribute has a reliable empirical relationship—a consistent
quantitative change—across a range, or gradient, of human influence. Unfortu-
nately, this crucial step is  often omitted in many local, regional, and national
efforts to develop multimetric indexes (e.g., RBP I, II, III; Plafkin et al. 1989).
The study of populations has dominated much ecological research for decades (see
section II), so researchers  still assume that population size (expressed as abundance
or density) provides a reliable signal about water resource condition. But because
species abundances vary so much as a result  of natural environmental variation,
even in pristine areas,  population size is rarely a reliable indicator of human
influence (see Premise 13, page 51, and Premise 24, page 95). Large numbers of
samples (>25) were required, for example, to detect small (<20%) differences in
number of fish per 100 m² of stream surface area in small South Carolina streams
(Paller 1995b). Other attributes—such as taxa richness (number of unique taxa in a
sample, including rare ones) and percentages of individuals belonging to tolerant
taxa—have, in contrast, been found to vary consistently and systematically with
human influence. Such attributes, when graphed, give rise to analogues of the
toxicological dose-response curve—which we call ecological dose-response curves—
where the y-axis represents the measured attribute and the x-axis measures of
human influence  (Figure 9).
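
The two attributes named above, taxa richness and the percentage of individuals
belonging to tolerant taxa, are simple to compute from a sample. The Python sketch below
is illustrative only: the sample, the family names, and especially the assignment of
taxa to the "tolerant" set are hypothetical, and a real program would take tolerance
designations from regional natural history.

```python
from collections import Counter

def taxa_richness(sample):
    """Number of unique taxa in a sample, rare ones included."""
    return len(set(sample))

def percent_tolerant(sample, tolerant_taxa):
    """Percentage of individuals in the sample that belong to tolerant taxa."""
    if not sample:
        raise ValueError("sample is empty")
    counts = Counter(sample)
    tolerant = sum(n for taxon, n in counts.items() if taxon in tolerant_taxa)
    return 100.0 * tolerant / len(sample)

# Hypothetical sample: one entry per individual, identified to family.
sample = ["Baetidae"] * 5 + ["Chironomidae"] * 12 + ["Perlidae"] * 2 + ["Tubificidae"]
tolerant_taxa = {"Chironomidae", "Tubificidae"}  # illustrative assignment only

richness = taxa_richness(sample)
pct_tol = percent_tolerant(sample, tolerant_taxa)
print(richness, pct_tol)  # 4 65.0
```

Plotting either value on the y-axis against a measure of human influence on the x-axis,
one point per site, gives the kind of curve described in the preceding paragraph.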
Ecological dose-response  curves differ in one critical aspect from toxicological
dose-response curves. Toxicological dose-response curves usually measure
biological response in  relation to dose of a single compound. Ecological dose-


-------
TABLE 3. Terms used in defining biological condition.

Term                    Definition
Attribute               Measurable component of a biological system
Metric                  Attribute empirically shown to change in value along a gradient of
                        human influence
Multimetric index       A number that integrates several biological metrics to indicate a
                        site's condition
Biological monitoring   Sampling the biota of a place (e.g., a stream, a woodlot, or a wetland)
Biological assessment   Using samples of living organisms to evaluate the condition or health
                        of places
Biological criteria     Under the Clean Water Act, numerical values or verbal (narrative)
                        standards that define a desired biological condition for a water
                        body; legally enforceable
FIGURE 8. Almost any biological attribute can be measured, but only certain attributes
provide reliable signals of biological condition and therefore merit integration into a
multimetric index.
[Figure 8: illustration titled "What to measure?"]

-------
FIGURE 9. Average taxa richnesses of Plecoptera and sediment-intolerant taxa plotted
against grazing intensity for seven stream sites in the John Day Basin, Oregon, in 1988.
Site A had fewer taxa than expected because although cattle were excluded, intense
grazing upstream had affected the site's biota.
[Figure 9: two panels; y-axes, taxa richness of Plecoptera and of sediment-intolerant
taxa; x-axis, grazing intensity.]
               response curves measure a biological response to the cumulative ecological expo-
               sure, or "dose," of all events and human activities within a watershed, expressed in
               terms such as percentage of area logged, grazing intensity, or percentage of impervi-
               ous area in a watershed. The number of unique native fish taxa in a midwestern
               stream sampled today, for example, reflects the cumulative effects of human
               influence up to the present.

-------
          PREMISE 9
  SIMPLE GRAPHS REVEAL BIOLOGICAL RESPONSES
  TO HUMAN INFLUENCES
Graphs force us to confront the unexpected
"Often the most effective way to describe, explore, and summarize a set of num-
bers (even a very large set) is to look at pictures of those numbers... . [O]f all
methods for analyzing and communicating statistical information, well-designed
data graphics are usually the simplest and at the same time the most powerful"
(Tufte 1983: 9; see also Tufte 1990, 1997). Tufte's message is nowhere more impor-
tant than in the  display, interpretation, and communication of biological monitor-
ing data.
Graphs reveal the biological responses important for evaluating metrics more
clearly than do strictly statistical tools because they exploit "the value of graphs in
forcing the unexpected" (Mosteller and Tukey 1977) on whoever looks at them,
including researchers, who must then confront and explain the pattern in those
graphs. For samples where the relationship between human influence and biologi-
cal response is strong, statistics and graphs agree  (Figure 10). In other cases, mean-
ingful biological  patterns can be lost by excessive dependence on the outcome of
menu-driven statistical tests. Statistical correlation can miss an important relation-
ship if the x-variable (e.g., percentage of area logged) is measured with low preci-
sion or if additional factors beyond those plotted on the x-axis influence metric
values but are not included in the statistical analysis.

In Figure 11, for example, we plot two different aspects of biological condition
against one measure of human influence,  such as the percentage of upstream water-
shed that has been logged. Sites are assigned a plus or minus based on that mea-
sure and other aspects of human influence that are visible and documented but
not plotted on the  same graph. In forested watersheds, these other aspects might
include whether roads were near or far from the stream channel, time since logging,
or traits unique to particular watersheds. In some cases such interacting factors may
have degraded biological condition (roads near the stream channel would exacer-
bate logging's effects), or they may have allowed  good conditions to persist (roads
on distant ridges have less effect on streams). The distribution of pluses and
minuses in Figure 11 illustrates the fallacy of assuming that a biological metric says
nothing about condition because it does not correlate strongly with a single surro-
gate of that condition, as researchers perennially assume when a biological measure
does not correlate with some measure of chemical pollution. Rather, we should
conclude that the surrogate is  not capturing significant components of human
influence and look more closely for the biological explanations behind  the data.
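
The contrast drawn above, between how strongly a metric correlates with one surrogate
and how well it separates minimally disturbed from severely degraded sites, can be
sketched in a few lines of Python. Everything in this example is hypothetical: the six
sites, the metric values, and the "minimal"/"severe" labels (standing in for
classifications based on documented influences not plotted on the x-axis). The overlap
check is only one simple way to express "distinguishes between the two groups."

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranges_overlap(group_a, group_b):
    """True if the [min, max] ranges of the two groups share any value."""
    return min(group_a) <= max(group_b) and min(group_b) <= max(group_a)

def split_by_label(values, labels):
    """Separate metric values into minimally disturbed and severely degraded groups."""
    minimal = [v for v, lab in zip(values, labels) if lab == "minimal"]
    severe = [v for v, lab in zip(values, labels) if lab == "severe"]
    return minimal, severe

# Hypothetical data: x is a single surrogate (e.g., percentage of area logged).
x = [1, 2, 3, 4, 5, 6]
labels = ["minimal", "severe", "minimal", "severe", "minimal", "severe"]
metric_a = [10, 9, 8, 7, 6, 5]   # tracks the surrogate closely
metric_b = [9, 3, 9, 2, 8, 2]    # tracks the site labels instead

r_a, r_b = pearson_r(x, metric_a), pearson_r(x, metric_b)
overlap_a = ranges_overlap(*split_by_label(metric_a, labels))  # True
overlap_b = ranges_overlap(*split_by_label(metric_b, labels))  # False
print(round(r_a, 2), round(r_b, 2), overlap_a, overlap_b)
```

In this constructed case metric A has the stronger correlation with the surrogate, yet
only metric B cleanly separates the two groups of sites.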

-------
FIGURE 10. Example of two hypothetical metrics plotted against a gradient of human
influence. Here statistical correlation and graphical analysis agree: metric A is a good
indicator, and metric B is not. (Compare Figure 11.)
[Figure 10: metric values plotted against human influence.]
FIGURE 11. Hypothetical relationships between human influence and candidate biological
metrics (from Fore et al. 1996). Metric A is more strongly correlated with resource
condition (or r² is higher if using regression) than Metric B, initially suggesting that
it is a better metric. But comparing the metrics' ability to distinguish between
minimally disturbed sites (denoted by plus signs) and severely degraded sites (open
boxes; ranges noted by arrows) shows that Metric B is actually a more effective measure
of biological condition despite its smaller statistical correlation. (Compare Figure 10.)
[Figure 11: two panels, Metric A (r = 0.69) and Metric B (r = 0.42), each plotted against
human influence, with the ranges of plus-sign and open-box sites marked.]

-------
               Not all aspects of human influence can be easily captured in a single graph or
               statistical test. When a number of variables influence condition, a single plot
               against one dimension of human influence will not tell the whole story (Figure 12);
               neither will a single statistical test. Graphs force one to search for insights that rote
               application of statistical tests cannot discover.

               Weak statistical correlation can also miss important biological patterns when the
               distribution of the data (e.g., Figure 13) does not lend itself to tests based on
               standard correlation techniques that detect only linear relationships. Yet nonlinear
               patterns are common in field data (Figure 14). Consider the plots in Figure 15, for
               example. The points fall into a wedge-shaped distribution, whose scatter shows little
               or no statistical significance but can be interpreted biologically. The upper bound
               of each plot is the hypotenuse of a right triangle (the maximum species richness
               line)  that defines the number of species expected in minimally disturbed streams as
               a function of stream size (Fausch et al. 1984). The plots illustrate what Thomson et
               al. (1996) term a "factor ceiling distribution"; in this case, the ceiling, maximum
               species richness, is defined by the  evolution of the regional biota. Generally, at sites
               where the number of fish species falls below the ceiling, some human activity in
               the adjacent or upstream watershed has reduced the number of species present; or
               sampling might have been inadequate, "dragging" species richness below the line.
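
One rough way to explore a wedge-shaped distribution numerically is to estimate an
empirical ceiling for each stream size class and express each site as a fraction of it.
The Python sketch below does this with hypothetical (stream order, species count) pairs.
It is not the procedure of Fausch et al. (1984): taking the per-order maximum is a crude
stand-in for a maximum species richness line, assumed here only for illustration.

```python
from collections import defaultdict

def empirical_ceiling(sites):
    """Map each stream order to the highest species richness observed at that order."""
    ceiling = defaultdict(int)
    for order, richness in sites:
        ceiling[order] = max(ceiling[order], richness)
    return dict(ceiling)

def fraction_of_ceiling(sites):
    """Observed richness at each site as a fraction of the ceiling for its order."""
    ceiling = empirical_ceiling(sites)
    return [richness / ceiling[order] if ceiling[order] else 0.0
            for order, richness in sites]

# Hypothetical (stream order, number of fish species) pairs.
sites = [(1, 4), (1, 8), (2, 6), (2, 12), (3, 18), (3, 9), (4, 24), (4, 20)]
ceiling = empirical_ceiling(sites)
shortfall = fraction_of_ceiling(sites)
print(ceiling)  # {1: 8, 2: 12, 3: 18, 4: 24}
```

Sites with low fractions are candidates for a closer look. As the text notes, a
shortfall may reflect human activity in the watershed or simply inadequate sampling.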

               Graphs highlight idiosyncrasies in data distributions that, when examined closely,
               may provide insight into the causes of a particular biological pattern. At one
               extreme, outlying points on a graph may offer key insights about the complex
               influence of human activities in watersheds; one can, for example, explore what
               unique situations at those sites cause them to appear as outliers.

               Even the spread of data can offer  insights, as illustrated by the large range in B-IBIs
               at sites with 20% to 30% impervious area shown in Figure 16. Sites with high
               mayfly taxa richness (B and C) lie in reaches of two streams with relatively intact
               riparian corridors and wetlands. The site with low mayfly taxa richness (A) is
               located in a stream that receives fine material from an old coal mine. Sites A, B,
               and C had unique characteristics that were best  understood by examining their
               specific contexts, not by applying a regression or correlation analysis. Finding these
               patterns then led to subsequent studies in the same and in other places to deter-
               mine if those patterns were more general.

               Graphs also illustrate variation in  behavior among taxa in response to a specific
               disturbance (Figure 17). For example, numbers of taxa for three orders  of insects
               (stoneflies, mayflies, and caddisflies) declined downstream of the outflow from a
               streamside sludge pond in the Tennessee Valley, but the magnitude of change
               varied among the taxa (see also Premise 13, page 51). The same graph also reveals
               the direction and magnitude of change along a longitudinal transect down the
               stream.
               Graphs may sometimes allow researchers to avoid naive application of elaborate
               multivariate techniques (Beals 1973). Principal components analysis, the most often
               used  ordination technique (James and McCullough 1990), defines statistically
               orthogonal factors, which, biologically, may or may not be independent; interpret-

-------
FIGURE 12. Taxa richness of Trichoptera plotted against the percentage of watershed area
that was logged for 32 stream sites in southwestern Oregon. Metric correlation
(Spearman's rho) was not significant because, alone, the percentage of area logged was
an inaccurate measure of human influence; other factors, such as type of logging,
presence of roads, and other human influences, were not included. When these other human
influences were considered to identify minimally disturbed sites (denoted by plus signs)
and severely degraded sites (open boxes), the response of Trichoptera taxa richness
visibly distinguished between different degrees of human disturbance.
[Figure 12: Trichoptera taxa richness plotted against area logged (%), with the ranges of
plus-sign and open-box sites marked; the plot's correlation label survives only as
"= -0.10".]
FIGURE 13. Hypothetical relationship between human influence and Metric A. Statistical
correlation (Spearman's rho) is not significant, yet the graphic pattern strongly
suggests a biological response. At low levels of human influence, Metric A is not a
reliable indicator of biological condition, but where human disturbance is high, the
metric does respond.
[Figure 13: Metric A plotted against human influence (low to high); rho = 0.37, p = 0.17.]

-------
FIGURE 14. Relative abundance (percentage of total) of individuals belonging to tolerant
taxa in samples of benthic invertebrates from 65 Japanese streams ranked according to
intensity of human influence (see Figure 4, page 31, and Figure 5, page 32). (Data
provided by E. M. Rossano.)
[Figure 14: y-axis, relative abundance of tolerant taxa (%); x-axis, human influence.]
[Figure 15: panels showing a maximum species richness line; x-axes, stream order and
watershed area (km²).]

-------
FIGURE 16. Average taxa richness of Ephemeroptera plotted against percentage of
impervious surface area surrounding Puget Sound lowland streams (from Kleindl 1995).
Site A, Coal Creek, had fewer Ephemeroptera than expected. This site has an active mine
in its headwaters, and Ephemeroptera are known to be sensitive to mine waste. Sites B
and C had relatively intact riparian areas (wetlands).
[Figure 16: y-axis, Ephemeroptera taxa richness; x-axis, impervious area (%).]
FIGURE 17. Taxa richness of mayflies, stoneflies, and caddisflies for sites along the
North Fork Holston River in the Tennessee Valley in 1976 (from Kerans and Karr 1994).
Arrow indicates the position of the streamside sludge pond. Taxa richnesses for all
three orders decline at the sludge pond and slowly recover for sites downstream.
[Figure 17: taxa richness of Ephemeroptera, Trichoptera, and Plecoptera plotted against
distance from mouth (km).]
               ing the results can therefore be complicated (Goodall 1954). Graphs can be a
               superior approach to methods that focus on maximum variance extracted because,
               when used correctly, they emphasize ecological rather than mathematical associa-
               tions, a more appropriate criterion for organizing and understanding complex
               information (Beals 1973).
               Complex ecological situations require  unusual analytical means. Graphs can often
               be ecologists' most useful tools, permitting the exploration of ecological data
               "before, after, and beyond the application of 'standard analyses' " (Augspurger
               1996). Rather than choose an inappropriately linear statistical model before plot-
               ting their data, ecologists should exploit the power of graphs for "reasoning about
               quantitative information" (Tufte 1983), and then choose and apply appropriate
               statistics. It is myopic to be a slave of standard statistical rules and procedures or to
               avoid statistics altogether.


-------
          PREMISE 10
  SIMILAR BIOLOGICAL ATTRIBUTES ARE RELIABLE INDICATORS
  IN DIVERSE CIRCUMSTANCES
              A striking conclusion from 15 years' research in selecting metrics is that the same
              major biological attributes serve as reliable indicators in diverse circumstances.
              This result has its advantages and disadvantages. On the advantage side, every
              small project (e.g., at the county or community level) need not test and define its
              own locally applicable metrics. Scientists and resource managers can implement
              local biological monitoring and assessment programs based on results from other
              studies. When local studies cite earlier work, readers can know that the methods
              have been tested elsewhere; the accumulating body of tests refines, or refutes, the
              generality of patterns others have defined.

              On the disadvantage side, some applications of multimetric indexes uncritically
              borrow theoretical or empirical metrics from other studies. This borrowing
              becomes problematic when the theory is wrong or does not apply in the  study
              circumstance, or when metrics are applied to systems or regions other than those
              for which they were tested. For example, human impacts may increase taxa
              richness in cold-water streams (Hughes and Gammon 1987; Lyons et al. 1996) as
              cool- and warm-water species enter areas where water temperatures have been
              raised by activities such as logging of riparian vegetation. In contrast, in eastern
              warm-water streams, human influence commonly decreases species richness
              except for aliens (Karr et al. 1986). Thus, one cannot make identical assumptions
              about metrics of fish taxa richness in the two contexts. Similarly, a benthic invertebrate
              metric for soft-bodied organisms (e.g., oligochaetes, tipulid flies, and other
              grublike forms) often indicates degraded conditions in North America, but in
              Japan, the better metric consists of legless organisms, a grouping that includes the
              soft-bodied organisms but also shelled snails and mussels. In North America,
              mussels and snails are more often indicators of high-quality environments, but in
              Japan, most are alien or otherwise indicative of degraded conditions.

              The bottom line is that metrics should be based on sound ecology and adapted
              only with great care beyond  the regions  and habitats for which they were devel-
              oped. Exploring biological patterns to discover the best biological signals (that is,
              metrics) should mix graphs, conventional statistics, and thoughtful consideration
              of regional natural history.

-------
                                                             PREMISE  11
            TRACKING COMPLEX  SYSTEMS  REQUIRES A MEASURE
                                         INTEGRATING MULTIPLE FACTORS
We use multimetric indexes to monitor the economy; we should use them to monitor water resources

Scientists, citizens, and policymakers faced with making decisions about complex
systems—economies, a family member's health, an ecological system—need mul-
tiple levels of information. Consider some of the indexes used to track the health
of the national economy: the index of leading economic indicators, the producer
price index, the consumer price index, the cost-of-living index, or the Dow Jones
industrial average. All these indexes integrate multiple economic factors.
The index of leading economic indicators (Mitchell and Burns 1938) tracks the US
economy in terms of 12 measures: length of work week; unemployment claims;
new manufacturing orders; vendor performance; net business formation; equip-
ment orders; building permits; change in inventories, sensitive materials, and
borrowing; stock prices; and money supply. These measures are combined to form
the overall index, which takes as its reference point a standardized year (e.g., 1967);
the value of the current year's index is expressed in terms of its value in the refer-
ence year. Composite economic indexes like these have survived six decades of
discussion and criticism and remain widely used by economists, policymakers, and
the media to interpret economic trends (Auerbach 1982).
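
The idea of expressing a composite index relative to a reference year can be sketched in a few lines. This is only an illustration under stated assumptions: the equal weighting, the function name `composite_index`, and all numbers are hypothetical, and the sketch does not reproduce how the actual index of leading economic indicators is compiled.

```python
def composite_index(components, reference, base=100.0):
    """Express current component values relative to a reference year.

    Each component is divided by its reference-year value, the ratios are
    averaged with equal weights (an assumption made for illustration), and
    the mean is scaled so that the reference year itself equals `base`.
    """
    if set(components) != set(reference):
        raise ValueError("components and reference must share the same keys")
    ratios = [components[name] / reference[name] for name in components]
    return base * sum(ratios) / len(ratios)


# Hypothetical values, not real economic data.
reference_year = {"building_permits": 200.0, "stock_prices": 50.0}
current_year = {"building_permits": 220.0, "stock_prices": 60.0}

index_value = composite_index(current_year, reference_year)
reference_value = composite_index(reference_year, reference_year)
```

With these made-up inputs the reference year scores 100 by construction, and the current year scores about 115 because its two components are 10% and 20% above their reference values.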
Similarly, physicians and veterinarians rely on multiple measures and multiple tests
to assess the health of individual patients. On a single visit to the doctor, a patient
might be "sampled" for urine chemistry, blood-cell counts, blood chemistry, body
temperature, throat culture, weight, or chest X-rays.  Clearly, these measurements
are not independent of one another, for they come from a single individual whose
health is affected by many interacting factors. Further, you would not expect your
doctor to rely on only one specialized blood test to diagnose your overall health;
rather, you assume that multiple measures will give a more accurate diagnosis.
Patterns emerging from these multiple measurements would enable the doctor to
recognize the signature  of a particular ailment and suggest more targeted measure-
ments if she suspected a certain disease. Only then could she prescribe treatment.
Multimetric biological indexes calculated from ambient biological monitoring data
provide a similar integrative approach for "diagnosing" the condition of complex
ecological systems. The  same logical sequence applies in compiling multimetric
economic, health, or biological indexes. First, identify reliable and  meaningful
response variables through testing; then measure and evaluate the system against
expectations; finally, interpret the measured values in terms of an overall assessment
of system condition. The resulting index (for economic or biological
resources) or diagnosis (for patients) allows people without specialized expertise to
               understand overall condition and to make informed decisions that will then affect
               the health of those economies, resources, or patients.
               Most multimetric biological indexes for use in aquatic systems comprise 8 to 12
               metrics,1 each selected because it reflects an aspect of the condition of a biological
               system. These metrics are not independent because they are calculated from a
               single collection of organisms, just as multiple personal health tests are done on a
               single individual. But even if metrics are statistically correlated, they are not
               necessarily biologically redundant. Rather, just as a fever plus a high white-blood-
               cell count reinforces  a diagnosis of bacterial infection, multiple metrics all contrib-
               ute to a diagnosis of ecological degradation (ecological disease).
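
As a rough sketch of how several metrics might be combined into one index, the example below scores each metric as 5, 3, or 1 against two cut points and sums the scores. The 5/3/1 convention is one widely associated with IBI, but this section does not spell out a scoring rule, so treat it as an assumption; the metric names, values, and cut points are hypothetical.

```python
def score_metric(value, low_cut, high_cut, decreases_with_degradation=True):
    """Score one metric as 5 (near expectation), 3, or 1 (far from it).

    `low_cut` and `high_cut` are hypothetical cut points with
    low_cut < high_cut. For metrics that rise with human influence,
    the comparison is flipped by negating the value and the cut points.
    """
    if not decreases_with_degradation:
        value, low_cut, high_cut = -value, -high_cut, -low_cut
    if value >= high_cut:
        return 5
    if value >= low_cut:
        return 3
    return 1


# (name, observed value, low cut, high cut, decreases with degradation?)
# All entries are hypothetical illustrations.
site_metrics = [
    ("total taxa richness", 28, 15, 25, True),
    ("intolerant taxa richness", 2, 2, 4, True),
    ("percent tolerant individuals", 45.0, 20.0, 40.0, False),
]

scores = {name: score_metric(v, lo, hi, down)
          for name, v, lo, hi, down in site_metrics}
index_total = sum(scores.values())
```

Here the site scores 5, 3, and 1 on the three metrics, for a total of 9 out of a possible 15. Correlated metrics are still scored separately, in line with the argument that statistical correlation does not imply biological redundancy.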
               The two most common IBIs for streams have  been developed, tested, and applied
               using fish (Karr 1981; Miller et al. 1988; Lyons 1992a; Fore et al. 1994; Lyons et al.
               1995, 1996;  Simon in press) and benthic invertebrates (Kerans and Karr 1994;
               Kleindl 1995; Rossano 1995, 1996; Fore et al.  1996; Patterson  1996).  Both incorpo-
               rate known attributes from multiple levels of biological organization  and different
               temporal and spatial scales. Typically, patterns emerge that are the signatures of
               biological responses to particular human activities (Karr et al. 1986; Yoder 1991b;
               Yoder and Rankin 1995b).
               Based on the success and widespread use of these two indexes, similar indexes are
               now being developed by a number of state agencies to use with invertebrates and
               vascular plants in wetlands (Karr  1997); with algae and diatoms in streams (Bahls
               1993; Kentucky DEP 1993; Florida DEP 1996; Barbour et al., in press); and with
               plants, invertebrates, and vertebrates in terrestrial environments (CRESP  1996;
               Chu 1997; Bradford et al., in press; see also Premise 21, page 84). Extending IBI to
               new taxa, environment types, and geographic  areas is like learning to  practice
               medicine in  humans, pets, livestock, and so on: the expectation of what constitutes
               "health" depends on the animal, but the same fundamental diagnostic strategy
               applies in all cases.
1  For species-poor environments such as cold-water streams, the total number of metrics is likely to be smaller (e.g.,
  Lyons et al. 1996).

-------
                                                             PREMISE  12
     MULTIMETRIC  BIOLOGICAL INDEXES  INCORPORATE  LEVELS
                                    FROM  INDIVIDUALS TO LANDSCAPES
Users should deliberately choose metrics to encompass the range of signals from disturbed biological systems

The success of multimetric approaches such as IBI in assessing biological condition
depends on choosing and integrating metrics that reflect diverse responses of
biological systems to human actions. Ideally, a multimetric index would cover all
such responses, but the costs of developing such an index would be much too
high. A suite of chosen metrics is necessarily a compromise between "too narrow"
and "too broad"; it is also a compromise of choices among conveniently measured
biological surrogates of important biological phenomena. Present IBI and B-IBI
metrics represent our choices in these compromises, but we expect metrics to evolve
and expand over the next decade. Still, a fundamental tenet of IBI is that the  user
makes a conscious effort to choose metrics that cover the range of biological
signals available from disturbed systems.
IBI is not a community analysis in either of the common uses of the word commu-
nity. IBI does not examine all taxa but is generally based  on one or two assem-
blages (phylogenetically related groups of organisms; Fauth et al. 1996), such as
fish or benthic invertebrates. Neither does a multimetric  IBI focus on the commu-
nity level in the standard textbook hierarchy of biology (individual, population,
assemblage, community, ecosystem, and landscape). Rather, the choice of measures
in a multimetric index reflects an attempt to represent as many of those levels as
possible, preferably directly but at least indirectly. The resulting indexes are likely
to produce the strongest multimetric view of biological condition (Table 4). The
best multimetric indexes are more than a community-level assessment because they
combine measures of condition in individuals, populations, communities, ecosys-
tems, and landscapes.
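
The suggested number of metrics per type in Table 4 lends itself to a simple check of a candidate metric suite. The sketch below takes the ranges from Table 4; the function name `check_metric_suite` and the candidate metrics are hypothetical.

```python
from collections import Counter

# Suggested number of metrics of each type, as given in Table 4.
SUGGESTED_COUNTS = {
    "taxa richness": (3, 5),
    "tolerance, intolerance": (2, 3),
    "trophic structure": (2, 4),
    "individual health": (1, 2),
    "other ecological attributes": (2, 3),
}


def check_metric_suite(metric_types):
    """Map each metric type to (count, within_suggested_range).

    `metric_types` maps a metric name to its type in Table 4.
    """
    counts = Counter(metric_types.values())
    report = {}
    for metric_type, (low, high) in SUGGESTED_COUNTS.items():
        n = counts.get(metric_type, 0)
        report[metric_type] = (n, low <= n <= high)
    return report


# Hypothetical candidate suite, for illustration only.
candidate = {
    "total taxa richness": "taxa richness",
    "mayfly taxa richness": "taxa richness",
    "stonefly taxa richness": "taxa richness",
    "intolerant taxa richness": "tolerance, intolerance",
    "percent tolerant individuals": "tolerance, intolerance",
    "percent predators": "trophic structure",
    "percent with anomalies": "individual health",
}

report = check_metric_suite(candidate)
```

In this example the suite has too few trophic-structure metrics and no metrics for other ecological attributes, which flags where the index would fail to span the range of biological signals.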
Individual level. Individual health manifests itself in many ways both internally
and externally, with physiological or morphological signs and in metabolic or
genetic biomarkers reflecting organismal stress. We have not yet seen reliable
metabolic or genetic biomarkers that can be applied broadly in the field, although
in certain situations (see Summers et al. 1997 for a promising example), biomarkers
may work as secondary tools for diagnosing biological condition; we hope for
progress in this area in the next decade. To date, however, IBI metrics of individual
health consist of easily detected external abnormalities; their frequency in an
assemblage  indicates stress on individuals.
-------
TABLE 4. Types of metrics, suggested number of metrics of each type, and represented levels in the biological
hierarchy. Well-constructed multimetric indexes contain the suggested number of metrics from each type and
therefore reflect multiple dimensions of biological systems.
Metric type                   Number    Individual    Population   Community    Ecosystem  Landscape
Taxa richness                 3-5
Tolerance, intolerance        2-3
Trophic structure             2-4
Individual health             1-2
Other ecological attributes   2-3

               In fish, for example, visible signs of stress include skeletal deformities; skin lesions;
               tumors; fin erosion; and certain diseases that are associated with impaired
               environments, especially large amounts of toxic substances. Early studies of fish in
               the seven-county area around Chicago indicated high incidence of external abnor-
               malities (Karr 1981), for example—a pattern also apparent in Ohio (Yoder and
               Rankin 1995a). Among benthic invertebrates, head-capsule deformities in chirono-
               mids (midges) are strong indicators of toxics (Hamilton and Saether 1971; Cushman
               1984; Warwick et al. 1987; Warwick and Tisdale 1988). Anomalies in fish are often
               used as IBI metrics, but chironomid head-capsule deformities are rarely incorpo-
               rated into the benthic IBI because so much laboratory work is required to stain,
               mount on slides, and count the individual insects.

               In other studies, tadpoles collected in a coal ash deposition basin had fewer labial
               teeth than tadpoles from reference areas (Rowe et al. 1996). They also had de-
               formed labial papillae, which would limit the types of food they could eat and
               limit their growth. Fish in Gulf of Mexico estuaries showed higher numbers and
               frequencies of several pathologies at heavily disturbed sites than at minimally
               disturbed sites (Summers et al. 1997). Finally, periphytic diatoms of the genus
               Fragilaria in a metal-contaminated Rocky Mountain river in  Colorado had de-
               formed cells  (McFarland et al.  1997). The percentage of deformed cells ranged from
               0.2% ± 0.2 to 12% ± 2.0 from low to high levels of heavy metal (Cd, Cu, Fe, Zn)
               contamination.

               Population level. Several metrics in both the fish and benthic IBIs indicate, if not
               the details of population demography, the relative condition of component
               groups. For example, the lack of intolerant taxa among fish or invertebrates or of
               clingers (taxa that cling to rocks) among the invertebrates is a strong signal that
               populations of these organisms are doing poorly. The absence of darters, sunfish,
               and suckers among the fishes and of mayflies (Ephemeroptera), stoneflies
               (Plecoptera),  and caddisflies (Trichoptera) among the invertebrates, suggests that
               viable populations of many species within these taxa cannot maintain themselves.

-------
Usually, a population must be viable at a site before one can consistently detect a
species' presence.
Assemblage level. Changes in the chemical, physical, and biological environment
resulting from human activities alter assemblages. These changes may appear as
changes in species composition or species richness (conventional measures of
community structure); in trophic structure, such as decreases in top carnivores or
increases in omnivores; or in shifts from specialists to generalists in food or repro-
ductive habits (reflecting shifts in food-web organization, including energy flow
and nutrient cycling). Multimetric indexes incorporate this information by includ-
ing metrics such as the percentage of predators, omnivores, or other feeding groups
and also species richness and the relative abundance of alien fishes (in streams) or
of vascular plants (in wetlands and terrestrial environments).

Considerable theoretical discussion has centered on "functional feeding groups" of
North American benthic invertebrates (Cummins 1974; Cummins et al. 1989;
Cummins et al.  1995). In particular, according to the river continuum hypothesis
(Vannote et al. 1980), the relative abundance of these groups is predicted to change
along the length of a river or stream. For example, in comparison with headwaters,
which are presumed to receive mostly allochthonous organic matter, downstream
reaches might have more filter-feeders or net-spinning caddisflies taking advantage
of high in-stream production. But the river continuum hypothesis does not seem
to apply consistently across North American streams (Vannote  et al. 1980;
Winterbourn et al. 1981; Minshall et al. 1983). Metrics based on functional feeding
groups among benthic invertebrates (with the possible exception of relative preda-
tor  abundance) likewise respond differently in different streams.
This inconsistent response differs from what might be a more general pattern of
trophic metric behavior in fishes; perhaps the trophic structure of fish assemblages
in North America is more consistent than for benthic invertebrates. Alternatively,
perhaps more is  known about the natural history of fishes, permitting better
delineation of feeding groups. Or our knowledge of invertebrates may be less
precise, or invertebrates may be more opportunistic. The generality of trophic
group response to disturbance deserves more careful analysis, but, meanwhile, we
urge caution. Despite a widely accepted theory, metrics pertaining to functional
feeding groups among benthic invertebrates may or may not be good indicators;
their dose-response relationships to human influence must be carefully tested and
established for multiple data sets and circumstances before they should be used in
a multimetric index.
Landscape level. Regardless of level in the biological hierarchy (individuals,
species, ecosystem), the persistence of living things depends on heterogeneities in
space and time. Spatial heterogeneities are visible in littoral zonation, in vegetation
bands associated with water depth in marshes, or in association with soil moisture
and slope gradients on drier land. Stream fish spend their lifetime in many micro-
habitats; they are exposed to different flows and other shifts in time as days and
seasons change. Eggs laid in main-channel gravels become fry hiding in side
channels and along the banks; fry grow into juveniles large enough to avoid the
               predators that would otherwise eat them; juveniles may then move into deep pools
               where those predators are and where food supplies also differ.

               Finding food, avoiding predators, seeking spawning habitat—any activity in an
               organism's life cycle—are subject to and dependent on such heterogeneities in
               space and time. For some species, the scale of movements may extend only a few
               centimeters; for others, the scale can be hundreds or thousands of miles. The loss
               of spatial or temporal components of these heterogeneities can change the distribu-
               tion or abundance of a species or cause it to disappear altogether. The presence or
               absence of anadromous or other migratory fishes (e.g., salmon, bull trout) is thus a
               landscape-level indicator. Dams, alien predators, and altered water flows and
               temperatures interfere with their movements through a landscape, decimating
               these species.
               Incorporating several multimetric indexes (fish IBI, benthic IBI, algal IBI) into a
               biological monitoring program is a good way to reflect the condition of assem-
               blages that respond to human disturbances at different scales. Different taxa in the
               same or different assemblage reflect the presence of a broad range of heterogene-
               ities. If top predator taxa needing large home ranges or long-lived taxa requiring
               years to mature are present, for example, one can infer that the spatial and tempo-
               ral components they require are also present. Excessive in-stream production or
               many herbivorous fishes or invertebrates are characteristic of heavily grazed land-
               scapes, where riparian corridors may be damaged and excessive nutrients from
               livestock wastes are entering the stream.
               Development of IBI to date has involved a conscious effort to span the range of
               biological context. But much remains to be done. Better measures of individual
               health are needed, as are measures better defining demographics. Strengthening the
               connections between measures of food web and trophic structure and more-direct
               measures of nutrient cycling and energy flow would also improve multimetric
               assessment. Finally, landscape metrics that emphasize overall biological condition
               (number of native community types or cumulative taxa richness across a water-
               shed) are also needed. Ideally, metrics of landscape condition should be more than
               a sum of site-specific assessments.
               Great care must be taken to measure biological condition, not stressor intensity.
               We believe that biological surrogates of biological condition are essential; chemical
               and physical surrogates of biological condition are not adequate.
               Developed and applied properly, the multimetric IBI incorporates and depends on
               known components of biology—components specific to localities and taxa—across
               the organizational hierarchy and from disparate spatial and temporal scales. The
               result is a synthesis of biological signals revealing the effects of human activities at
               different levels, in different places, on different scales, and in response to a range of
               human activities.

-------
                                                             PREMISE 13
       METRICS ARE  SELECTED TO YIELD RELEVANT BIOLOGICAL
                                  INFORMATION AT REASONABLE COST
              The index of biological integrity as first developed for fish (Karr 1981; Karr et al.
              1986) incorporated 12 metrics from three biological categories: species richness and
              composition, trophic composition, and individual condition. Later work with both
              fish and invertebrates led to somewhat different groups: specifically, species
              richness, taxonomic composition, individual condition, and biological processes
              (Karr 1993; Barbour et al. 1996b) or community structure, taxonomic composi-
              tion, individual condition, and biological processes (Fore et al.  1996). Within each
              broad category, some metrics are proven for many regions and faunas. Others work
              in some regions or studies but not in others.  Still other potential metrics based on
              theoretical ecology or toxicology may work but have not been adequately tested,2
              because they are either too difficult to measure or too theoretical to define (Table 5).
              The categories in Table 5 guide metric selection for new regions, faunas, or habi-
              tats, but no metric should become part of a multimetric index before it is thor-
              oughly and systematically tested and its response has been validated across a
              gradient of human influence.
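
One simple way to screen a candidate metric against a gradient of human influence is a rank correlation, sketched below. Spearman's rho is a technique chosen here for illustration, not a procedure prescribed by the text, and it should complement rather than replace graphs and knowledge of natural history. All data values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical data: percent impervious area at six sites, with one
# candidate metric that declines steadily and one that does not respond.
human_influence = [5, 10, 20, 35, 50, 60]
taxa_richness = [30, 28, 25, 18, 12, 9]
total_abundance = [400, 900, 300, 800, 350, 850]

rho_richness = spearman_rho(human_influence, taxa_richness)
rho_abundance = spearman_rho(human_influence, total_abundance)
```

In this made-up example taxa richness falls monotonically with human influence (rho near -1), while total abundance shows essentially no relationship (rho near 0). A metric behaving like the second would not qualify for the index.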

              The choice of how to actually express each metric is as important as selecting the
              metric itself. One could simply count the number of individuals in a target group
              and express it as population size, abundance, or density (Figure 18, top); one could
              determine the proportion, or relative abundance, of the total number of individu-
              als belonging to a target group (number of individuals in the  target group divided
              by the total number of individuals in the sample; Figure 18, middle); or one could
              count the number of taxa in the entire sample or in particular subgroups (taxa
              richness; Figure 18, bottom). One could also determine (not shown) the propor-
              tion of the biota from specific taxa (e.g., number of mayfly taxa/total number of
              taxa). Approaches vary in their ability to reveal consistent dose-response relation-
              ships, as Figure 18 shows; knowledge of natural history and of which sampling
              protocols are most efficient should guide one's choice.
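
The alternative ways of expressing a metric described above can be computed from the same sample. The sketch below assumes a sample stored as a mapping from taxon name to number of individuals; the function name `express_metric` and the counts are hypothetical.

```python
def express_metric(sample, target_taxa):
    """Express one target group four ways from a single sample.

    `sample` maps taxon name to number of individuals collected;
    `target_taxa` is the set of taxon names belonging to the target group.
    """
    total_individuals = sum(sample.values())
    total_taxa = sum(1 for n in sample.values() if n > 0)
    if total_individuals == 0:
        raise ValueError("sample contains no individuals")
    target_counts = [n for taxon, n in sample.items()
                     if taxon in target_taxa and n > 0]
    abundance = sum(target_counts)
    richness = len(target_counts)
    return {
        "abundance": abundance,                                 # count of individuals
        "relative_abundance": abundance / total_individuals,   # share of individuals
        "taxa_richness": richness,                              # number of taxa
        "proportion_of_taxa": richness / total_taxa,            # share of taxa
    }


# Hypothetical sample; the first two taxa are caddisflies (Trichoptera).
sample = {"Hydropsyche": 40, "Rhyacophila": 10, "Baetis": 100, "Chironomidae": 50}
trichoptera = {"Hydropsyche", "Rhyacophila", "Glossosoma"}

result = express_metric(sample, trichoptera)
```

For this sample the caddisflies amount to 50 individuals, 25% of all individuals, 2 taxa, and half of all taxa present. Which of these expressions tracks human influence most consistently has to be tested with real data.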

              Population size, besides being difficult and often costly to determine with sufficient
              precision (Paller et al. 1995b), especially for rare species, is not a good
              measure because it is naturally too variable, irrespective of human impacts (Karr
              1991). Our recent work in Puget Sound lowland streams, for example, found no
              systematic relationship in two successive years between benthic invertebrate
              abundance and the percentage of impervious area in the upstream watershed, one
              measure of human influence (Figure 19).

2  Unfortunately, untested or too-theoretical attributes have been central to EPA's rapid bioassessment protocols (RBP
  I, II, III), used since 1989. Many measures incorporated into RBP III were never tested adequately, and recent tests
  (Barbour et al. 1992; Kerans et al. 1992; Kerans and Karr 1994; Barbour et al. 1996a; Fore et al. 1996) indicate that
  they do not meet rigorous standards for metric acceptance.

No metric should become part of a multimetric index before its response has been validated
across a gradient of human influence

-------
TABLE 5. Sample biological attributes, in four broad categories, that might have potential as metrics. Actual
monitoring protocols have proven some of these attributes effective; other attributes may work but need
more testing; still others are difficult to measure or too theoretical. Ideally, an IBI should include metrics in
each of these categories, but untested or inadequately tested attributes should not be incorporated into the
final index.
Category    Demonstrated effective    Need more testing    Difficult to measure or too theoretical
Taxa richness
Tolerance, intolerance
Trophic structure
Individual health
Other ecological attributes

Total taxa richness
Richness of major taxa, e.g., mayflies or sunfish
Taxa richness of intolerant organisms
Relative abundance of green sunfish
Relative abundance of tolerant taxa
Trophic organization, e.g., relative abundance of predators or omnivores
Relative abundance of individual fish with deformities, lesions, or tumors
Relative abundance of individual chironomids with head-capsule deformities
Growth rates by size or age class
Dominance (relative abundance of most-numerous taxa)
Number of rare or endangered taxa
Contaminant levels in tissue (biomarkers)
Relative abundance distribution, after Preston (1962)
Chironomid species (difficult to identify)
Productivity
Metabolic rate
Age structure of target species population

FIGURE 18. Presence of Trichoptera (caddisflies) in a standard sample, expressed as total number of
trichopteran individuals (top), relative abundance of trichopteran individuals (middle), and richness of
trichopteran taxa (bottom). These three biological attributes are plotted against grazing intensity as an
indicator of site condition at seven stream sites in the John Day River basin of southwestern Oregon.
[Figure: three stacked panels; the horizontal axis is site condition, from poor to good.]

                Similarly, ratios of two groups in an assemblage do not respond systematically to
                human influence, largely because ratios are composed of two factors that can
                respond, and thus vary, independently of each other, making it impossible to draw
                firm conclusions about the relationship of those ratios to human influence (see
                Premise 24, page 95). Further, two large numbers and two small numbers may yield
                the same ratio, although the biological meaning of small and large numbers may
                be very different (Kerans and Karr 1994). If both components of the ratio are
                important, they might more appropriately be considered separately. (This reasoning
                also applies in the case of diversity indexes, which combine richness and
                relative abundances. We prefer to keep those issues distinct with separate metrics.)
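
A small numerical illustration of the point about ratios follows. The group labels and counts are hypothetical; the sketch only shows that a ratio of two variable counts discards information that separate metrics retain.

```python
def group_ratio(sample, numerator, denominator):
    """Ratio of the counts of two groups in one sample."""
    return sample[numerator] / sample[denominator]


# Two hypothetical samples: one rich in both groups, one nearly empty.
rich_sample = {"scrapers": 200, "filterers": 400}
sparse_sample = {"scrapers": 2, "filterers": 4}

ratio_rich = group_ratio(rich_sample, "scrapers", "filterers")
ratio_sparse = group_ratio(sparse_sample, "scrapers", "filterers")

# Reporting each component separately keeps the distinction visible.
separate_rich = (rich_sample["scrapers"], rich_sample["filterers"])
separate_sparse = (sparse_sample["scrapers"], sparse_sample["filterers"])

# Either component can change on its own and move the ratio:
# halving the denominator doubles the ratio just as doubling the numerator does.
fewer_filterers = {"scrapers": 200, "filterers": 200}
more_scrapers = {"scrapers": 400, "filterers": 400}
same_shift = (group_ratio(fewer_filterers, "scrapers", "filterers")
              == group_ratio(more_scrapers, "scrapers", "filterers"))
```

Both samples give a ratio of 0.5 even though one holds a hundred times as many organisms, and two very different changes in the assemblage produce the same shifted ratio.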

FIGURE 19. Number of invertebrates plotted against impervious area for lowland Puget Sound streams in two
successive years.
[Figure: two panels, 1994 and 1995; the horizontal axis is impervious area (%).]

                Metrics related to feeding ecology or trophic structure are best expressed as relative
                abundance: for example, the number of individual predators, omnivores, or
                scrapers divided by the total number of sampled individuals.3 The relative
                abundance of organisms at various levels in a stream's trophic organization reflects the
                condition of the food web, including energy flow and nutrient dynamics, but
                relative abundances are much easier to measure than true production or energy
                flow. If we know what to expect from minimally disturbed sites in a region, we can
                then distinguish the deviations caused by human activities from that expectation.
                The relative abundance of fish-eating fish in minimally disturbed streams, for
                example, is likely to be 20% or more; omnivores, 20% or less. In degraded streams,
                the relative abundance of omnivores is likely to be much higher (> 40%).

3  Although this metric looks like a ratio, it is a ratio of a variable over a constant for the sample. In contrast, the ratios
  of two taxa or two functional feeding groups are ratios of two variables from the sample.
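
The comparison of trophic relative abundances with expectations can be sketched as follows. The percentages used as thresholds are the approximate values quoted in the text (fish-eating fish 20% or more, omnivores 20% or less, omnivores above 40% in degraded streams); real expectations should come from minimally disturbed sites in the region. The function names and the sample counts are hypothetical.

```python
def trophic_relative_abundance(counts_by_group):
    """Percentage of sampled individuals in each feeding group."""
    total = sum(counts_by_group.values())
    if total == 0:
        raise ValueError("sample contains no individuals")
    return {group: 100.0 * n / total for group, n in counts_by_group.items()}


def compare_with_expectation(percentages):
    """Compare a fish sample with the approximate values quoted in the text."""
    piscivores = percentages.get("piscivores", 0.0)
    omnivores = percentages.get("omnivores", 0.0)
    return {
        "piscivores_at_least_20": piscivores >= 20.0,
        "omnivores_at_most_20": omnivores <= 20.0,
        "omnivores_above_40": omnivores > 40.0,
    }


# Hypothetical fish sample of 100 individuals.
site_counts = {"piscivores": 5, "omnivores": 45, "insectivores": 50}

percentages = trophic_relative_abundance(site_counts)
flags = compare_with_expectation(percentages)
```

This sample has 5% piscivores and 45% omnivores, so it departs from the minimally disturbed expectation on both counts and falls in the range the text associates with degraded streams.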
               Major taxonomic groups are best evaluated in terms of taxa richness4 because, as
               human activities damage a stream and its watershed, native taxa tend to disappear.
               A decline in taxa richness is generally one of the most reliable indicators of degra-
               dation for many aquatic groups (Ford  1989; Barbour et al. 1995), including per-
               iphyton  (Bahls 1993; Pan et al. 1996);  phytoplankton (Schelske 1984); zooplankton
               (Stemberger and Lazorchak 1994); riverine fish (Karr 1981; Miller et al. 1988; Ohio
               EPA 1988; Rodriguez-Olarte and Taphorn 1994; Rivera and Marrero 1994; Lyons
               et al. 1995, 1996); lake fish (Minns et al. 1994); estuarine fish (Thompson and
               Fitzhugh 1986; Deegan et al. 1993; Weaver and Deegan 1996; Deegan et al. 1997;
               Hartwell et al. 1997); freshwater invertebrates (Ohio EPA 1988; Reynoldson and
               Metcalfe-Smith 1992; Kerans and Karr 1994; DeShon 1995; Fore et al. 1996;
               Thorne and Williams 1997); and marine invertebrates (Summers and Engle 1993;
               Engle et al. 1994; Weisberg et al. 1997).

4  Taxa richness can be standardized per unit of area (e.g., taxa/0.1 m²) or per unit count of individuals (e.g., taxa/500
  individuals). The proper choice is hotly debated, a topic we cover in more detail in Premise 28, page 101.

               Taxa richness may be calculated for an entire sample or for subgroups, such as fish
               families or insect orders, that use the stream environment in a particular way.
               Sunfish, for example, feed in the water column or at the surface of pools, whereas
               suckers feed in benthic pool environments, and darters or sculpins feed in benthic
               riffle environments. Each requires the unique structural complexity and cover
               associated with those particular feeding environments; the interactions of cover,
               structural complexity, and changing food abundances resulting from human
               actions may cause declines in all these groups. Because their natural histories differ,
               these three taxa provide information about the condition of three different habitat
               types within a stream. Loss of sucker taxa points to a problem, such as sedimenta-
               tion, within the benthic pool environment. Loss of sunfish suggests  loss of physical
               cover and their invertebrate food in the pelagic and surface zones of pools; indeed,
               insects decline at the surface when riparian vegetation is lost. Similar information
               may be gained from the taxa richness of lithophilous spawners or nursery species.

               Among benthic invertebrates, we calculate the taxa richnesses of Ephemeroptera
               (mayflies), Plecoptera (stoneflies), and Trichoptera (caddisflies) because they too
               reflect  different types of degradation. Ephemeroptera taxa are lost when toxic
               chemicals like those from mine wastes foul a stream (see Figure 17, page 43; Hughes
               1985; Kiffney and Clements 1994). Plecoptera taxa disappear as riparian vegetation
               is lost and sediment clogs the interstitial spaces among cobbles.  Plecoptera tend to
               decline at less intense levels of human influence than Trichoptera or Ephemeroptera.
               Therefore, combining these three taxa into a single "EPT"5 metric (as in RBP III
               and others; Plafkin et al. 1989; Lenat and Penrose 1996) may obscure real differ-
               ences that could help define both  the types and sources of degradation at a site.

               Intolerant and tolerant taxa send different kinds of signals, so metrics based on
               them are best expressed in different ways. The mere
               presence of very  sensitive, or intolerant, taxa (as apparent from taxa richness) is a
               strong  indicator of good biological condition;  the relative abundance of these taxa,
               in contrast, is difficult to estimate  accurately without extensive and costly sampling
               efforts. Presence  alone of tolerant taxa, on the other hand, says little about biologi-
               cal condition since tolerant groups inhabit a wide range of places and conditions,
               but as conditions deteriorate, their relative abundance rises (see Figure 21, page 61).
               In general, we recommend that only about 10% (no fewer than 5% or more than
               15%) of taxa in a region be classed as intolerant, and about the same share as tolerant. The point of
               these metrics is to highlight the strong signal coming from presence  of the most
               intolerant or most tolerant taxa. We avoid the average tolerance value as reflected
               in biotic indexes because the strong signals  of tolerants and intolerants are
               swamped by the  remaining 70% to 90% of taxa with intermediate tolerances.
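The distinction drawn above (count taxa for intolerants, count individuals for tolerants) can be sketched in a few lines. This is a minimal illustration only: the function name, the taxon lists, and the sample are hypothetical, and real tolerance classifications are regional.

```python
from collections import Counter

def tolerance_metrics(sample, intolerant, tolerant):
    """Return (intolerant taxa richness, relative abundance of tolerants in %).

    sample: list of taxon names, one entry per individual collected.
    intolerant, tolerant: sets of taxon names (hypothetical regional lists).
    """
    counts = Counter(sample)
    # Intolerants: presence is the signal, so count taxa, not individuals.
    intolerant_richness = sum(1 for taxon in counts if taxon in intolerant)
    # Tolerants: presence says little, so use their share of all individuals.
    tolerant_individuals = sum(n for taxon, n in counts.items() if taxon in tolerant)
    percent_tolerant = 100.0 * tolerant_individuals / len(sample) if sample else 0.0
    return intolerant_richness, percent_tolerant

# Made-up sample of 10 individuals from 4 taxa; the class assignments are illustrative.
sample = ["Tubificidae"] * 6 + ["Baetis"] * 2 + ["Pteronarcys", "Rhyacophila"]
richness, pct_tolerant = tolerance_metrics(
    sample,
    intolerant={"Pteronarcys", "Rhyacophila"},
    tolerant={"Tubificidae"},
)
```

In this made-up sample, two intolerant taxa are present even though each is represented by a single individual, while the tolerant group makes up 60% of all individuals.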

               (For a more statistical rationale for choosing taxa richness and relative abundance,
               see Premise 19, page 80, and Figure 33, page 81.)

5  EPT is the sum of the mayflies (Ephemeroptera), stoneflies (Plecoptera), and caddisflies (Trichoptera) found in a
  benthic invertebrate sample.

                                                                                       55

-------
PREMISE 14
MULTIMETRIC INDEXES ARE BUILT FROM PROVEN METRICS
AND A SCORING SYSTEM
              Across taxonomic groups, many of the same biological attributes indicate human-
              induced disturbance (see pages 54-55, Premise 13; Table 6). Over the last 15 years,
              numerous studies have helped define those most broadly applicable metrics (Karr
              1981; Miller et al. 1988; Kerans and Karr 1994; Fore et al. 1996; see Barbour et al.
              1996b for summary table of metrics). After testing in a series of independent
              studies, 10 attributes of stream invertebrates and 10 to 12 attributes of stream fishes
              consistently emerge as reliable indicators of biological condition at sites influenced
              by different human activities in different geographic areas6 (Tables 7 and 8; see also
              Table 5, page 52).

              Consistently reliable metrics include the total number of taxa present in the
              sample (total taxa richness), the number of particular taxa or ecological groups
              (e.g., taxa richness of darters or mayflies), the number of intolerant taxa, and the
              percentage of all sampled individuals (relative abundance) belonging to stress-
              tolerant taxa (e.g., tubificid worms). Among fishes, a high percentage of individual
              fish with disease, fin erosion, lesions, or tumors indicates toxic chemicals in a
              stream. Increased frequency of hybrids seemed a useful metric in early IBI studies
              (Karr 1981; Karr et al.  1986), although relatively few studies since then have used it
              successfully. Increased hybridization could indicate a  loss of habitat variety and
              consequent mixing of gametes from different species spawning in a homogenized
              environment (Hubbs 1961; Greenfield et al. 1973).

              The values of metrics such as these provide the best and most complete assessment
              of a site's condition, but to compare sites and communicate their relative condi-
              tion to the widest possible audience, metric values at a site are summarized in the
              form of an aggregate index—the index of biological integrity. Because human
              actions affect biological resources in multiple ways and at multiple scales,  10 to 12
              metrics from four broad categories (see Table 4, page 48, and Table 5, page 52) are
              selected and then scored using standardized scoring criteria; these  metrics  are the
              building blocks of the multimetric index (Karr 1981; Karr et al. 1986; Karr 1991).
Metric values are scored by comparison with the value expected at a minimally disturbed site
              Because we now know a great deal about which metrics respond consistently to
              different levels of human effect, agency biologists with limited budgets do not
6  The number of metrics in the fish IBI is somewhat smaller in relatively simple systems such as cold-water streams
  (Lyons et al. 1996). Wetlands may be most appropriately assessed with multiple taxa (e.g., plants, insects, fish, birds)
  with fewer metrics for each of the taxa- or assemblage-based IBIs.
56

-------
TABLE 6. Regardless of taxon used or habitat sampled, similar metrics respond predictably (✓) to human
influence. As human influence increases, taxa richness declines, the relative abundance of generally tolerant
organisms increases, and generally sensitive taxa disappear. (Sources: see page 54, Premise 13.)
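Columns: Taxon | Habitat | Taxa richness | Relative abundance of tolerants | Number of sensitive or intolerant taxa
Rows:
Fish | River
Fish | Lake
Fish | Estuary
Periphyton | River
Benthic invertebrates | River
[The placement of check marks within the table body could not be recovered from the extracted text; two entries carry the qualifiers "(generalists)" and "(nursery specialists)".]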
               have to test all attributes to begin using a multimetric index; instead, they can take
               advantage of and build on studies that have been done before. Nevertheless,
               whenever more than five sites with different human influences can be sampled, we
               encourage testing of metric responses in particular locales to see whether the
               patterns observed in other regions can be generalized.

               Before one can build a multimetric index, one must convert metric data into a
               common scoring base. Typically, metrics are quantified with different units and
               have different absolute numerical values (e.g., numbers of taxa may range from 0
               to a few dozen; relative abundances of certain groups may range from 0% to 100%).
               Also, some metrics increase in response to human disturbance (e.g., percentage of
               omnivores) while others decrease (e.g.,  overall taxa richness). To resolve such
               differences, each metric is assigned a score based on expectations for that  metric at
               minimally disturbed site(s) for that region and stream size. Metrics that approxi-
               mate what one would expect at minimally disturbed sites are assigned a score of 5;
               those that deviate somewhat from such sites receive  a score of 3; those that deviate
               strongly are scored 1 (Karr 1981; Karr et al. 1986; Karr 1991). The final index is the
               sum of all the metrics' scores (Figure 20).
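The scoring step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the three metrics, and every cutoff value are hypothetical (the omnivore cutoffs loosely echo the 20% and 40% figures mentioned under Premise 13), and real scoring criteria depend on region and stream size.

```python
def score_metric(value, low_cut, high_cut, declines_with_disturbance):
    """Score one metric as 5, 3, or 1 against two cutoffs (low_cut < high_cut)."""
    if declines_with_disturbance:
        # e.g., taxa richness: high values resemble minimally disturbed sites.
        if value >= high_cut:
            return 5
        return 3 if value >= low_cut else 1
    # e.g., percentage of omnivores: low values resemble minimally disturbed sites.
    if value <= low_cut:
        return 5
    return 3 if value <= high_cut else 1

def multimetric_index(values, criteria):
    """Sum the 5/3/1 scores of all metrics listed in criteria."""
    return sum(score_metric(values[name], *criteria[name]) for name in criteria)

# Hypothetical criteria: (low_cut, high_cut, declines_with_disturbance).
criteria = {
    "taxa_richness": (12, 24, True),
    "intolerant_taxa": (1, 3, True),
    "percent_omnivores": (20, 40, False),
}
site = {"taxa_richness": 30, "intolerant_taxa": 2, "percent_omnivores": 45}
index = multimetric_index(site, criteria)  # 5 + 3 + 1 = 9
```

Passing the direction of response as an explicit flag puts metrics that rise with disturbance and metrics that fall with it on the same 5, 3, 1 scale, which is the point of the common scoring base.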

               In all cases, the basis for assigning scores  is "reference condition," that is, the condition
               at sites able to support and maintain a balanced, integrated, and adaptive biological
               system having the full range of elements and processes expected for a region; thus
               IBI explicitly incorporates biogeographic variation into its assessment of biological
               condition. In some regions, biologists can actually find and sample from sites that
               have not been influenced, or have been influenced only minimally, by humans; in
               other regions, where pristine sites are unavailable, biologists may have to infer
               reference condition based on knowledge of the evolutionary and biogeographic
                                                                                        57

-------
TABLE 7.  Potential metrics for benthic stream invertebrates. Metrics that responded to human-induced
disturbance as predicted are indicated by a check (✓); those marked with a dash (—) were not tested. Percent
sign (%) denotes relative abundance of individuals belonging to the listed taxon or group(s). Metrics marked
with an asterisk (*) have been included in a 10-metric multiregional B-IBI (Karr 1998; see also Table 11, page
103). Human influence in Tennessee Valley consisted primarily of mining and agriculture; in southwestern
Oregon, logging and road building; in eastern Oregon, grazing; in Puget Sound lowlands, urbanization
(measured by percentage of impervious surface); in Japan, multiple human influences; and in Wyoming,
recreation.
Metric                        | Predicted response

Taxa richness and composition
Total number of taxa*         | Decrease
Ephemeroptera taxa*           | Decrease
Plecoptera taxa*              | Decrease
Trichoptera taxa*             | Decrease
Long-lived taxa*              | Decrease
Diptera taxa                  | Decrease
Chironomidae taxa             | Increase

Tolerants and intolerants
Intolerant taxa*              | Decrease
Sediment-intolerant taxa      | Decrease
% tolerant*                   | Increase
% sediment-tolerant           | Increase
% planaria + amphipods        | Increase
% oligochaetes                | Increase
% chironomids                 | Increase
% very tolerant               | Increase
% "legless" organisms         | Increase

Feeding and other habits
% predators*                  | Decrease
% scrapers                    | Variable
% gatherers                   | Variable
% filterers                   | Variable
% omnivores                   | Increase
% shredders                   | Decrease
% mud burrowers               | Increase
"Clinger" taxa richness*      | Decrease

Population attributes
Abundance                     | Variable
Dominance*                    | Increase

[Regional columns: Tenn. Valley; SW Ore.; Eastern Ore.; Puget Sound; Japan; NW Wyo. The per-region check marks could not be aligned with their rows in the extracted text. The rows for % very tolerant, % mud burrowers, and "Clinger" taxa richness appear as not tested in every region except Japan, where each shows a check.]
Table footnote: Sediment-surface taxa richness
58

-------
TABLE 8. Metrics used in the original fish index of biological integrity (IBI) for midwestern US streams and
equivalents for more general application.
Original fish IBI                              | General fish IBI (a)
Number of fish species                         | Number of native fish species
Number of darter species                       | Number of riffle-benthic insectivores
Number of sunfish species                      | Number of water column insectivores
Number of sucker species                       | Number of pool-benthic insectivores
Number of intolerant species                   | Number of intolerant species
Relative abundance of green sunfish            | Relative abundance of individuals of tolerant species
Relative abundance of omnivores                | Relative abundance of omnivores
Relative abundance of insectivorous cyprinids  | Relative abundance of insectivores (specialized insectivores)
Relative abundance of top carnivores           | Relative abundance of top carnivores
Number of individuals                          | Not a reliable metric
Relative abundance of hybrids                  | Not often used successfully
Relative abundance of diseased individuals     | Relative abundance of diseased individuals
(a) Metrics chosen vary as a function of stream size, temperature class (warm-, cool-, cold-water), and ecological factors to
  reflect biogeographic and other patterns, including sensitivity to different human influences.
               processes operating in the region (see Premise 30, page 108). In still other cases
               (Fausch et al. 1984; Hughes 1995; Hughes et al., in press), researchers must depend
               on historical data, collected when human activity was less, to define reference
               condition.

               Simple, uniform rules for setting scoring criteria—the range of numerical values
               that qualify a metric for a score of 5, 3, or 1—are therefore difficult to define
               because they depend in part on the sampling design that generated the data. In a
               hypothetical watershed where one-third of sampled sites were pristine, one-third
               moderately disturbed, and one-third highly disturbed, one could simply divide the
               values for each metric at the thirty-third and sixty-seventh percentiles. But human
               activities tend to homogenize landscapes and living systems so that a majority of
               sites in a given watershed are likely to be moderately or even severely degraded,
               such as in the Japanese study illustrated in Figure 21. In the real world,  therefore, it
               makes sense to err on the conservative side by expanding the middle score (3) or

                                                                                            59

-------
FIGURE 20. Range and numeric values for six invertebrate metrics from a severely disturbed site
(lower Elk Creek) and a less disturbed site (East Fork Cow Creek) in southwestern Oregon. Because
the metrics have different quantitative values, they are given scores (5, 3, 1) to put them on the
same scale: 5 indicates little or no deviation from expected, or reference, condition; 3 indicates
moderate deviation from expected condition; and 1 indicates strong deviation from expected
condition. Vertical lines in the figure represent the cutoff points for assigning these metric
scores. Total benthic IBI (B-IBI) value for these two sites equals the sum of these metric scores
and five others (from Fore et al. 1996).
[Figure panels: taxa richness; Plecoptera taxa; intolerant taxa; relative abundance of tolerants;
relative abundance of dominants; abundance. Benthic IBI: 15 at lower Elk Creek, 47 at East Fork
Cow Creek.]
               even the low score (1) to include more sites rather than fewer, thus making it more
               difficult for a site to attain a high score.
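The effect of moving a cutoff can be illustrated with a small sketch. Everything here is an assumption made for illustration: the richness values are made up, the metric is assumed to decline with disturbance, and the 80th percentile is just one arbitrary way of widening the middle class. It uses Python's standard `statistics.quantiles`.

```python
import statistics

def percentile_cutoffs(values, lower_pct=33, upper_pct=67):
    """Return the values at two percentiles of the observed metric values."""
    # quantiles(..., n=100) returns the 1st through 99th percentile cut points.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]

def score(value, low_cut, high_cut):
    """Score a metric that declines with disturbance (e.g., taxa richness)."""
    if value >= high_cut:
        return 5
    return 3 if value >= low_cut else 1

# Made-up taxa richness values for 12 sites.
richness = [8, 9, 11, 12, 14, 15, 17, 19, 22, 26, 30, 34]

even_split = percentile_cutoffs(richness)                  # thirds of the sites
conservative = percentile_cutoffs(richness, upper_pct=80)  # wider middle class

fives_even = sum(score(v, *even_split) == 5 for v in richness)
fives_conservative = sum(score(v, *conservative) == 5 for v in richness)
```

Raising the upper cutoff leaves fewer sites with a score of 5, which is the conservative adjustment the text describes for data sets dominated by degraded sites.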
               Natural shifts or breaks in the distribution of metric values can guide the setting of
               scoring criteria; indeed, scoring criteria should be adjusted to fall at these points
               because the points often reflect a biological response. Where metric values increase
               or decrease linearly across the gradient of human influence (Figure 21, top), as in
               total taxa richness, the range of values is typically divided into three equal parts, each
               defining the criteria for assigning a score of 1, 3, or 5. Other metrics, such as
               relative abundance of tolerant organisms or particular trophic groups, respond in a
               more skewed pattern (Figure 21, bottom; Figure 22); for these metrics, natural
               break points suggest setting scoring criteria in unequal divisions. Setting scoring
               criteria is  an iterative process and should be revisited as regional databases and
               biological knowledge expand.
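For a metric that changes steadily across the gradient, the equal-division rule reduces to simple arithmetic. The sketch below uses the maximum of 36 taxa cited in the caption of Figure 21; the function name is hypothetical, and skewed metrics would need cutoffs set by inspection of natural breaks rather than by this rule.

```python
def trisect_range(max_value, min_value=0.0):
    """Split the range [min_value, max_value] into equal thirds; return the two cutoffs."""
    step = (max_value - min_value) / 3
    return min_value + step, min_value + 2 * step

low_cut, high_cut = trisect_range(36)  # cutoffs at 12 and 24 taxa
```

With these cutoffs, the three score classes each span 12 taxa.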
60

-------
FIGURE 21. Plots of two sample metrics showing different ways to set the criteria for assigning
metric scores of 1, 3, and 5. For metrics with a monotonic, or linear, distribution (e.g., total
taxa richness: top), one divides into equal thirds the range from 0 to the highest value (here 36).
For metrics that are not distributed monotonically, one uses natural breaks in the distribution
to define score boundaries (shown in the bottom plot by vertical dotted lines). Metric values and
classification scheme for human influence come from Rossano (1995) (see also Figure 3, page 23,
and Figure 4, page 31).
[Figure: two plots against human influence (x-axis, starting at "Low"); top plot y-axis, taxa
richness (0 to 40); bottom plot y-axis scaled 0 to 100.]
                                                                                            61

-------
FIGURE 22. Relative abundance (percentage of sediment-tolerant individuals) and taxa richness
(number of taxa) plotted against the rank order of that metric value for 86 stream sites sampled
in southwestern Oregon. Dotted vertical lines mark the range of values (scoring criteria) for
scoring metrics as 5, 3, or 1. Most sites have near 0% sediment-tolerant individuals; only very
degraded sites show higher values of this metric. In other words, the distribution pattern for
this metric is skewed. Taxa richness, in contrast, is less skewed. Scoring criteria are divided
into unequal divisions for skewed metrics, reflecting a biological response in the data (top);
the divisions are more equal for unskewed metrics (bottom). In both cases, most sites receive a
score of 3, the most conservative interpretation of condition.
[Figure: two plots of site rank (y-axis, 0 to 80); top plot x-axis, sediment tolerants (relative
abundance, %); bottom plot x-axis, number of taxa.]
62

-------
PREMISE 15
THE STATISTICAL PROPERTIES OF MULTIMETRIC INDEXES
ARE KNOWN
             Multimetric indexes are statistically versatile. We can use familiar statistical tests,
             such as t-tests or analysis of variance (ANOVA), to look for significant differences
             in index values because IBI satisfies the model's assumptions (Fore et al. 1994). In
             addition, because IBI is a single integrating number, it serves as a yardstick to rank
             (compare) sites according to their relative condition. Finally, from statistical
             power analysis, we know that an IBI formulated and developed as we propose can
             detect six distinct categories of resource condition (Fore et al. 1994; Doberstein,
             Karr, and Conquest, in prep.). Because we know the statistical precision of a given
             IBI, we can use IBIs to discover and define differences among sites caused by
             changes through time or space.
             Using bootstrap7 analysis of fish data from Ohio, we determined that the distribu-
             tion of IBI at one stream site is unimodal (Figure 23); integrating metric scores
             into a multimetric index thus allows us to take advantage of properties of the
             mean. Integration can be done by summing or averaging the metric scores; the
             results are equivalent. For the fish IBI, averaging metric scores reduced the
             variance and increased precision (Fore et al. 1994). The values for multimetric
             indexes approximate a normal distribution (Fore et  al. 1994), probably because
             averages tend to be distributed normally by the central limit theorem (Casella
             and Berger 1990); consequently, multimetric indexes can be tested with familiar
             statistics such as ANOVA or regression.
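The bootstrap procedure (random sampling with replacement until the resample is as large as the original sample, repeated many times) can be sketched as below. This is an illustration under loud assumptions: `toy_index` is a hypothetical two-metric index with made-up cutoffs, not the fish IBI analyzed by Fore et al. (1994), and the sample of 40 individuals is invented.

```python
import random

def bootstrap_replicates(individuals, index_fn, n_boot=1000, seed=1):
    """Resample individuals with replacement and recompute the index each time."""
    rng = random.Random(seed)
    size = len(individuals)
    return [
        index_fn([rng.choice(individuals) for _ in range(size)])
        for _ in range(n_boot)
    ]

def toy_index(sample):
    """Hypothetical two-metric index (NOT the published fish IBI)."""
    richness = len(set(sample))
    pct_tolerant = 100.0 * sum(1 for s in sample if s == "tolerant_sp") / len(sample)
    richness_score = 5 if richness >= 5 else 3 if richness >= 3 else 1
    tolerant_score = 5 if pct_tolerant <= 20 else 3 if pct_tolerant <= 40 else 1
    return richness_score + tolerant_score

# Made-up sample: 40 individuals from 5 species.
sample = ["tolerant_sp"] * 12 + ["sp_a"] * 10 + ["sp_b"] * 8 + ["sp_c"] * 6 + ["sp_d"] * 4

replicates = sorted(bootstrap_replicates(sample, toy_index))
# Percentile interval covering roughly the central 95% of bootstrap values.
lower = replicates[int(0.025 * len(replicates))]
upper = replicates[int(0.975 * len(replicates)) - 1]
```

The spread of `replicates` approximates the sampling distribution of the index at one site, and the interval from `lower` to `upper` plays the role of the confidence interval drawn below each x-axis in Figure 23.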
             The IBI distribution satisfies the assumptions of ANOVA, even though its shape, a
             strong unimodal peak with no tails (expected given the way scores are calculated), is
             not strictly normal (see Figure 23). These assumptions are: (1) the error term is
             unbiased; (2) measurement error is not correlated among sites; (3) variance is
             homogeneous; and (4) the distribution of the error term is normal (assumed only for
             hypothesis testing).
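As an illustration of testing index values with ANOVA, the sketch below computes a one-way F statistic by hand for three sites visited three times each. The IBI values are made up, and the function name is hypothetical; judging significance would additionally require the F distribution, from a table or a statistics library.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on lists of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of site means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of repeat visits around their own site mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up IBI values: three sites, three visits each.
ibi_by_site = [[44, 46, 48], [36, 38, 40], [28, 30, 32]]
f_stat, df_between, df_within = one_way_anova_f(ibi_by_site)  # F = 48.0, df = (2, 6)
```

Here the site means differ by 8 points while repeat visits vary by only 2 points around each mean, so the between-site variance dominates.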
             Some regulatory situations  require statistical evidence that a significant change
             has occurred in the field. The statistical properties of IBI make it an appropriate
             choice for these situations.  In reality, however, management decisions are rarely
             based on the outcome of a statistical test or its associated p-value. Often, sites
7  The bootstrap algorithm creates new samples by randomly selecting and replacing elements from the original sample.
Random sampling with replacement continues until the bootstrap sample contains the same number of elements as
the original sample. Many such samples are generated to approximate the distribution of IBI at a site.

                                                                                    63
Integrating metric scores into a multimetric index allows us to take advantage of properties of the mean

-------
               [Figure: four panels, one per site, showing the distribution of bootstrapped IBI values;
               x-axis, fish IBI; y-axis ticks at 100, 200, and 300.]

               FIGURE 23. Distribution of fish IBI values from bootstrapping analysis for four typical
               stream sites in Ohio; the unimodal distributions approximate a normal distribution. The
               line below each x-axis marks the 95% confidence interval (< 8). A difference of ± 4 points
               in IBI values therefore represents a statistically significant change in biological condition
               (Fore et al. 1994).
64

-------
FIGURE 24. Power curves for the fish IBI estimated from nine locations sampled three times by
the Ohio EPA (from Fore et al. 1994). Actual points are shown only for α = 0.05; other values of
α are pictured as smoothed lines. For 80% power (a value accepted by most researchers), IBI can
reliably detect a difference of about 8 points at an α-level of 0.05 (projected onto the x-axis,
as indicated by dashed lines). Total IBI can range from 12 to 60, a difference of 48; thus IBI
can detect six non-overlapping categories of biological condition.
[Figure: power (y-axis, 20 to 100) plotted against difference in mean IBI (x-axis, ticks from 10
to 25); one curve is labeled α = 0.1.]
               within an area need to be ranked so that funds for restoration can be allocated, or
               policies to determine human use can be evaluated. Managers and policymakers
               therefore need to know something about the magnitude of differences across sites
               and, most important, whether observed differences are biologically meaningful.
               Without this kind of information, they cannot ascertain the causes of those differ-
               ences.

               A multimetric index provides a yardstick for measuring and communicating the
               biological condition of sites, but how many tick marks are on the yardstick? In
               other words, what is the precision of the index? On the basis of a statistical power
               analysis of fish data from Ohio EPA, IBI can detect six distinct categories of
               biological condition (Figure 24). Ohio EPA's version of IBI, like the original IBI,
               ranges from 12 to 60. For this index, 95% of the variability in IBIs generated by the
               bootstrap procedure fell within ± 4 points of the observed IBI (Fore et al. 1994).
               These results confirmed previous estimates of confidence intervals based on field
               observations through time (Angermeier and Karr 1986; Karr et al. 1987).
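The count of "tick marks" follows directly from the numbers reported above: an index range of 12 to 60 and a detectable difference of about 8 points (± 4 around an observed value). A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def distinguishable_categories(index_min, index_max, detectable_difference):
    """Number of non-overlapping intervals of the given width that fit in the index range."""
    return int((index_max - index_min) // detectable_difference)

categories = distinguishable_categories(12, 60, 8)  # (60 - 12) / 8 = 6
```

An index with a different range or a different detectable difference would yield a different number of categories, so the figure of six applies to this version of IBI only.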
                                                                                       65

-------
PREMISE 16
MULTIMETRIC INDEXES REFLECT BIOLOGICAL RESPONSES
TO HUMAN ACTIVITIES
Because IBI can detect many influences in time and space, it is an ideal tool for judging the effectiveness of management decisions
Human activities degrade water resources by altering one or more of five principal
groups of attributes—water quality, habitat structure, flow regime, energy source,
and biological interactions—often  through undetected yet potentially devastating
effects on water resources (Table 9; Karr 1991, 1995b). Human activities such as
logging, agriculture, and urbanization affect water quality by introducing sediment
and raising water temperature (Bisson et al. 1992; Megahan et al. 1992; Gregory
and Bisson 1997; Williams et al. 1997). Habitat structure changes when large
woody debris is removed from a channel, or when sediment fills the spaces among
cobbles. When vegetation is removed from a watershed, streams and rivers flood
more heavily and more often, or they may dry up entirely. Logging of riparian
areas also alters the energy sources in a stream: removing riparian vegetation
removes one source of allochthonous organic material, disrupts entry of large
woody debris to the channel, and also increases light reaching the stream, which in
turn increases water temperature and algal growth and thus the algal material
available to fish and invertebrates. Overfishing and introducing alien species,
including native fish raised in hatcheries, alter relationships among predators and
prey or competitors. As these changes stress the normal assemblage of stream
organisms, they degrade the stream.
Because multimetric indexes are sensitive to these five factors, they quantify the
biological effects of a broad array of human activities. The effects of logging were
generally reflected in benthic IBIs  from southwestern Oregon (Figure 25), even
though logging was quantified simply as the percentage of total watershed area that
was logged (Fore et al. 1996). Secondary influences on B-IBIs in these watersheds
included road density and location. In east-central Illinois (Karr et al. 1986), fish
IBIs revealed the influences of agriculture: IBIs were lowest at sites where cultiva-
tion reached  streamside, and stream channels had been dredged and straightened;
IBIs were higher downstream, where the riparian area was left either as pasture or
forest, and the stream channel was intact (Figure 26). In the Pacific Northwest,
urbanization generally produces lower IBIs than logging (Kleindl 1995; Fore et al.
1996).

Multimetric indexes can reflect changes in resident biological assemblages caused by
single point sources in one river or stream as well as differences over a wide geo-
graphic area. For example, taxa richness of mayflies, stoneflies, and caddisflies (see
Figure 17, page 43), as well as overall B-IBI (Figure 27), fell sharply immediately
  66

-------
TABLE  9. Five attributes of water resources altered by the cumulative effects of human activity, with examples
of degradation in Pacific Northwest watersheds (from Karr 1995b).
Attribute: Water quality
  Components: Temperature; turbidity; dissolved oxygen; acidity; alkalinity; organic and
    inorganic chemicals; heavy metals; toxic substances
  Degradation in Pacific Northwest watersheds:
    Increased temperature and turbidity
    Oxygen depletion
    Chemical contaminants

Attribute: Habitat structure
  Components: Substrate type; water depth and current speed; spatial and temporal complexity
    of physical habitat
  Degradation in Pacific Northwest watersheds:
    Sedimentation and loss of spawning gravel
    Obstructions interfering with movement of adult and juvenile salmonids
    Lack of coarse woody debris
    Destruction of riparian vegetation and overhanging banks
    Lack of deep pools
    Altered abundance and distribution of constrained and unconstrained channel reaches

Attribute: Flow regime
  Components: Water volume; flow timing
  Degradation in Pacific Northwest watersheds:
    Lower low flows and higher high flows limiting survival of salmon and other aquatic
      organisms at various phases in their life cycles

Attribute: Food (energy) source
  Components: Type, amount, and size of organic particles entering stream; seasonal pattern
    of energy availability
  Degradation in Pacific Northwest watersheds:
    Altered supply of organic material from riparian corridor
    Reduced or unavailable nutrients from carcasses of adult salmon and lampreys after spawning

Attribute: Biotic interactions
  Components: Competition; predation; disease; parasitism; mutualism
  Degradation in Pacific Northwest watersheds:
    Increased predation on young by native and alien species
    Overharvest by sport and commercial fishers
    Genetic swamping by hatchery fish of low fitness
    Alien diseases and parasites from aquaculture, including hatcheries
                downstream of a streamside sludge pond on the North Fork Holston River in
                Tennessee (Kerans and Karr 1994). Across six midwestern regions or watersheds
                with different degrees of land development, fish IBIs differed markedly (Figure 28;
                Karr et al. 1986). Yet despite their different fish faunas, one can compare the
                condition of these regions on a single quantitative scale.
                                                                                             67

-------
FIGURE 25. Benthic IBI values plotted against the percentage of area logged in watersheds in
southwestern Oregon in 1990. Percentage of watershed area logged alone is an incomplete measure
of human influence because information about type of logging, time since logging, or location
and type of roads is not included. Nevertheless, B-IBI clearly distinguishes the best available
(+) from the degraded (-) sites.
[Figure: benthic IBI (y-axis, 10 to 50) plotted against area logged (%) (x-axis, 0 to 80).]
FIGURE 26. Fish IBI values for Jordan Creek, a first- to third-order stream in east-central
Illinois (from Karr et al. 1986). Higher values represent changes in the fish assemblage that
reflect improved biological conditions from stations 1 through 4.
[Plot not reproduced. Horizontal axis: Station (1b, 1c, 1e, 2a, 2b, 2d, 3a, 3d, 3e, 4a, 4b, 4c,
4d, 4e). Vertical axis: fish IBI, 30 to 60, with condition classes Poor, Fair, Good, and Excellent.]
FIGURE 27. Median B-IBI values for the North Fork Holston River in the Tennessee Valley from
1973 to 1976 (from Kerans and Karr 1994). The arrow marks the location of a streamside sludge
pond. (Compare Figure 12, page 41.)
[Plot not reproduced. Horizontal axis: Distance from mouth (km), 160 to 0. Vertical axis: median
B-IBI, 25 to 55. Separate series for 1973, 1974, 1975, and 1976.]

-------
               Because IBI can detect many influences, both in time and space, it is an ideal tool
               for evaluating the efficacy of management decisions. Along the Scioto River, Ohio,
               for example, fish IBI values for data collected in 1979 paralleled degradation
               resulting from regional habitat deterioration and wastewater effluent. By 1991,
               improvements in effluent treatment processes had substantially raised IBI (Figure
               29); in this  case, the benefits of management can be seen as increased IBI. Manage-
               ment actions may also decrease IBI. A local effort to stabilize the channel up-
               stream of a woodlot in Indiana resulted in substantial sediment transport into the
               woodlot reach of the stream and a sharp decline in IBI (Figure 30). The graphs of
               IBI values from these places can be quickly interpreted by policymakers and
               concerned citizens as well as research biologists.
FIGURE 28. Distribution of sites in six midwestern regions or watersheds according to biological
condition. The fish IBI was used to distinguish six categories of condition: NF, no fish; VP,
very poor; P, poor; F, fair; G, good; and E, excellent. The IBI values varied across the six
regions depending on the type and intensity of human land use (from Karr et al. 1986).
[Panels not reproduced. Horizontal axis: Condition from IBI (NF, VP, P, F, G, E). Labels legible
in the original: Mean; Arkansas; Red River; Raisin River (n = 139); Salt Creek (n = 125);
Chicago (n = 87). Sample sizes n = 37 and n = 12 are also legible but cannot be matched to a
panel with confidence.]

-------
FIGURE 29. Fish IBI values along the Scioto River, Ohio (from Karr 1991). The lower IBIs reflect
degradation associated with combined-sewer overflow (CSO) and wastewater treatment plants
(WWTP). Improvements in effluent treatment, reflected in an overall increase in IBIs from 1979
to 1991, brought most of the sites into compliance for warm-water habitat (WWH); some sites
even scored as excellent warm-water habitat (EWH).
[Plot not reproduced. Horizontal axis: River mile, 135 to 95. Vertical axis: fish IBI, 10 to 60.
Labels legible in the original: CSO, WWTP, 1991, EWH.]
FIGURE 30. Changes in fish IBI values over time in Wertz Drain in Wertz Woods, Allen County,
Indiana. During 1974-76, Wertz Drain had relatively high IBI values for a first-order stream in
an area of intensive agriculture. The channel was sinuous, pools and riffles were well
developed, and there were trees shading the channel. Although this site was not intentionally
modified, a poorly executed bank stabilization project upstream during 1976 transported sediment
to the site. Consequently, habitat quality deteriorated, as did the resident fish community.
IBIs clearly trace the decline and slow improvement in stream condition over time.
[Plot not reproduced. Horizontal axis: Year, 1974 to 1982. Vertical axis: fish IBI, 20 to 60;
a "Very poor" condition class is labeled.]

-------
PREMISE 17
HOW BIOLOGY AND STATISTICS ARE USED IS MORE IMPORTANT THAN TAXON

In many circumstances, the redundancy that comes from sampling more than one assemblage permits
better diagnosis of degradation

The taxonomic group most appropriate for assessing environmental condition
depends on the region to be assessed; agency resources; special staff expertise; and,
most important, how biological knowledge is applied in designing sampling and
analysis protocols (Karr 1991). Of the 47 states with bioassessment programs in
place, 20 use fish, 44 use benthic invertebrates, and 4 use algae (periphyton or
diatoms) (Davis et al. 1996). Twenty-six states use more than one major group, such
as fish as well as invertebrates. No one taxon is correct or incorrect in a monitoring
program. Like using 10 to 12 IBI metrics, sampling more than one taxon creates
some redundancy. But in many circumstances, that redundancy pays off by sub-
stantially improving one's ability to diagnose the causes of degradation, causes that
may be apparent only if more than one assemblage is sampled (e.g., fish and
invertebrates, fish and algae).

In the Pacific Northwest,  benthic invertebrates have some advantages over fish as
the  primary subjects for biological monitoring (Fore et al.  1996). Macroinvertebrate
taxa are numerous, ubiquitous, abundant, and relatively easy to sample; their
responses to a wide spectrum of human activities are relatively easy to  interpret.
Moreover, because the life cycles of some benthic invertebrates extend several
years, they are excellent integrators of past human influences. But fish also have
advantages. Taxa such as sculpins, cyprinids, and suckers are often well represented
in numbers of species and individuals in Pacific Northwest streams.  Broadly
ranging species such as anadromous salmonids offer a tool for monitoring large
landscapes and the effects of harvest, hatcheries,  and barriers to migration (R. M.
Hughes, pers. commun.). Some biologists recommend including more than one
vertebrate class (e.g.,  fish and amphibians) in any IBI based on vertebrates (e.g.,
Peter Moyle, cited in Miller et al. 1988; Hughes et al., in press).

Convenience, money, time, or place will also affect the choice of taxon to sample.
Chosen taxa should be cost effective to collect and identify. Most fish  (exceptions
include some sculpins, minnows, and darters) can be identified at once in the field.
More equipment may be required for fish (e.g., electrofishing gear) than for inver-
tebrates, although both require more-complex equipment in deep-water environ-
ments. Permit requirements, too, may be more complicated for sampling fish than
benthic invertebrates or algae. Insects and diatoms, on the other hand, are easier to
sample in the field but more difficult and time-consuming to identify  in the
laboratory.

-------
               Watershed size and location can affect the consistency of results obtained using
               different taxa. Fish- and invertebrate-based assessments may disagree, depending
               on river size or region. In large watersheds (> 500 mi²), for example, fish and
               benthic IBIs ranked sites the same only 44% of the time (Yoder and Rankin 1995a).
               The two kinds of IBIs gave the same results 65% of the time for midsize streams
               and rivers (50 to 500 mi²) and 75% of the time for small streams (Yoder and Rankin
               1995a). According to R. M. Hughes (pers. commun.), species richnesses of fish and
               invertebrates rarely agree for Appalachian streams and New England lakes. A high-
               priority challenge is  to determine if these apparent inconsistencies reflect real
               differences in the sensitivity of the different assemblages or if they result from
               differences in sampling effectiveness for fish and invertebrates as a function of
               water body size.
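
               The kind of agreement tally reported above can be sketched in a few lines of
               Python. This is an illustration only, not Yoder and Rankin's analysis: it
               assumes each site already has a condition class from both the fish IBI and
               the benthic IBI, counts "agreement" as identical classes, and uses made-up
               site records. The function names and the choice to treat everything below
               50 mi² as "small" are assumptions.

```python
from collections import defaultdict


def size_class(area_mi2):
    """Assign a watershed size class from drainage area in square miles.

    The 50 and 500 mi2 breakpoints follow the text; treating everything
    below 50 mi2 as "small" is an assumption of this sketch.
    """
    if area_mi2 > 500:
        return "large"
    if area_mi2 >= 50:
        return "midsize"
    return "small"


def agreement_by_size(sites):
    """Return {size class: fraction of sites where both indexes give the same class}."""
    same = defaultdict(int)
    total = defaultdict(int)
    for area_mi2, fish_class, benthic_class in sites:
        key = size_class(area_mi2)
        total[key] += 1
        if fish_class == benthic_class:
            same[key] += 1
    return {key: same[key] / total[key] for key in total}


# Hypothetical records: (drainage area in mi2, fish IBI class, benthic IBI class)
example_sites = [
    (820, "good", "fair"),
    (640, "fair", "fair"),
    (12, "good", "good"),
    (30, "poor", "poor"),
    (8, "fair", "good"),
    (45, "excellent", "excellent"),
]

rates = agreement_by_size(example_sites)
for key, rate in sorted(rates.items()):
    print(f"{key}: {rate:.0%} agreement")
```

               Tabulating agreement separately by size class, rather than pooling all sites,
               is what lets a pattern like the one in the text (lower agreement in large
               watersheds) show up at all.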

               Finally, one has to be careful that taxa chosen for biological monitoring reflect real
               changes in the local  and upstream landscape. The absence of anadromous fishes
               may not indicate that a site is in poor condition; a natural waterfall may simply be
               blocking fish passage, or their absence may reflect ocean conditions or overharvest
               rather than site condition. Migratory birds or fishes inhabiting estuaries or the
               ocean for part of their life cycles may be affected more by conditions elsewhere
               than by those in the monitored streams. Indeed,  landscape-level factors may well
               have more effect on local and regional biological integrity than do traditionally
               monitored alterations in physical or chemical habitat (Richards et al. 1996, 1997;
               Roth et al. 1996; Allan et al. 1997; Wang et al. 1997; Hughes et al., in press).
               Species listed as threatened or endangered under the Endangered Species Act
               reflect landscape conditions well, and including them in an IBI may even improve
               management of these species by putting them squarely into their larger biological
               context (Karr 1994).

               In short, different taxa have  different advantages for different places. As for all
               aspects of designing  a biological monitoring program, researchers need to tease out
               the patterns of response among taxa from artifacts of defining reference condition
               or of sampling itself; they need to consider carefully how different taxa might
               permit a better diagnosis of the causes of degradation in different geographic areas
               and situations. The most accurate assessments of biological condition may well
               come from determining biological condition using IBIs based on more than one
               assemblage.

-------
PREMISE 18
SAMPLING PROTOCOLS ARE WELL DEFINED FOR FISHES AND INVERTEBRATES

One sampling method doesn't fit all, but sampling must be standardized

The utility of any measure of biological condition in a stream depends on how
accurately the original sample represents the fauna present in that stream—that is,
how successful it is in avoiding statistical "bias." Indeed, a fundamental assump-
tion of the fish IBI is that the sample on which it is based reflects the taxa richness
and relative abundances of the stream's fauna, without bias toward taxa or size (Karr
et al. 1986). Implicit in this assumption is that sampling effort is standardized. Any
fish sampling protocol must therefore be consistent, comprehensive, and representa-
tive of the stream's microhabitats, including pools, riffles, margins, and side
channels. Many researchers during the last 15 years have helped to refine the
protocols for sampling fish to evaluate or implement an IBI (Ohio EPA 1988; Lyons
1992a,b; Lyons et al.  1995; Lyons et al. 1996). Other protocols for sampling fish
and invertebrates have also been described, although their goals  and applications
vary somewhat from development of an IBI [Klemm et al. 1990, 1993, for
USEPA's Environmental Monitoring and Assessment Program (EMAP); Cuffney et
al. 1993 and Meador et al. 1993 for US Geological Survey's National Water Quality
Assessment (NAWQA)].
Early work on the fish IBI identified sampling gear, the range of microhabitats in  a
stream, and stream size as important factors affecting sampling accuracy (Karr et al.
1986; Ohio EPA 1988). These researchers showed that, with standard procedures,  it
is feasible to sample virtually all fish from all microhabitats in small- to medium-
size streams. Boat-mounted electrofishing gear is the most effective and most
efficient in the widest variety of stream types. Early work by Angermeier and Karr
(1986) suggested that fully sampling from two entire meanders typically captures
the variety of stream microhabitats, yielding enough individual fish to calculate
taxa richness and relative abundances for IBI metrics. More recent work in several
geographic areas suggests about 40 channel widths as the appropriate length of
sampling efforts (Lyons 1992b; Paller 1995a,b; Angermeier and Smogor 1995). In
relatively homogeneous systems (e.g., low-gradient streams), longer distances may
be needed (Angermeier and Smogor 1995).
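
As a rough illustration of the 40-channel-widths guideline, the Python sketch below converts a few field measurements of channel width into a reach length. Averaging several width measurements and the function name are assumptions made for this example; the cited studies should be consulted for how width was actually measured.

```python
import statistics


def reach_length_m(channel_widths_m, multiplier=40):
    """Reach length as a multiple of the mean channel width (all values in meters)."""
    return multiplier * statistics.fmean(channel_widths_m)


# Hypothetical width measurements (m) taken at three transects
widths = [3.0, 4.0, 5.0]
print(f"Sample about {reach_length_m(widths):.0f} m of stream")
```

The multiplier is left as a parameter because, as noted above, relatively homogeneous systems may call for longer distances.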

Large rivers, lakes, reservoirs, and coastal and estuarine environments contain a
diversity of habitats. No single sampling method is appropriate to every one of
those habitats, yet using multiple sampling methods is difficult,  expensive, and
thus impractical. As a result, selective sampling protocols, which measure biologi-
cal condition based on one or a few local microhabitats, have been developed for


-------
               these systems (Thoma 1990; Weaver et al. 1993; Jennings et al. 1995; Deegan et al.
               1997; Whittier et al. 1997b; Whittier 1998).
               Invertebrates, such as benthic insects, pose different sampling challenges: more
               species to deal with than among fishes, more microhabitats, more sampling tech-
               niques and protocols appropriate for the variety of microhabitats. Therefore,  one
               must either use many different protocols to get a representative invertebrate
               sample or first test whether sampling from a single microhabitat accurately repre-
               sents stream condition. In their study of streams in the Tennessee Valley, Kerans et
               al. (1992) sampled invertebrates from pools (Hess sampler) and riffles (Surber
               sampler) and evaluated 18 invertebrate attributes as indicators of human influence.
               They concluded that monitoring designs "that quantitatively sample multiple
               habitats, are spatially replicated, and use many different attributes for assessment
               provide a good method for determining biological condition" (Kerans et al. 1992:
               388). Although a number of invertebrate attributes behaved similarly for pools and
               riffles, others (e.g., mayfly taxa richness, caddisfly taxa richness) matched expected
               stream health rankings better for pools than for riffles. When the researchers
               combined metrics to create a B-IBI, patterns were stronger for pools than for
               riffles. Rankings were not always consistent for pool and riffle data (Kerans and
               Karr 1994), perhaps because these studies were done in relatively large rivers with
               substantial sedimentation, which  might be detected more readily in pool environ-
               ments (B. L. Kerans, pers. commun.).
               Debate still rages over whether single- or multiple-habitat sampling is best with
               invertebrates. Some contend that a single habitat is adequate; others insist that
               sampling multiple habitats is essential. Our experience suggests that sampling a
               single habitat is appropriate and adequate, although our reasons for this conclusion
               do not always agree with others'. Sampling riffles, for example, is often justified on
               the grounds that riffles are the most diverse, the most productive, or the dominant
               habitat (Plafkin et al.  1989; Barbour et al. 1996b; Barbour et al., in press). We are
               not convinced that these claims are true or even at issue. Still, because we  have
               successfully and cost-effectively used single-habitat samples to discern human
               effects on small streams (Kerans et al.  1992; Kerans and Karr 1994; Kleindl 1995;
               Rossano 1995, 1996; Patterson 1996),  we recommend a single-habitat sampling
               protocol that concentrates on riffles.
               Because a Surber sampler samples only part of a riffle, a single sample may not be
               precise enough to judge stream condition. We therefore tested the effects of
               replicate sampling of invertebrates, using data from the John Day River basin of
               north-central Oregon (Fore and Karr, unpubl. manuscript). Five replicates were
               collected, and their contents were identified for each of seven sites (Tait et al.
               1994). Using a bootstrap resampling algorithm, Fore and Karr simulated the effects
               of taking one, three, or five replicates at a site. Fore and Karr changed the number
               of replicates for each site to test whether metric precision varied as a function of
               the number of replicates (Figure 31). With only one  replicate, a metric could either
               increase or decrease depending on which of the five  replicates was chosen by the
               bootstrap algorithm. In practice, therefore, the numerical value of a metric calcu-
               lated using a single Surber sample at a site would depend on where in the riffle that


-------
FIGURE 31. Results of bootstrapping analysis (random sampling with replacement) of the relative
abundance (percentage) of predators for seven stream sites along a gradient of grazing intensity
in the John Day Basin, Oregon. For each site, one, three, or five replicates were randomly
selected, and best-fit regression lines (100 in each graph) were plotted. The lines in the upper
graph are based on means for one replicate (out of five possible) per site; in the middle, for
three replicates per site; in the bottom graph, for five replicates per site. Precision
increases with number of replicates, especially between one and three replicates; in fact, the
relationship between site condition and proportion of predators may appear either negative or
positive with only one replicate. Note, however, that precision increases relatively little
from three to five replicates. The lower two graphs clearly show that the relative abundance of
predators increases as resource condition improves.
[Graphs not reproduced. Three panels (one, three, and five replicates per site). Horizontal
axis: Site condition, Poor to Good. Vertical axis: relative abundance of predators (%).]
               sample had been taken. When the mean of three replicates is plotted, however, the
               relationship between metric scores and human influence is more consistent (see
               Figure 31). Metric precision increases little if five replicates are collected instead of
               three. Thus we conclude that the increased costs of sample collection and analysis
               for three replicates over one are justified, but not those for five replicates.
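
               The resampling exercise described above can be sketched as follows. This is
               not Fore and Karr's code: the data are invented, the site-condition scores
               1 to 7 are placeholders for the grazing gradient, and an ordinary
               least-squares line stands in for whatever fitting routine they used. For
               each bootstrap run, k replicates are drawn with replacement at every site,
               averaged, and regressed on site condition; the spread of the resulting
               slopes indicates how precise the metric is for that k.

```python
import random
import statistics


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_slopes(site_replicates, condition, k, n_boot=100, seed=0):
    """Fit n_boot regression lines, each from site means of k resampled replicates."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_boot):
        # Draw k replicates per site with replacement and average them.
        means = [statistics.fmean(rng.choices(reps, k=k)) for reps in site_replicates]
        slopes.append(slope(condition, means))
    return slopes


# Hypothetical data: percent predators in 5 replicates at each of 7 sites,
# ordered from poorest (1) to best (7) site condition.
condition = [1, 2, 3, 4, 5, 6, 7]
site_replicates = [
    [1.0, 4.5, 0.5, 3.0, 2.0],
    [2.5, 0.5, 5.0, 3.5, 1.5],
    [3.0, 6.0, 1.0, 4.0, 2.5],
    [5.5, 2.0, 4.0, 7.0, 3.5],
    [4.0, 8.0, 5.0, 2.5, 6.5],
    [7.0, 4.5, 9.0, 5.5, 6.0],
    [6.0, 9.5, 7.5, 5.0, 8.0],
]

for k in (1, 3, 5):
    slopes = bootstrap_slopes(site_replicates, condition, k)
    negative = sum(s < 0 for s in slopes)
    print(f"{k} replicate(s): slope spread (SD) = {statistics.stdev(slopes):.3f}, "
          f"negative slopes = {negative} of {len(slopes)}")
```

               Because the slope is a weighted sum of site means, its standard deviation
               shrinks roughly with the square root of k, which is why the gain from one to
               three replicates is larger than the gain from three to five.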
               For invertebrates, therefore, we recommend a standard sampling area of
               approximately 0.1 m² (0.3 m by 0.3 m Surber sampler frame) and three replicate samples
               for each site. We also recommend collecting from riffles for three reasons: (1) riffles
               are easier to define and identify by field crews than are pools or margins; (2) riffles
               are more uniform than other stream microenvironments and thus easier to com-
               pare across watersheds; and (3) riffles are shallow, and the  current through them is


-------
               fast, making sampling with kicknets or Surber samplers easier. We also take all
               replicates in a single riffle; this strategy characterizes one site more fully than does
               the alternative of sampling once in each of several riffles, as some protocols pro-
               pose (e.g., EMAP; R. M. Hughes, pers. commun.).

               It is especially important to collect and count a sufficient number of insects to
               characterize the biota in multiple dimensions. If sampling fails to yield a total of
               500 or more organisms (for example, in regions where natural invertebrate densities
               are low), the number of replicates or the sampled area may need to be increased.
               We believe that sampling enough organisms is far more important than how
               sampling is organized (e.g., single or multiple riffles, composite samples, or no
               composite  samples). Subsampling that counts only 100, 200, or even 300 organ-
               isms, as recommended by RBP and some other protocols, tends to reduce the
               utility of many metrics that have become standard in multimetric assessments
               (Doberstein, Karr, and Conquest, in prep.; see Premise 28, page 101).
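
               One way to see why small fixed-count subsamples weaken richness metrics is to
               simulate them. The Python sketch below is a hypothetical illustration, not
               the analysis cited above: the taxon labels and counts are invented, and
               subsampling is modeled as a simple random draw of individuals without
               replacement. It also includes a check against the 500-organism minimum
               suggested in the text.

```python
import random
import statistics

MIN_ORGANISMS = 500  # minimum total count suggested in the text


def needs_more_sampling(counts):
    """True if the sample holds fewer organisms than the suggested minimum."""
    return sum(counts.values()) < MIN_ORGANISMS


def subsample_richness(counts, n, rng):
    """Taxa richness in a random subsample of n individuals (without replacement)."""
    individuals = [taxon for taxon, c in counts.items() for _ in range(c)]
    n = min(n, len(individuals))
    return len(set(rng.sample(individuals, n)))


def mean_richness(counts, n, trials=200, seed=0):
    """Average richness over repeated random subsamples of size n."""
    rng = random.Random(seed)
    return statistics.fmean([subsample_richness(counts, n, rng) for _ in range(trials)])


# Hypothetical sample: 500 organisms in 10 taxa, a few common and several rare
sample_counts = {
    "taxon_01": 180, "taxon_02": 120, "taxon_03": 100, "taxon_04": 40,
    "taxon_05": 30, "taxon_06": 15, "taxon_07": 8, "taxon_08": 4,
    "taxon_09": 2, "taxon_10": 1,
}

print("More sampling needed:", needs_more_sampling(sample_counts))
for n in (100, 200, 300, 500):
    print(f"subsample of {n}: mean taxa richness = {mean_richness(sample_counts, n):.1f}")
```

               In a sample like this one, the rare taxa are the ones most often missed by a
               100- or 200-organism subsample, so richness metrics are the first to suffer.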

               It is probably not always necessary to identify insects to species; strong patterns
               emerge from samples where most insects are identified only to genus (except for
               chironomids). Identification to genus provides distinct advantages over identifica-
               tion only to family, however—in particular, by strengthening the ability to discrimi-
               nate among sites of intermediate quality (Figure 32).
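
               The loss of discriminating power at coarser taxonomic levels can be shown
               with a toy example. In the Python sketch below, the two sites, the taxon
               names, and the record layout are all invented for illustration; the only
               point is that two sites can differ clearly in genus- or species-level
               richness while looking identical at the family level.

```python
def taxa_richness(records, level):
    """Count distinct taxa at the given level: "family", "genus", or "species"."""
    return len({rec[level] for rec in records})


def rec(family, genus, species):
    """Build one identification record (hypothetical layout)."""
    return {"family": family, "genus": genus, "species": species}


# Hypothetical identifications from a less disturbed site
site_a = [
    rec("Family 1", "Genus 1a", "Genus 1a sp. 1"),
    rec("Family 1", "Genus 1a", "Genus 1a sp. 2"),
    rec("Family 1", "Genus 1b", "Genus 1b sp. 1"),
    rec("Family 2", "Genus 2a", "Genus 2a sp. 1"),
    rec("Family 2", "Genus 2b", "Genus 2b sp. 1"),
    rec("Family 2", "Genus 2b", "Genus 2b sp. 2"),
]

# Hypothetical identifications from a more disturbed site
site_b = [
    rec("Family 1", "Genus 1a", "Genus 1a sp. 1"),
    rec("Family 2", "Genus 2a", "Genus 2a sp. 1"),
]

for level in ("species", "genus", "family"):
    a, b = taxa_richness(site_a, level), taxa_richness(site_b, level)
    print(f"{level}: site A = {a}, site B = {b}, difference = {a - b}")
```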

               Using standard methods for sampling invertebrates (Box 2), we have been able to
               detect changes in biological condition caused by a whole range of human influ-
               ences from the Grand Tetons (Patterson 1996) to streams in several areas of Oregon
               and Washington (Kleindl 1995; Karr, Morley, and Adams, in prep.).

               Finally, for both fishes and invertebrates, timing of sampling is important. Karr et
               al. (1986) recommended periods of low to moderate stream flow for sampling
               fishes. For benthic invertebrates, recent experience leads us to recommend late
               summer, before autumn rains  begin. We sample stream insects in the Pacific
               Northwest  in September. Water flows are generally stable and safe for field work at
               that time of year, and invertebrates are abundant. Sampling at this time also
               minimizes  disturbance to the redds, or nests, of anadromous fish. Optimal  sam-
               pling period will, of course, vary regionally and should be set based on knowledge
               of the regional biota, precipitation patterns, and other relevant factors.

-------
FIGURE 32. Number of clinger taxa present in samples of benthic invertebrates from 65 Japanese
streams ranked according to intensity of human influence (see Figure 4, page 31, and Figure 5,
page 32). The pattern is consistent across the influence gradient, regardless of the level of
taxonomic identification, but the slope becomes smaller from species to genus to family,
reducing the metric's usefulness for discriminating among sites at higher taxonomic levels.
(Data provided by E. M. Rossano.)
[Graphs not reproduced. Three panels: Species, Genus, Family. Horizontal axis: Human influence,
Low to High. Vertical axis: number of clinger taxa.]

-------
BOX 2. How to sample benthic invertebrates.

    Equipment
        Modified 500-micron Surber sampler with cod end (receptacle)
        2.8-gallon bucket (dishpan works well too)
        Squirt or spray bottle
        Forceps
        Marking tape
        500-micron soil sieve
        Sample jars (8-oz or 4-oz; 4-oz urine specimen bottles are an inexpensive alternative)
        Plastic sandwich bags (Ziploc) for [...]
        Pure ethanol (diluted by sample to about 70%)
        Permanent markers (Sharpies)
        Pencils
        2 white, deep-dished sorting pans for large [...]
        Small rake, trowel, or other implement (e.g., piece of rebar or old [...]) with marking
          tape at 10 cm
        50-m measuring tape
        Flagging
        Stopwatch
        Camera to photograph site and surrounding environment
        Kitchen spatula for transferring material from sieve to sample jar
        Pocket knife (always handy)
        Spares of selected items above

    Selecting a Sample Reach
        The choice of a stream reach to sample should be guided by a study's specific [...] and by
        watershed characteristics. But sampling for biological monitoring must never lose sight of
        the ultimate goal: to detect and measure human influence in watersheds. Factors to consider
        include stream size, stream gradient, range of microhabitats in the reach, and length of
        sample reach.

    Selecting a Sample Site
        The distribution of invertebrates in small streams is patchy, driven by associations among
        the animals and stream microhabitats (e.g., riffles, pools, and raceways, or erosional and
        depositional areas). For that reason, our standard protocol calls for collecting three
        replicate samples as follows:
        1.  Sample in the "best" natural riffle segment within a study reach, even if doing so does
            not give an exact match of substrates for all study streams. Sediment types may vary
            among streams, especially in association with different human activities within
            watersheds. Ideal sampling substrates consist of rocks 5 to 10 cm in diameter sitting
            on top of pebbles. Avoid substrates dominated by rocks larger than 50 cm in diameter.
        2.  Sample within the stream's main flow.
        3.  Sample at water depths of 10 to 40 cm.
        4.  Collect three replicate samples in a single riffle; depth, flow, and substrate type
            should be similar for the three replicates.
        5.  Begin sampling at the downstream end of the riffle and proceed upstream to collect the
            three replicates; avoid the tension zone from the riffle to a downstream pool or other
            habitat.

-------
 Sampling the Site
  .   Sampling teams may consist of two to four people. Collecting the maerolnvertehFates requires two
    - people; others can       with equipment, labeling, taking notes, and other tasks.        as
    ," follows:,  • -      "       '             ''.'•'
    . 1,  Plaee-'tlw Surber sampler- on, the streambed with the opening of the nylon net facing upstream.
         -Brace the brass frame and hold ft firmly on the substrate, especially on the side         to the
         net to prevent invertebrates from slipping tinder the net  •         •
     2.  While one person holds the brass frame under water, the other person should lift any large
         rocks within the frame and wash into the stream any organisms crawling or loosely attached to
         the rocks, so the organisms drift into the nylon net. Put the rocks into a bucket for
         further picking on shore.
     3.  When large rocks have been removed, cleaned, and placed in the bucket, thoroughly stir the
         remaining substrate with the rake or trowel. Stir to a depth of 10 cm for a short period (about
         one minute) to loosen organisms in the interstitial spaces and wash them into the net. If you
         find more large rocks with organisms on them, wash the organisms into the net and put the
         rocks into the bucket.
     4.  Now slowly lift the brass frame off the substrate, tilting the net up and out of the water. Use the
         action of the water to wash trapped or clinging organisms into the Surber sampler's cod end.
     5.  Carry the net and the bucket to shore for picking or for transferring to alcohol to sort, count,
         and identify in the lab. The Surber sampler's removable receptacle makes the transfer
         relatively simple. Use the squirt bottle to wash down the sides of the net before removing the cod
         end. Using the magnifying glass and forceps, collect and preserve every organism from the
         Surber sampler as well as from the rocks and water in the bucket. After removing the cod end,
         wash its contents through the soil sieve, picking out large rocks, detritus, and other debris for
         hand sorting. Transfer any organic matter remaining on the sieve to sample jars, taking care
         not to damage invertebrates. A plastic kitchen [...] and squirt bottle work well to remove
         clingers from the sides of the net or the sieve.
     6.  Put a pencil-on-paper label into each sample jar and label the outside with permanent ink;
         include the date, sample location (name and number), and [...] number.
     7.  Rinse the net thoroughly after each sample to avoid cross-contamination.

 When to Sample
     Species composition and population sizes of macroinvertebrates vary substantially through a river's
     seasonal cycles. Because the goal is to detect the influence of human actions, not natural
     variation through time, collect samples during a short period. For Pacific Northwest streams, late
     summer or early autumn is best. This timing gives representative samples of stream invertebrates
     and simultaneously:
     1.   Avoids endangering field crews (as in seasons of high water),
     2.   Standardizes seasonal context,
     3.   Maximizes efficiency of the sampling method because flows are neither too high nor too low,
     4.   Avoids periods when flows are likely to be too variable.
     In the Pacific Northwest, we sample in September, before the autumn rains begin. Shifting the
     sample period a bit earlier into August or extending it into October is acceptable. But all samples
     should be collected within a period of not more than four weeks.
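
The four-week limit can be checked mechanically before a season's samples are pooled. The sketch below is illustrative only: the function names, the 28-day constant, and the sample dates are assumptions made for this example, not part of the published protocol.

```python
from datetime import date

# "Not more than four weeks", expressed in days (an assumption of this sketch).
MAX_WINDOW_DAYS = 28


def sampling_window_days(sample_dates):
    """Return the number of days between the earliest and latest sample."""
    return (max(sample_dates) - min(sample_dates)).days


def within_window(sample_dates, max_days=MAX_WINDOW_DAYS):
    """True when all samples fall within the allowed collection window."""
    return sampling_window_days(sample_dates) <= max_days


# Hypothetical collection dates for one field season.
dates = [date(1997, 9, 2), date(1997, 9, 15), date(1997, 9, 26)]
print(sampling_window_days(dates), within_window(dates))
```

A season that fails the check would need either resampling or separate analysis of the early and late samples.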

         PREMISE 19
 THE PRECISION OF SAMPLING PROTOCOLS CAN BE ESTIMATED
 BY EVALUATING THE COMPONENTS OF VARIANCE
    Statistical analysis of metric and index variance is useful for fine-tuning protocols
Calculating components of variance is a simple and useful technique for estimat-
ing the relative contribution of measurement error and site differences to the
overall variance of a metric or index. In general, our goal is to select metrics that
have small measurement error relative to the differences we want to measure:
changes related to human activities.

For example, we used zooplankton data from northeastern lakes studied under
EPA's EMAP to estimate the relative contribution of three sources of variability to
the overall variance observed for each of three metrics: taxa richness, relative
abundance, and density (Hughes et al. 1993; Stemberger and Lazorchak 1994;
Stemberger et al.  1996). In that study, one to three zooplankton samples were
collected from each of seven lakes. The data were then subsampled in the labora-
tory and the organisms taxonomically identified. In our analysis of those data, we
identified three sources of variability and, thus, three components of variance:
variability caused by differences among lakes (lake effects), variability caused by
differences in sample location within the lake (crew error), and variability caused
by different subsamples identified in the lab (lab error). These three sources of
variance for metric scores can be summarized in an ANOVA model as:

            $\text{Metric score} = \text{Lake}_i + \text{Crew error}_{j(i)} + \text{Lab error}_{k(ij)}$

where $\text{Lake}_i$ = the effect of the $i$th lake on metric score; $\text{Crew error}_{j(i)}$ = the
variability caused by crew differences, sampling time, or location within the $i$th lake; and
$\text{Lab error}_{k(ij)}$ = the variability that arises from the laboratory subsampling protocol
used in the initial study.

In statistical language, this model is a two-level nested ANOVA that is unbalanced
because the number of replicates varies at each level. Using the sums of squares
from the computer output and a little algebra (Sokal and Rohlf 1981: Chapter 10),
one can estimate  the variance of each term in the model.
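
As a rough illustration of the algebra involved, the sketch below estimates the three components from mean squares. It makes a simplifying assumption that the zooplankton data did not satisfy: a balanced design, with the same number of samples per lake and subsamples per sample. With unbalanced data the divisors must be adjusted as described in Sokal and Rohlf (1981). The function name and the numbers are hypothetical.

```python
from statistics import mean


def nested_variance_components(data):
    """Estimate lake, crew, and lab variance components.

    `data` maps each lake to a list of samples; each sample is a list of
    subsample values. Assumes a balanced design (an assumption of this
    sketch, not of the original study).
    """
    lakes = list(data.values())
    a = len(lakes)        # number of lakes
    b = len(lakes[0])     # samples per lake
    n = len(lakes[0][0])  # subsamples per sample
    for samples in lakes:
        if len(samples) != b or any(len(s) != n for s in samples):
            raise ValueError("this sketch handles balanced designs only")
    if min(a, b, n) < 2:
        raise ValueError("need at least two units at every level")

    sample_means = [[mean(s) for s in samples] for samples in lakes]
    lake_means = [mean(ms) for ms in sample_means]
    grand_mean = mean(lake_means)

    # Sums of squares at each level of the hierarchy.
    ss_lake = b * n * sum((lm - grand_mean) ** 2 for lm in lake_means)
    ss_crew = n * sum(
        (sm - lm) ** 2
        for sms, lm in zip(sample_means, lake_means)
        for sm in sms
    )
    ss_lab = sum(
        (y - sm) ** 2
        for samples, sms in zip(lakes, sample_means)
        for s, sm in zip(samples, sms)
        for y in s
    )

    ms_lake = ss_lake / (a - 1)
    ms_crew = ss_crew / (a * (b - 1))
    ms_lab = ss_lab / (a * b * (n - 1))

    # Expected-mean-square algebra; negative estimates are set to zero.
    return {
        "lake": max(0.0, (ms_lake - ms_crew) / (b * n)),
        "crew": max(0.0, (ms_crew - ms_lab) / n),
        "lab": ms_lab,
    }


# Hypothetical taxa-richness values: 2 lakes x 2 samples x 2 subsamples.
example = {
    "Lake A": [[10, 12], [14, 16]],
    "Lake B": [[20, 22], [24, 26]],
}
components = nested_variance_components(example)
print(components)
```

For these made-up values the lake component (46) dwarfs the crew (7) and lab (2) components, which is the pattern one hopes to see in a metric chosen to distinguish lakes.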

For this analysis, we assumed that lakes differed in human influence and thus
biological condition. We were  interested in how the lakes differed from one
another. We were not interested in evaluating differences within lakes or within
subsamples; therefore, these two sources of variability were considered sources of
error. A variable is typically labeled an "effect" when one wants to measure or
compare values for that variable; if, on the other hand, one does not care whether
                crew A collects more animals than crew B ("crew effects"), for example, then one
                seeks to avoid that source of variability altogether, and so it is labeled "error."

                Based on our analysis of the components of variance in the zooplankton samples
                (Figure 33), we concluded that the sampling protocol was adequate to detect lake
                differences when taxa richness or relative abundance were calculated. We also
                discovered that lab variability was relatively small and that using lab time to
                identify replicate samples is not necessary. In contrast,  metrics varied relatively
                more depending on where crews collected samples within the lake. Consequently,
                we recommend that future studies like this one put more effort into
                sampling from the lakes while reducing the number of lab subsamples.

                We arrived at another important conclusion by comparing taxa richness, relative
                abundance, and density. The error components of variance for density were much
                larger than the lake component; for density, any signal at the lake level was lost in
                the noise of variability. In contrast, for taxa richness or relative abundance, most
                of the variability occurred among lakes rather than among replicate samples and
                subsamples (see Figure 33). If the goal is to distinguish  among lakes, then one
                should select metrics that minimize variability caused by within-lake and within-
                lab differences and maximize variability resulting from human influence. Taxa
                richness and relative abundance are metrics that do so.
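
One way to apply this selection rule is to express each component as a share of total variance and keep only metrics whose lake-level share exceeds the combined error share. The sketch below does that; the component values are invented for illustration and are not the values plotted in Figure 33, and the function names are assumptions.

```python
def variance_shares(components):
    """Return each variance component as a fraction of the total."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {name: value / total for name, value in components.items()}


def signal_exceeds_noise(components, signal="lake"):
    """True when the component of interest is larger than all error terms combined."""
    noise = sum(v for name, v in components.items() if name != signal)
    return components[signal] > noise


# Hypothetical components for two metrics (not data from the study).
metrics = {
    "taxa richness": {"lake": 6.0, "crew": 3.0, "lab": 1.0},
    "density": {"lake": 1.0, "crew": 5.0, "lab": 4.0},
}
for name, comps in metrics.items():
    print(name, variance_shares(comps), signal_exceeds_noise(comps))
```

The threshold used here (signal greater than total error) is only one possible rule; a program might reasonably demand a larger margin.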
FIGURE 33. Sources of variance for two groups of herbivorous zooplankton (cladocera, such as
Daphnia, and calanoid copepods), calculated for northeastern lakes (using data collected by R. S.
Stemberger under EPA's Environmental Monitoring and Assessment Program). Taxa richness,
relative abundance of individuals, and density were calculated for each group. The lab protocol used
to subsample ("lab error") and replicate samples taken from each lake ("crew error") constituted two
sources of error; differences from lake to lake ("lake variability") were the effect of interest. Number
of lakes, 7; number of crew replicates, 1-3; number of lab replicates, 1-3. Components of
variance were estimated with ANOVA.
[Panels: Cladocerans; Calanoids. Bars: taxa richness, relative abundance, density. Legend: lake variability, crew error, lab error.]

               We analyzed components of variance in two other locations, the Puget Sound
               lowlands and Grand Teton National Park, Wyoming, to compare the sources of
               variability with total variance in benthic IBIs for homogeneous sets of streams
               (Figure 34). Rather than looking at individual metrics, these studies focused on the
               indexes themselves, after individual metrics had been tested and integrated. For
               samples within riffles in Puget Sound lowland streams, approximately 9% of the
               total variance in index value arose from differences within streams (Figure 34, top).
               (For this study, human influence was measured as a continuous variable, the
               percentage of impervious area; see  Figure 6, page 33.)

               The Grand Teton study did not measure human influence in each watershed.
               Instead, all sampled streams were assigned  to one of four categories of human
               influence, and variation was apportioned according to its source: among members
               of a group or among groups. B-IBI differences among members of the groups
               contributed  11% to the overall variance in B-IBI. Eighty-nine percent of the
               variance came from differences among the groups that reflected discrete human
               influence classes: little or no human activity; light recreational use; heavy recre-
               ational use; and urbanization, grazing, agriculture, or wastewater discharge (see
               Figure 7, page 33). In the Puget Sound and Grand Teton studies, the sources of
               error were low relative to variability resulting from different types of human land use.

               Statistical analysis of metric and index variance is thus useful for tuning sampling
               protocols; it is important in  defining where to put one's efforts and in determining
               the  usefulness of an index to detect human effects. But it cannot replace the more
               important aspects of testing and analysis that link metric and index values to
               human influence. The most  desirable statistical properties are no substitute for a
               biologically meaningful response to human disturbance.
FIGURE 34. Components of variance for the B-IBIs for sites in the Puget Sound lowlands (n = 30)
and Grand Teton National Park, Wyoming (n = 16). In Puget Sound, variability associated with
stream differences was large relative to variability associated with microhabitat (within-riffle)
differences. In Wyoming, variability associated with different categories of streams (grouped
according to land use) was much higher than variability associated with streams within each group.
Components of variance were estimated with ANOVA.
[Panels: Puget Sound lowlands (variability across streams, variability within streams); Grand Teton National Park (variability across stream types, variability within stream types).]

         PREMISE 20
 MULTIMETRIC INDEXES ARE BIOLOGICALLY MEANINGFUL
    Each metric and IBI value translates into a verbal and visual portrait of biological condition
A multimetric IBI for a site is a single numeric value, but one that includes the
numeric values of individual indicators of biological condition. The actual mea-
sured values of the component metrics—each explicitly selected because it repre-
sents a specific biological element or process that changes reliably as human
influence increases—are not lost when an IBI is calculated. An IBI  itself, along with
patterns in the component metrics, focuses attention on biologically meaningful
signals. Each numeric metric value and the IBI as well can be translated into words
for a variety of audiences, including nonscientists, enabling them to understand
immediately how the biology at high-scoring sites differs from that at medium- or
low-scoring sites.

A site labeled "excellent" on the basis of a fish IBI, for example, is  comparable to
the best streams without human influence (Karr 1981). A full complement of
species expected for the habitat and  stream size is present, including the most
sensitive or intolerant forms. (Note especially that not all regionally distributed
species will be found in any single sampling site; even the best sites contain only a
fraction of regional species.) In addition, long-lived taxa are present in the full
range of age and size classes; the distribution  of individuals and taxa indicates a
healthy food web with a balanced trophic structure or organization. In contrast, a
fair-quality site has very few sensitive or intolerant forms and a skewed trophic
structure (e.g., larger numbers of omnivores and relatively few top predators,
especially in older age classes). At a very poor site, few fishes are present, except for
introduced or tolerant forms, and more than a few individual fish are likely to
show deformities, lesions, and tumors. Similar descriptions can convey the details
of biological condition for benthic invertebrate assemblages. In contrast, the
ecological context of many chemical criteria, bioassays, and biomarkers is often
unclear.
The combination of numeric and narrative descriptions that come  from a
multimetric IBI makes communication possible with virtually all academic disci-
plines, stakeholders, and communities. The opportunity for education is thus part
and parcel of a multimetric approach.

         PREMISE 21
 MULTIMETRIC PROTOCOLS CAN WORK IN ENVIRONMENTS
 OTHER THAN STREAMS
    The first full-scale terrestrial IBI is now under development at the Hanford Nuclear Reservation
The principles for developing sampling protocols and analytical procedures for
monitoring streams are broadly applicable to other environments. Progress has
been made in assessing estuaries (Deegan et al. 1993; Engle et al. 1994; Weaver and
Deegan 1996; Deegan et al. 1997), lakes (Stemberger et al. 1996; Pinel-Alloul et al.
1996), wetlands (Adamus 1996; Karr 1997), riparian areas (Brooks and Hughes
1988; Croonquist and Brooks 1991), and reservoirs (Jennings et al. 1995).

Applying multimetric concepts to terrestrial environments has so far been limited.
Most of the relevant studies examined individual biological attributes rather than a
set of metrics. Species richness, for instance, declined with declining size of forest
fragments (Williamson  1981). In midwestern agricultural landscapes, the relative
abundance of omnivorous birds increased as the size of forest fragments fell; other
feeding groups did not  change systematically with fragment size (Figure 35; Karr
1987).
In a mist-net study of tropical forest birds, Karr (1987) detected disturbance-
associated shifts in species composition, capture rates, and trophic organization
within the undergrowth assemblage. Species richness in standard samples declined
by 26%, and capture rates doubled, in a disturbed forest relative to an undisturbed
forest; in this case, the disturbance was a recent history of intensive research within
the forest. Although the number of species changed little in the major foraging
guilds, spiderhunters, which feed on insects and nectar, increased sharply with a
change in undergrowth plants in the disturbed area.

In 1996, Karr et al. (1997) began developing the first full-scale IBI for a terrestrial
locale, the Hanford Nuclear Reservation in eastern Washington State. Under the
jurisdiction of the US Department of Energy since 1943  for weapons production,
the 560-mi² reservation was closed to public access and development for more than
half a century. As a result, Hanford is a paradox. On the one hand, it poses an
enormous toxic-cleanup challenge to the Department of Energy, whose Office of
Environmental Management has been at it since 1989; on the other, the reservation
and its surroundings comprise some of the state's largest contiguous patches
of native shrub-steppe vegetation and the last spawning run of chinook salmon in
the mainstem Columbia River. The vegetation before European settlement consisted
of shrubs (Artemisia spp., Chrysothamnus spp., and Purshia tridentata) and
perennial bunchgrasses (Agropyron spicatum, Festuca idahoensis, Stipa spp., and Poa
spp.). The number of alien annual plants increased with increasing human activity
(Daubenmire 1970; Rickard and Sauer 1982), persisting even long after the activity
ceased. The abundance of insect taxa shifted after wildfires (Rogers et al. 1998).

FIGURE 35. Percentage of individuals in several trophic groups among birds of
forest islands in east-central Illinois: O, omnivores; FI, foliage insectivores;
BI, bark insectivores; AI, aerial insectivores; and GI, ground insectivores. The
relative abundance of omnivores increases as size of the forest fragment
decreases; relative abundances of the other groups do not change as systematically.
[Plot: x-axis, area (ha), in classes 2-16, 24-40, 65-118, and >600; panel label, 1980.]

               The Hanford area is ideal for testing potential metrics for an IBI because it presents
               a full array of kinds and degrees of human impact. Initial field work established 13
               study sites across this gradient, including agricultural lands and lands altered by
               heavy equipment, fire, and grazing (Figure 36). A site was also chosen from the
               neighboring Arid Lands Ecology Reserve (ALE), which has been minimally dis-
               turbed. Plants and insects were the two organismal groups chosen for metric testing
               and IBI development.

               After one spring field season, the researchers have now begun establishing which
               plant and insect attributes will give consistent ecological dose-response curves
               across the gradient of disturbances at Hanford. Measured plant attributes  include
               species present; number of individuals; and percentage  of cover for grasses, forbs,
               shrubs, and the cryptogamic crust. Insects were collected from pitfall traps, sweep
               nets, butterfly transects, and individual shrubs; galls on the shrubs were also
               counted.

               Altogether  58 plant species, representing 20 families, have been found from the 13
               sites; 72% of these are native and  16% are introduced aliens. The distribution of
               particular species (e.g., the alien cheatgrass Bromus tectorum and native grasses) and
               the proportion of native vs. alien species varies across the sites. The proportion of
               alien species per site ranges from 28% to 92%; it is highest at the most disturbed
               sites. The percentage of alien species and the percentages of native grass and shrub
               taxa may offer potential plant metrics (Figure 37).
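
A candidate metric such as the percentage of alien species is straightforward to compute once each species at a site has been classed as native or alien. The sketch below shows the arithmetic; the species names appear in the text, but their combination at a single site is hypothetical, and the function name is an assumption.

```python
def percent_alien(species_status):
    """Percentage of a site's species classed as alien.

    `species_status` maps species name -> "native" or "alien".
    """
    if not species_status:
        raise ValueError("site has no recorded species")
    aliens = sum(1 for status in species_status.values() if status == "alien")
    return 100.0 * aliens / len(species_status)


# Hypothetical species list for one site.
site = {
    "Bromus tectorum": "alien",
    "Centaurea solstitialis": "alien",
    "Purshia tridentata": "native",
    "Festuca idahoensis": "native",
}
print(percent_alien(site))  # 50.0
```

Note that this counts species, not individuals or cover; a cover-weighted version would be a different metric.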

FIGURE 36. The Hanford Nuclear Reservation, including central Hanford, the Arid Lands Ecology
Reserve (ALE), Wahluke State Wildlife Recreation Area, and Saddle Mountain National Wildlife Refuge.
Letters indicate location of study plots. Sites C, G, and H have been affected by fire; site D by an early
history of grazing; sites J and M by agriculture; and sites F, K, and L by physical disturbances. Sites A, B,
and D show only minimal disturbance (reference sites). Sites E and I have unknown disturbance histories.
[Map; labels include WASHINGTON and a north arrow (N).]

                On the basis of insects from 4 of the 13 sites, taxa richness appears to be higher at
                the minimally disturbed ALE site (49 insect families) than at the old town of
               Hanford (29 families), a burn site (23 families), or an abandoned agricultural field
               (23 families) (Figure 38). Relative abundances also vary across these sites. A com-
               mon agricultural pest (cutworm, a noctuid moth) made up 89% of the Lepidoptera
               at an abandoned agricultural site, but no species dominated among the butterflies
               and moths at the other sites. Beetles, especially one species (Eusattus muricatus,
               family Tenebrionidae), dominate at the burn site but not at the others. Other
               promising attributes include the number of predators and parasitoids; food web
               effects that may show up as shifts in species composition from site to site; and the
               numbers, taxa richness, and taxa composition of bees, wasps, and ants (Hy-
               menoptera). The Hymenoptera are particularly interesting because they occupy a
               wide range of trophic levels. At the old town site, an area dominated by the alien
               yellow star thistle (Centaurea solstitialis), hymenopterans had the highest relative
               abundance (38%) of the insects collected there. Perhaps there is a link between
               hymenopteran pollinators and the introduced weed, an interaction that may offer
               a useful metric.
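
Two of the attributes discussed here, taxa richness and relative abundance, reduce to simple counts. The sketch below computes both from a tally of individuals per taxon; the tallies are invented for illustration and the function names are assumptions.

```python
def taxa_richness(counts):
    """Number of taxa represented by at least one individual."""
    return sum(1 for n in counts.values() if n > 0)


def relative_abundance(counts, taxon):
    """Percentage of all individuals that belong to `taxon`."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no individuals counted")
    return 100.0 * counts.get(taxon, 0) / total


# Hypothetical tally of individuals per insect order at one site.
tally = {"Hymenoptera": 12, "Coleoptera": 20, "Lepidoptera": 8, "Diptera": 0}
print(taxa_richness(tally))                      # 3
print(relative_abundance(tally, "Hymenoptera"))  # 30.0
```

The same two functions work at any taxonomic level (family, order, species), provided the tally is keyed consistently.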

FIGURE 37. Preliminary ecological dose-response curves for two potential metrics for plants at 13
Hanford sites: top, relative abundance of native shrubs and grasses (percentage of total), and
bottom, relative abundance of alien species.
[Plots: x-axis, Site; site labels include PS, BM4, DS, FF, BM15, CS, OO, BS, ALE, BWP, RR, OH, and HT.]

FIGURE 38. Preliminary ecological dose-response curves for two potential metrics for insects at
[...] richness, and bottom, relative abundance of predators (%).
[Plots: x-axis, Site; site labels are ALE, Town, Old field, and Burn.]

SECTION IV
FOR A ROBUST MULTIMETRIC INDEX, AVOID COMMON PITFALLS

Although properly constructed multimetric indexes are robust measurement tools,
various pitfalls can derail their development and use. The failure of a monitoring
protocol to assess environmental condition accurately or to protect running waters,
or any other environment, usually stems from flaws in sampling or analysis.
Multimetric indexes provide an important tool for measuring the condition of
ecological systems. They can be combined with other tools in ways that enhance or
hinder their effectiveness, and, like any tool, they can be misused.

That multimetric indexes can be, and are, misused does not mean that the multimetric
approach itself is useless. Like any scientific procedure, multimetric procedures
must be tailored appropriately to a particular situation.

For streams, for example, it is unrealistic to expect a single "off-the-shelf" multimetric
index to be appropriate everywhere. Regional variations that adhere to some basic
biological, sampling, and statistical principles maintain the strengths of a multimetric
assessment while reflecting the reality of regional variation in biological condition
(Miller et al. 1988). The goal is not to measure every biological attribute;
indeed, doing so is impossible. Rather, the goal is, first, to identify those biological
attributes that respond reliably to human activities, are minimally affected
by natural variability, and are cost effective to measure; and,
second, to combine them into a regionally appropriate index.

         PREMISE 22
 PROPERLY CLASSIFYING SITES IS KEY
    Characterizing ecoregions should not get in the way of testing and using metrics diagnostic of human impact
Successful biological monitoring depends on judicious classification of sites. Yet
excessive emphasis on classification, or inappropriate classification, can impede
development of cost-effective and sensible monitoring programs. Using too few
classes fails to recognize important distinctions among places; using too many
unnecessarily complicates development of biocriteria. Inappropriate levels of
classification also lead to problems. The challenge is to create a system with only as
many classes as are needed to represent the range of relevant biological variation in
a region and the level appropriate for detecting and defining the biological effects
of human activity in that place.
Like a taxonomy of places, classification attempts to distinguish and group distinct
environments,  communities, or ecosystem types; the proper approach to classifica-
tion may vary,  however, according to specific goals. Biological (community)
classification generally lags far behind classification by physical environment or
habitat type for aquatic systems (Angermeier and Schlosser 1995). The characteris-
tics that make streams similar or different biologically—and  thus make classifica-
tion important for biological monitoring—are determined first by the geophysical
setting (including climate, elevation, and stream size), and second by the natural
biogeographic processes operating in a place (see Premise 5, page 21, and Figure 3,
page 23). Together they are responsible for local and regional biotas. Coastal
rainforest headwaters on the Olympic Peninsula, for example, are likely to be
biologically comparable to one another, as are headwater streams in central Illinois.

But even though geophysical context is a fundamental determinant of variation in
biological systems, classification based on the geomorphologists' view of stream
channel types, or on other landforms occupied  by biological systems, is not
necessarily the  proper level for assessing the biological condition of those systems.
In the Pacific Northwest, geomorphologists identify some 50 to 60 channel types
based on the interplay of physical and chemical processes that shape stream
channels (MacDonald et al. 1991). But recognizing these channel types does not
necessarily mean that an equal number of biological classes is needed for biological
monitoring. The native biota may not be unique to each of those channel  types in
terms of species composition, taxa richness, or other important aspects of ecologi-
cal organization; even if some species replacement occurs, metric norms may not
change. Fewer biological categories may therefore work just as well.

Many agency programs rely on geographically delineated ecological regions reflect-
ing prevailing geophysical and climatic regimes (Omernik 1995; Omernik and
Bailey 1997). Such ecoregion divisions are valuable, but they are not the be-all and
end-all of classification schemes. Indeed, classification at the ecoregion level alone
is unlikely to give appropriate weight to every factor important to creating homo-
geneous sets for comparing the biological condition of streams. Other factors,
including topography, geological substrate, and stream size or gradient may be
more significant biologically. In addition to ecoregion, a good classification
scheme should consider the defining characteristics of local and regional physical
and biological systems. It would make little biological sense, for example, to group
large, meandering stream reaches with small, fast-flowing streams even if they are in
the same lowland ecoregion; the habitats these stream reaches provide, and there-
fore the biota that live there, are very different. Likewise, the biological attributes
signaling the effects of human activities in two high-elevation first-order streams
may not differ just because they are in different ecoregions. In short, ecoregions (or
equivalent units) are a necessary but not sufficient basis for a stream classification
used in biological monitoring.
Furthermore, no matter how much it enhances our knowledge of natural landscape
variation, characterizing ecoregions should not get in  the way of testing and using
metrics diagnostic of human impact. The  point of classification is to group places
where the biology is similar in the absence of human  disturbance and where the
responses are similar after human disturbance. In some cases, these groupings may
coincide with ecoregion boundaries; in others, they may cross those boundaries.
To evaluate sites over time and place, we need groupings that will give reliable
metrics and accurate criteria for scoring metrics to represent biological condition
(see Premise 14, page 56).
On the east and west sides of the Cascades, and elsewhere in the Northwest, for
example, many of the same metrics respond to the effects of grazing, logging, and
urbanization, even though climate, vegetation, terrain, and human land use differ
(Table 10). The expected values of these metrics differ—taxa richness, for example,
is lower east of the Cascades—which may result from "natural" differences  or
differences stemming from more widespread human influence on a more fragile
eastside landscape. Nevertheless, in both westside and eastside ecoregions,  the
same  metrics respond across a range of human influence, and IBIs composed of
these  metrics reflect and distinguish among the effects at different sites. Elsewhere,
such as across eastern deciduous forests and midwestern prairies, maximum species
richness also transcends ecoregion boundaries (Figure 39). Expected species rich-
ness seems to be higher for forested landscapes than for prairie or grassland land-
scapes. Other metrics, such as trophic structure, however, are reliable indicators of
human influence across ecoregions for some places and taxa (e.g., North American
fishes) but not for others (e.g., benthic invertebrates) (see Premise 12, page 47).
Thus, classification based on ecological dogma, on strictly chemical or physical
criteria, or even on the  logical biogeographical factors used to define ecoregions is
not necessarily sufficient for biological monitoring. The good biologist uses the
best natural history, biogeographic, and analytical resources available to choose a
classification system.


TABLE 10. Similar metrics emerge as reliable indicators of human influence across the Pacific Northwest,
regardless of ecoregion. Percent sign (%) denotes relative abundance of individuals belonging to the listed
taxon or group. Metrics marked with a check are those that responded across a range of intensity for grazing
(eastern Oregon and Wyoming) or logging (western Oregon and Idaho).
Metric                          Predicted response    Eastern Oregon    SW Oregon    Central Idaho    NW Wyoming
Taxa richness and composition
Total number of taxa              Decrease
Ephemeroptera taxa              Decrease
Plecoptera taxa                  Decrease
Trichoptera taxa                  Decrease
                                                  V

                                                  V
                                                                 V
                                                                 V
                                                                 V
                                                                             V
                                                                             V
                               V
                               V
                               V
Tolerants and intolerants
Intolerant taxa                   Decrease
Sediment-intolerant taxa          Decrease
% tolerant                       Increase
% sediment-tolerant              Increase
                                                                 V
                                                                 V
                                                                 V
                                                                             V

                                                                             V
Feeding and other habits
% predators                     Decrease
% scrapers                      Variable
% gatherers                     Variable
                                                  V

                                                  V
                                                                             V

                                                                             V
                               V
                               V
Population attributes
Dominance*                      Increase
FIGURE 39. Lines of maximum species richness for stream order, based on historical data from
midwestern streams. Although the lines differ for the eight watersheds, they fall into two general
groups: woodland watersheds in several ecoregions in the eastern Midwest (upper group) and two
Great Plains streams in two different ecoregions. (Modified after Fausch et al. 1984.)
[Line graph: x-axis, stream order. One line per watershed: Raisin River, Michigan; Red River,
Kentucky; Embarras River, Illinois; St. Croix River, Wisconsin; Chicago area rivers, Illinois;
(name illegible) River area, Illinois; Salt Creek, Nebraska; James River, North and South Dakota.]
92

-------
PREMISE 23
AVOID FOCUSING PRIMARILY ON SPECIES

Simple species composition is not as good a guide as ecological structure for classifying sites

Many water quality specialists begin their analyses of stream data with a matrix of
species and abundances. Using species-level community comparisons such as
percentage similarity indexes, Pinkham and Pearson's B, the Bray-Curtis index, or
multivariate statistics, they then evaluate species overlap among sites and classify
the sites based on these evaluations. Unfortunately, the mathematical and ecologi-
cal properties of these measures (Wolda 1981; Washington 1984; Reynoldson and
Metcalfe-Smith 1992) make these procedures problematic. Moreover, regional
classifications based on species overlap limit one's view by focusing on species
composition rather than higher-level taxonomic and ecological structure.

Consider two undisturbed streams in adjacent Appalachian watersheds (Figure 40).
A standard sample from a first-order stream in one watershed contains eight fish
species: darters A, B, and C; sunfish D and E; and minnows F, G, and H. The
other site contains seven species: darters M, N, and O; sunfish P and Q; and
minnows R and S. Comparing the samples using measures of species overlap (0%)
would highlight the completely different species composition at the two sites, even
though the higher-level taxonomic or ecological overlap (near 100%) is obvious at
the family level and in feeding ecology. Both sites support three darters, two
sunfish, and either two or three minnows.

Consider now what happens after a disturbance at each site: the species composition
of both streams shifts as another regional darter, J (a tolerant species), moves
in, and two of the original darter species disappear from each stream because they
cannot tolerate the changes caused by the disturbance. Similar changes occur in
the other taxa (see Figure 40). Now species overlap between the two sites is
higher (33%), and each site overlaps its own original assemblage less than it
overlaps the other site (27% and 30%). Assemblages with very different species
composition respond in much the same way, becoming more similar in the presence
of similar human activity.
These responses result from their nearly identical ecological structure, not from
similarities in species composition. It is this ecological structure that gives the
clearest signals of human disturbance.
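
The percentages in this example can be reproduced with a simple overlap calculation. The sketch
below assumes the Jaccard coefficient (species shared by two samples divided by all species found
in either sample); the text does not name the index used, so that choice is an assumption. The
assemblages are read from Figure 40, taking Site 2 to hold sunfish P and Q and minnows R and S
before the disturbance. The function and variable names are illustrative only.

```python
def jaccard(a, b):
    """Items shared by both samples divided by items found in either."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def families(assemblage):
    """Collapse labels such as 'darter A' to the family name 'darter'."""
    return {name.split()[0] for name in assemblage}


# Hypothetical assemblages as read from Figure 40 (an assumption where
# the figure is hard to read).
site1_before = ["darter A", "darter B", "darter C", "sunfish D",
                "sunfish E", "minnow F", "minnow G", "minnow H"]
site2_before = ["darter M", "darter N", "darter O", "sunfish P",
                "sunfish Q", "minnow R", "minnow S"]
site1_after = ["darter A", "darter J", "sunfish D", "sunfish L",
               "minnow F", "minnow K"]
site2_after = ["darter M", "darter J", "sunfish D", "sunfish P",
               "minnow R", "minnow K"]

between_before = jaccard(site1_before, site2_before)  # species level
between_after = jaccard(site1_after, site2_after)
site1_change = jaccard(site1_before, site1_after)
site2_change = jaccard(site2_before, site2_after)
family_before = jaccard(families(site1_before), families(site2_before))

if __name__ == "__main__":
    print(f"species overlap between sites, before: {between_before:.0%}")
    print(f"species overlap between sites, after:  {between_after:.0%}")
    print(f"site 1, before vs. after: {site1_change:.0%}")
    print(f"site 2, before vs. after: {site2_change:.0%}")
    print(f"family overlap between sites, before: {family_before:.0%}")
```

Under these assumptions the species-level comparison gives 0% before and 33% after the
disturbance, while the family-level comparison is 100% throughout, which is the contrast the
example is meant to draw.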

In this example, species-level classification suggests that the two areas are very
different, even though their higher-level taxonomic and ecological organization is
nearly identical. The point is that ecological organization  and regional natural
history are better guides for site classification than a focus on species composition.
                                                                                      93

-------
FIGURE 40. Species composition for two hypothetical fish assemblages before and after a human
disturbance that changes the biological condition of the sites. The turnover in species is not
sufficient reason to conclude that these sites should be classified differently, for their
ecological organization before and after disturbance are the same.

                        Site 1          Site 2
Before disturbance      Darter A        Darter M
                        Darter B        Darter N
                        Darter C        Darter O
                        Sunfish D       Sunfish P
                        Sunfish E       Sunfish Q
                        Minnow F        Minnow R
                        Minnow G        Minnow S
                        Minnow H

After disturbance       Darter A        Darter M
                        Darter J        Darter J
                        Sunfish D       Sunfish D
                        Sunfish L       Sunfish P
                        Minnow F        Minnow R
                        Minnow K        Minnow K

-------
PREMISE 24
MEASURING THE WRONG THINGS SIDETRACKS BIOLOGICAL MONITORING

The belief that a metric should work is not reason enough to believe that it will

A bewildering variety of biological attributes can be measured, but only a few
provide useful signals about the impact of human activities on local and regional
biological systems. Some attributes vary little or not at all (e.g., the number of
scales on the lateral line of a particular fish species); others vary substantially (e.g.,
weight, which can vary with age and reproductive or environmental conditions).
Variation may be natural or human induced, and natural variation may come from
temporal (diurnal, seasonal, annual)  or spatial sources (stream size, channel type),
or both. Biological monitoring must separate human effects from natural variation
by discovering, testing, and using those biological attributes that can be measured
with precision to provide reliable information about biological condition.

Some attributes are poor candidates  for monitoring metrics because of their
underlying biology. In particular, abundance, density, and production vary too
much to use in multimetric biological indexes (see Figure 18, page 53, and Figure
33, page 81), even when human influence is minimal, and they (especially produc-
tion) may also be very difficult to measure. Estimated density or species abundance
at a site is affected by three sources of variance: sampling efficiency, natural events,
and human activities (see Premise 19, page 80).
Population size can vary enormously even when conditions are stable (Botkin
1990; Bisson et  al. 1992) because populations respond to natural environmental
changes as well  as to intrinsic dynamics such as lag times between developmental
stages. Identifying correlates of population variance in natural environments is
challenging enough, but where human influence is also at work, the complex
interaction of human and natural events determining population size makes it
almost impossible to separate human effects  from sampling and natural variance.
Sampling protocols have been developed to overcome this problem (see Premise 4,
page 16; Schmitt and Osenberg 1996), but they are often complicated, expensive,
and time consuming. Moreover, they may even fail to detect biological signals that
could be detected by looking at other components of biological systems or by organizing
and framing data in other ways. Taxa richness and relative abundance are more
effective as indicators of biological responses to human actions (see Premise 6,
page 26; Premise 11, page 45; Premise 12, page 47; Premise 17, page 71).

Some attributes, such as ratios (e.g., of the abundances of two trophic groups), are
inherently flawed. A ratio consists of measures pertaining to two different groups,
one used  as the numerator, the other as the denominator. The numerator,

                                                                   95

-------
               denominator, or both may vary simultaneously and for diverse reasons. For ex-
               ample, very large numbers of scrapers and filterers may yield the same ratio as a
               pair of very small numbers of each trophic group. Metrics expressed as ratios may
               intuitively seem useful, but empirical evidence (Barbour et al. 1992) and statistical
               theory (Sokal and Rohlf 1981) show that when two variables are combined in a
               ratio, the ratio tends to have higher variance than either variable alone.  If two
               attributes of an assemblage are potentially important, moreover, they should be
               evaluated independently. With rare exceptions (e.g., relative abundance of indi-
               viduals in a sample; see Premise 13, page 51 and below), using ratios mixes inde-
               pendent parameters in ways that make it hard to discern their relative influence,
               much as diversity indexes combine species richness and evenness into a single
               expression.
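
Both objections to ratios can be illustrated numerically. The sketch below first shows that
very different counts give the same ratio, then applies the standard first-order (delta-method)
approximation for the coefficient of variation (CV) of a ratio of two variables. The counts, the
20% CVs, and the function name `ratio_cv` are hypothetical values chosen for illustration, not
data from the studies cited; the approximation is reasonable only when the CVs are small.

```python
import math


def ratio_cv(cv_x, cv_y, rho=0.0):
    """First-order approximation of the CV of the ratio x / y.

    cv_x, cv_y: coefficients of variation of numerator and denominator.
    rho: correlation between the two (0 means they vary independently).
    """
    return math.sqrt(cv_x ** 2 + cv_y ** 2 - 2.0 * rho * cv_x * cv_y)


# The same ratio arises from very different abundances (hypothetical counts).
large = 1000 / 500  # scrapers / filterers at a site with many individuals
small = 10 / 5      # scrapers / filterers at a site with few individuals

# Two groups that vary independently, each with a CV of 20% (hypothetical).
cv_independent = ratio_cv(0.20, 0.20)

if __name__ == "__main__":
    print(f"ratio at abundant site: {large}, at sparse site: {small}")
    print(f"approximate CV of the ratio: {cv_independent:.3f}")
```

With independent groups the approximate CV of the ratio (about 0.28) exceeds the CV of either
group alone (0.20), consistent with the argument above.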
               Not to be confused with ratios are metrics expressed  as proportions (e.g., propor-
               tion of darters out of total number of individuals). The relative abundance, or
               percentage, of a particular group is calculated as the number of individuals in that
               group divided by the total number of individuals present. That proportion changes
               only as a function of changing relative abundance of the target taxon. As the
               number  of individuals in a sample becomes very small, such as at seriously im-
               paired or highly oligotrophic systems, however, low numbers may distort these
               proportions, and assessment procedures may  need altering (e.g., Ohio EPA 1988).
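
As a minimal sketch of the calculation just described, the function below returns the relative
abundance of one group together with a flag for samples that may be too small to trust. The
cutoff of 100 individuals, the counts, and the name `relative_abundance` are hypothetical; they
are not the adjustment used by Ohio EPA (1988).

```python
def relative_abundance(counts, group, min_total=100):
    """Return (proportion, reliable) for one group in a sample.

    counts: mapping of group name -> number of individuals counted.
    min_total: hypothetical cutoff below which the proportion is flagged.
    """
    total = sum(counts.values())
    if total == 0:
        return None, False
    return counts.get(group, 0) / total, total >= min_total


sample = {"darters": 30, "sunfish": 50, "minnows": 120}  # hypothetical
sparse = {"darters": 1, "sunfish": 0, "minnows": 3}      # hypothetical

pct_darters, ok = relative_abundance(sample, "darters")        # 30 of 200
pct_sparse, ok_sparse = relative_abundance(sparse, "darters")  # 1 of 4

if __name__ == "__main__":
    print(pct_darters, ok)
    print(pct_sparse, ok_sparse)
```

In the sparse sample a single darter accounts for a quarter of all individuals, which is the
kind of distortion the flag is meant to catch.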

               Finally, many attributes now in use are based on theoretical arguments that often
               lack adequate empirical support. Although theory can be a good guide for selecting
               metrics,  the theory must be tested with real-world data before a metric is used.
               Empirical natural history patterns should always take precedence over ecological
               theory in choosing which metrics to  incorporate into a multimetric index. Theory
               can suggest metrics, especially when  one begins to look at a new geographic region
               or a new biota. But the belief that a metric should work is not enough reason to
               conclude that it will. Ecology's path as a scientific discipline is littered with the
               carcasses of "good" theoretical constructs that evidence later showed were flawed.
               We should not rely on theory to guide decisions about vital goods and services that
               come from natural systems. Once again, the key test  is whether an attribute shows
               an empirical dose-response relationship across a gradient of human influence.
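
One simple way to screen a candidate attribute for a dose-response relationship is a rank
correlation between the attribute and a score of human influence. The sketch below uses
Spearman's rank correlation, written out in plain Python; the choice of statistic, the site
scores, and the metric values are all assumptions made for illustration, not a procedure or
data taken from this report.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical sites ordered by human influence (higher = more disturbed).
influence = [1, 2, 3, 4, 5, 6]
mayfly_taxa = [12, 11, 9, 9, 5, 3]       # declines along the gradient
density = [400, 90, 650, 120, 500, 300]  # no consistent pattern

rho_taxa = spearman(influence, mayfly_taxa)
rho_density = spearman(influence, density)

if __name__ == "__main__":
    print(f"taxa richness vs. influence: {rho_taxa:.2f}")
    print(f"density vs. influence:       {rho_density:.2f}")
```

In these made-up data, taxa richness tracks the gradient closely while density does not; a
plot of each attribute against the gradient would show the same contrast.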
96

-------
                                           PREMISE 25
                 FIELD WORK IS MORE VALUABLE THAN
                 GEOGRAPHIC INFORMATION SYSTEMS
Although a geographic information system (GIS) can be a powerful tool for
mapping satellite and other data, it is not required for a successful monitoring
project. The time and money spent on this technique may be better spent doing
field work to identify the types and levels of human influence
and defining the criteria for selecting and ranking sites.

Local field work leads to understanding and to decisions based on practical local
experience observing natural systems, knowledge of the major human activities
associated with those systems, and the resulting biological responses. The most
successful projects are those that identify major human land uses in a region and
study existing information before sampling. GIS can be useful for managing and
displaying information, but GIS technology is not a replacement,
or even a good surrogate, for biological monitoring.
                                                              97

-------
        PREMISE 26
SAMPLING EVERYTHING IS NOT THE GOAL
              Biological systems are complex and unstable in space and time (Botkin 1990;
              Pimm 1991; Huston 1994; Hilborn and Mangel 1997), and biologists often feel
              compelled to study all components of this variation. Complex sampling programs
              proliferate. But not every study needs to explore everything. Biologists should avoid
              the temptation to sample all the unique habitats and phenomena that make
              biology so interesting. Managers, especially, must concentrate on the central
              components of a clearly defined research or management agenda—for example,
              detecting and measuring the influence of human activities on a biological system.

              Sites selected for sampling should be typical of a region and reasonably
              homogeneous with respect to important biogeographic features. Special habitat
              types—such as streams that are spring fed, ephemeral, or very large—may represent
              important and fascinating gaps in our biological knowledge, but if they represent a
              small percentage of a region's sites they should be left out of broad surveys (unless,
              of course, they are the target of a particular monitoring program).

              Biologists are trained to focus on the unique because unique environments often
              yield new insights into how biological systems operate. But for monitoring, it is
              more important to focus widely on changes  caused by humans and to document
              those effects.
98

-------
PREMISE 27
AVOID PROBABILITY-BASED SAMPLING UNTIL METRICS ARE DEFINED

Probability-based sampling allows statistically defensible generalizations to other places, but only after metrics have been verified

Probability-based sampling selects sites randomly within a region so that an
estimate of overall resource condition is statistically reliable (Olsen et al., in press).
But the technique is best not applied until after site classification and metric
testing are completed—in other words, after dose-response relationships to human
activity have been established.
Random sampling may not permit one to develop an integrative IBI to measure
human effects: random sampling can even make it difficult to discover patterns
caused by human activities. Random sampling of sites does not guarantee that
selected sites are homogeneous enough (properly classified) to be included in an
analysis. Neither does it guarantee that a full range of ecological states, from
heavily degraded  to undisturbed, will be studied. In fact, because human influence
is so pervasive, most sites within a watershed are likely to be moderately to severely
degraded; probability-based sampling is likely to miss the best and worst places if
they are rare. Yet the best and worst sites are key for demonstrating biological
responses to human influence, for developing and testing new metrics, and for
calibrating scoring criteria (5, 3, or 1). By the same token, numerous studies
demonstrate that subjective selection of reference sites can also be misleading
(Patterson 1996; R. M. Hughes, pers. commun.; also see Premise 30, page 108).
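
The risk that a random draw misses rare sites can be quantified. The sketch below computes the
probability that a simple random sample, drawn without replacement, contains none of the rare
sites (the zero term of the hypergeometric distribution). The watershed size, the number of
near-pristine sites, the sample size, and the name `prob_miss_all` are hypothetical.

```python
from math import comb


def prob_miss_all(total_sites, rare_sites, sample_size):
    """Probability that a simple random sample without replacement
    contains none of the rare sites."""
    return comb(total_sites - rare_sites, sample_size) / comb(total_sites, sample_size)


# Hypothetical watershed: 200 candidate sites, 4 of them near-pristine,
# 30 sites drawn at random.
p_miss = prob_miss_all(200, 4, 30)

if __name__ == "__main__":
    print(f"chance of sampling no near-pristine site: {p_miss:.2f}")
```

Under these assumptions, roughly half of all random draws would include no near-pristine site
at all, leaving one end of the human-influence gradient unsampled.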

Another drawback of probability-based sampling may be the cost of identifying
every potential sampling site before a random sample can be selected. Perhaps
most important, if an  agency commits exclusively to this sampling design before
determining the biological responses likely to give the most useful signal about
resource condition, considerable money and time can be lost, especially if the
sampling design is short-circuited by the problem of getting access to sites because
landowners may not grant permission to sample on private lands. Finally, many
institutions and agencies may lack the resources for sampling sufficient numbers of
sites  to apply probability-based surveys.
On the other hand, if we already have robust indicators, probability-based sam-
pling is critical to evaluate the condition of all waters in a region. Whenever
probability-based sampling has been combined with strong indicators in recent
years, degradation has been found to be more pervasive than originally believed.
Probability-based sampling can also help avoid problems with a monitoring
strategy that defines sites based on known sources of degradation: a random
sample can find sites omitted because their causes of degradation were unknown.

                                                                     99

-------
              Three early steps are crucial to a robust monitoring protocol: first, classifying
              regional biological systems at appropriate levels, neither too detailed nor too
              superficial (see Premise 22, page 90); second, discovering biological patterns
              associated with human actions, that is, documenting ecological dose-response
              curves (see Premise 5, page 21); and third, cross-checking to ensure that the
              classification system selected is appropriate for the data set (see Premise 22, page 90).
              Narrowly conceived and implemented probability-based sampling designs too
              often overlook one (or more) of these three steps, and thus can fail to detect
              biological patterns associated with human-induced  degradation. The failure of
              some state and federal programs in the past decade  can be traced to the failure to
              define metrics that exhibit dose-response curves before monitoring began.

              Nevertheless, when classification and ecological dose-response are appropriately
              established in concert with probability-based sampling, the result can be especially
              useful because it allows biologists to make statistically defensible conclusions
              beyond the sampled sites. For riverine fish, for example, probability-based sam-
              pling can help to estimate the condition of rivers over a large region where the fish
              metrics and a fish IBI have already been tested and  validated. For now, probability-
              based sampling is less useful with other taxonomic groups, such as zooplankton,
              ants, plants, and to some extent benthic invertebrates, for which tests of metrics—
              the search for ecological dose-response curves—are incomplete.
100

-------
                                                           PREMISE 28
COUNTING 100-INDIVIDUAL SUBSAMPLES YIELDS TOO FEW
                                DATA FOR MULTIMETRIC ASSESSMENT
            A number of sampling protocols have been used in multimetric biomonitoring
            studies. Although there are no absolute standards for sampling design or analytical
            techniques, certain protocols are more effective than others in avoiding the pitfalls
            of too few data or poor-quality data.
            Since the fish IBI was first developed in 1981, fish-sampling protocols have called
            for sampling all microhabitats within stream reaches from 100 m to 1 km long,
            depending on stream size. Fish IBIs have been developed for Ohio (Ohio EPA
            1988; Yoder and Rankin 1995a,b), Wisconsin (Lyons 1992a; Lyons et al. 1996),
            Oregon (Hughes and Gammon 1987; Hughes et al., in press), Canada (Steedman
            1988; Minns et al. 1994), Mexico (Lyons et al. 1995), and France (Oberdorff and
            Hughes 1992). Sampling design has not been controversial, largely because stan-
            dard sampling methods are effective at sampling most fish in most microhabitats
            in small to midsize streams.
            One study dealing with the effects on fish IBIs of sample size (number of individu-
            als per sample) found that small  samples were correlated with high measurement
            error; that is, the confidence intervals for IBIs increased as sample size decreased
            (Fore et al. 1994). Among 37 sites in Ohio's Great Miami Basin, 29 had confidence
            intervals for IBI of 6 or less (Fore et al. 1994; Figure 41). Seven out of eight of the
            sites with confidence intervals greater than 6 had fewer than 400 individuals per
            sample. The loss of precision in estimating IBI with samples of 400 or fewer
            suggests that it is unwise to intentionally use still  smaller samples or subsamples.8

            Sampling protocols  are not as broadly accepted for benthic invertebrates as for
            fish. At least three superficially similar multimetric indexes using benthic inverte-
            brates have been proposed: the invertebrate community index (ICI: Ohio EPA
            1988; Yoder and Rankin 1995a,b); the rapid bioassessment protocol III (RBP:
            Plafkin et al. 1989); and the benthic index of biological integrity (B-IBI: Karr and
            Kerans  1992; Kerans et al. 1992;  Kerans and Karr 1994; Fore et al. 1996; Rossano
            1996; Karr 1998). Both ICI and B-IBI were extensively tested before publication or
            use in research or management; neither the sampling methods nor the metrics were
            as carefully evaluated for RBP, although recent tests are helping strengthen the
            protocol (Barbour et al. 1992; Barbour et al. 1996a; Barbour et al., in press). Tests
            of B-IBI in several regions (Tennessee, Wyoming, Oregon, Washington, Japan)
            point to 10 metrics as appropriate for including in a broadly applicable B-IBI
            (Table 11).

8 When small sample sizes are a result of severe degradation, scoring of metrics, especially for relative abundance, can
be adjusted to account for this fact (Ohio EPA 1988). Researchers sponsored by EPA's Environmental Monitoring and
Assessment Program on Oregon streams and rivers were able to get precise results with samples of as few as 100 to 200
fish (R. M. Hughes, pers. commun.). Perhaps the threshold varies in cold- vs. warm-water streams, an issue that
deserves further exploration.

Why not sample a reasonable area and count the whole sample to begin with?

                                                                              101

-------
FIGURE 41. Confidence intervals for a fish IBI in relation to the number of individuals in samples
collected at 37 sites within the Great Miami Basin, Ohio. (From Fore et al. 1994.)
[Scatterplot: x-axis, number of fish (200 to 1400); y-axis, confidence interval for the IBI;
regions labeled "Higher variance" and "Lower variance".]

              One of the most controversial aspects of these three invertebrate indexes is the
              number of individual organisms to be counted for an analysis. Both ICI and B-IBI
              call for counting every individual in each sample. RBP, in contrast, calls for
              subsampling as few as 100 individuals from each large sample to define a "consis-
              tent unit of effort"; the adequacy of this number has been hotly debated (Fore et
              al. 1994; Barbour and Gerritsen 1996; Courtemanch 1996; Vinson and Hawkins
              1996). The need for subsampling with RBP comes out of its initial design: RBP
              calls for sampling a 2-3 m² area "to integrate sampling among a wide range of
              heterogeneous microhabitats" (Barbour and Gerritsen 1996: 387). A smaller sampling
              area, such as 0.1 m², would reduce the heterogeneity among sampled microhabitats
              from the outset (Kerans et al. 1992; see Premise 18, page 73).

              We have found one effort to justify the adequacy of the 100-individual subsample
              approach (Barbour and Gerritsen 1996) unconvincing on several grounds, particu-
              larly with regard to studies of streams. First, the authors base their conclusions on
              data from lakes, not streams, and we believe it is not a good idea to extrapolate
              results across environment types. Second, arthropods were collected in "12 petite
              Ponar grabs (0.02 m²)," giving a total sample area of only 0.24 m², in comparison
              with RBP's recommended 2-3 m² for streams. Third, only one subsample was
              generated for each of nine sites; variability was assessed, not with multiple samples
              from a site, but from multiple sites. Nine sites were grouped according to relative
              abundance curves, creating a mathematical near-certainty that taxa richness would
              vary systematically across the groups. A better approach would have been to
              examine sites of different known human influence, to construct multiple random
              samples from each site, and to examine if the ranking of sites or other inferences
              about relative condition of the sites (e.g., ability of different metrics to discriminate
              among sites) was influenced by the subsampling procedure.
102

-------
TABLE 11. Ten-metric B-IBI based on study in six geographic regions. Metrics were tested in six benthic
invertebrate studies done in the Tennessee Valley, southwestern Oregon, eastern Oregon, the Puget Sound
region, Japan, and northwestern Wyoming. A + indicates that the metric varied systematically across a
gradient of human impact for that data set; - indicates that the metric did not vary systematically; 0
indicates that the metric was not tested for that data set. Sources: Tennessee, Kerans and Karr 1994;
southwestern Oregon, Fore et al. 1996; eastern Oregon, Fore et al., unpubl. manuscript; Puget Sound,
Kleindl 1995; Japan, Rossano 1995; northwestern Wyoming, Patterson 1996.

Metric                      Predicted   Tenn.    SW      Eastern   Puget             NW
                            response    Valley   Ore.    Ore.      Sound    Japan    Wyo.

Taxa richness and composition
Total number of taxa        Decrease    +        +       +         +        +        +
Ephemeroptera taxa          Decrease    +        +       -         +        +        +
Plecoptera taxa             Decrease    +        +       +         +        -        +
Trichoptera taxa            Decrease    +        +       +         +        +        +
Long-lived taxa             Decrease    0        +       +         +        0

Tolerants and intolerants
Intolerant taxa             Decrease    +        +       +         +        +        +
% tolerant                  Increase    +        +       -         +        +        +

Feeding and other habits
% predators                 Decrease    +-       +       +         -        +
"Clinger" taxa richness     Decrease    0        0       0         +        +        0

Population attributes
% dominance (three taxa)    Increase    +        +       -         -        -        +

               The decision to count only 100-individual subsamples (intended to speed labora-
               tory analysis) has serious ramifications for the counts' reliability in multimetric
               indexes. First, the counting procedure itself becomes a source of error or bias. In
               RBP, the samples are spread out in a sorting pan with a sampling grid, and grid
               squares are counted at random until 100 individuals have been counted. The initial
               process to "randomly distribute" the organisms is one potential source of bias. Bias
               also arises from differences in the identity, size, mass, density, or distribution of
               individuals among the squares; these attributes can influence results even if ran-
               dom selection of grid squares is strictly enforced.
                                                                                       103

-------
              In addition, sample size affects estimates of taxa richness and relative abundances,
              which are central to a robust multimetric index (Courtemanch 1996). Samples must
              be large enough to accurately reflect the species richness and relative abundances
              for the resident biota. Yet, argues Courtemanch (1996: 382-383), the 100-indi-
              vidual subsample does not provide an "asymptotic estimate," either of taxa rich-
              ness (number of taxa per standard  number of individuals) or of taxa density (taxa
              per standard area) in each sampled unit; thus "there is no basis for comparison
              with either another sample community or with a reference condition."
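
The effect of a fixed-count subsample on taxa richness can be sketched with the standard
rarefaction formula, which gives the expected number of taxa in a random subsample of n
individuals drawn without replacement. This is offered as an illustration of the concern, not
as Courtemanch's own analysis; the counts (three common taxa and twelve rare ones) and the
name `expected_richness` are hypothetical.

```python
from math import comb


def expected_richness(counts, n):
    """Expected number of taxa in a random subsample of n individuals
    drawn without replacement from a sample with the given counts."""
    total = sum(counts)
    if n > total:
        raise ValueError("subsample is larger than the sample")
    all_draws = comb(total, n)
    # For each taxon: 1 minus the chance that none of its individuals is drawn.
    return sum(1 - comb(total - c, n) / all_draws for c in counts)


# Hypothetical sample of 1000 individuals: 3 common taxa and 12 rare taxa.
counts = [500, 300, 140] + [5] * 12
full = expected_richness(counts, sum(counts))  # complete count
sub100 = expected_richness(counts, 100)        # 100-individual subsample

if __name__ == "__main__":
    print(f"taxa found by counting everything: {full:.0f}")
    print(f"taxa expected in 100 individuals:  {sub100:.1f}")
```

In this made-up sample a complete count finds all 15 taxa, whereas a 100-individual subsample
is expected to find only about 8, because each rare taxon is missed more often than not.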

              Courtemanch proposes two remedies for this problem: two-phase processing, in
              which the entire sample is first searched for large individuals belonging to rare taxa;
              and serial processing, which involves following the RBP procedure to count
              individuals in grids up to  100 and  then counting more grids until no new taxa are
              found. The large-individual standard is appealing but, we find, hard to defend on
              either sampling or biological grounds (see also Walsh 1997). A similar approach is
              outlined by Vinson and Hawkins (1996).

              It may be more efficient to sample a smaller, entirely "countable" area in the first
              place, rather than spending the time and effort to collect large numbers of organisms
              that are never counted. The protocol we recommend (see Box 2, pages 78-79)
              samples smaller areas, focuses on a single microhabitat, collects three replicate
              samples, keeps samples separate, and counts each sample completely. Such a
              protocol saves some time  in the field and gives more complete results from the
              laboratory; we thus have greater confidence in both the statistical and biological
              aspects of the resulting multimetric evaluation. This approach does not, of course,
              give a complete count of all organisms  present in a stream reach or a measure of
              variability among riffles within the reach. It has, however, provided enough detail
              to judge relative biological condition among streams—within a region and among
              regions.

              Perhaps the most serious flaw in the 100-individual subsample approach derives
              from the fact that sample  size does not affect all metrics in the same way. Count-
              ing only 100 individuals may thus  lead to erroneous  conclusions or limit a
              manager's ability to diagnose causes of degradation. In testing the 100-individual
              standard, for example, Barbour and Gerritsen (1996)  found that, for taxa richness,
              counting 100-individual subsamples and also counting all individuals produced the
              same  rank order for their nine sample sites; they therefore concluded that 100
              individuals adequately represented taxa richness across these sites. Yet because
              these  researchers' method  is based  on analysis of relative  abundance curves, not
              sites ranked according to a known  human-influence gradient, the behavior of their
              taxa richness metric cannot be attributed exclusively  to human impact. Further, it
              is inappropriate to extrapolate from the presumed behavior of one metric to the
              behavior of all metrics in a multimetric index.

              Subsamples of only 100 individuals are less likely than large samples to consis-
              tently reveal the presence  of intolerant, long-lived, or otherwise rare taxa, regardless
              of their size; small subsamples are  also likely to affect relative abundances of key
              trophic or other ecological groups  (Ohio EPA 1988). Failing to count rare taxa or
               rare ecological groups such as intolerant taxa would exclude some of the strongest
               biological signals about the condition of places. This effect of subsampling is
               analogous to the exclusion of rare species that is often recommended in multivari-
               ate analyses (Reynoldson and Rosenberg 1996; see Premise 32, page 112).

               An analysis of random subsamples of stream invertebrates collected in Puget
               Sound lowland streams (Doberstein, Karr, and Conquest, in prep.) has yielded very
               different conclusions from those of Barbour and Gerritsen (1996). Using a boot-
               strap resampling protocol like that described by Fore et al. (1994), Doberstein,
               Karr, and Conquest generated several hundred subsamples for each of several
               streams for 100-, 300-, 500-, and 700-individual subsamples and for the  entire
               complement of individuals collected in three 0.1-m² samples. (The field sampling
               procedures were those described in Box 2, pages 78-79.) After determining the
               variance in parameter estimates (metric values) for the resulting distributions of
               random samples, Doberstein, Karr, and Conquest then asked  how many distinct
               classes of biological condition could be detected, by each metric and for the
               integrative B-IBIs.

               Using the 10-metric B-IBI shown in Table 11 (page 103), the researchers found they
               could reliably discern an average of 3.6 classes of biological condition per metric
               (range, 1.14 to 10.61) when they counted full samples from minimally disturbed
               streams (Figure 42). This result compares favorably with the 3 classes distinguished
               by the 5, 3, and 1 scoring protocol. In contrast, metric sensitivity for random
               (bootstrap) 100-individual subsamples dropped to  an average  of 1.1 classes (range,
               0.31 to 3.16). Counting all sampled individuals and then combining the metrics
               into a B-IBI permitted detection of 5.8 classes, the same sensitivity found by Fore
               et al. (1994) for a fish IBI. Counting random 100-individual subsamples from  each
               sample site, in contrast, allowed detection of only 2.1  classes of stream condition
               (e.g., "good" vs. "bad") (Figure 42). Given the time and energy devoted  by state
               agencies to biological monitoring, this resolution is unsatisfactory.

FIGURE 42. Average number of classes detected by metrics in a 10-metric B-IBI (see Table 11) and by the B-IBI itself at different subsample sizes. Data come from a minimally disturbed stream in King County, Washington.
[Plot: x-axis, Subsample size (100, 200, 300, 500, 700, Whole); y-axis ticks at 2, 4, 6; labeled series, Benthic IBI.]

-------
              Doberstein, Karr, and Conquest (in prep.) have also found that counting an
              increasing number of 100-individual subsamples permitted detection of an increas-
              ing number of classes. For three minimally disturbed streams, counting three 100-
              individual subsamples instead of one raised the detectable levels of stream condi-
              tion from 1.88 to 4.43. Would it not be simpler to count the whole sample to
              begin with?
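
The general shape of such a bootstrap analysis can be sketched in Python. This is not the procedure of Doberstein, Karr, and Conquest or of Fore et al. (1994). In particular, the text does not give the formula that converts bootstrap variance into a number of detectable classes, so the rule used here (the metric's possible range divided by the width of a central 90% bootstrap interval) is an assumed stand-in. The sample composition, function names, and seed are likewise hypothetical.

```python
import random

def bootstrap_interval(individuals, size, rng, reps=500, coverage=0.90):
    """Central bootstrap interval (low, high) for taxa richness at one subsample size."""
    values = sorted(
        len(set(rng.choices(individuals, k=size)))  # resample with replacement
        for _ in range(reps)
    )
    tail = (1.0 - coverage) / 2.0
    low = values[int(tail * reps)]
    high = values[int((1.0 - tail) * reps) - 1]
    return low, high

def detectable_classes(metric_range, low, high):
    """ASSUMED rule: number of interval widths that fit into the metric's range."""
    width = max(high - low, 1)  # avoid dividing by zero for a degenerate interval
    return metric_range / width

# Hypothetical sample: 6 abundant taxa and 24 rare ones (not data from the study).
sample = [f"common_{t}" for t in range(6) for _ in range(120)]
sample += [f"rare_{t}" for t in range(24) for _ in range(3)]

rng = random.Random(42)
metric_range = len(set(sample))  # richness can run from 0 to 30 here
for size in (100, 300, 500, 700, len(sample)):
    low, high = bootstrap_interval(sample, size, rng)
    print(size, (low, high), round(detectable_classes(metric_range, low, high), 1))
```

The point of the sketch is the workflow (resample, compute the metric, summarize its spread, translate spread into resolution), not the particular numbers it prints.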

              In sum, one needs large enough samples and multiple metrics for a truly
              multimetric picture of biological condition. Multiple metrics together provide a
              stronger signal than one or two alone and, further, allow diagnosis of the likely
              causes of degradation.

-------
PREMISE 29
AVOID THINKING IN REGULATORY DICHOTOMIES

Because biological condition is a continuous variable, it should be measured on a continuous scale

The framework for environmental regulation necessarily divides actions and places
into those that are "in compliance" and those that are not on the basis of legal
standards and criteria that are assumed to protect the overall condition of a site
and its inhabitants. As a result, agency personnel tend to think in dichotomies and
to view sites as "impaired" or "unimpaired," "acceptable" or "unacceptable," and so
on (Murtaugh 1996). The trouble is, biological condition is not an either-or affair.
The condition of living systems within a region may vary from near pristine to
severely degraded. In other words, the biological condition of places falls along a
gradient. Therefore, to fully understand, rank, and evaluate those places, research-
ers should also measure biological condition along a gradient.

Multimetric biological indexes furnish a yardstick for measuring, tracking, evaluat-
ing, and communicating actual continuous variability in biological condition.
Instead of simply labeling a site "control" or "treatment," "impaired" or "unim-
paired," "acceptable" or "unacceptable," a multimetric assessment identifies and
preserves finer distinctions among sites, in the index itself and in the values of the
component metrics. Multimetric assessment automatically takes account of a site's
context, permitting distinctions among urban streams that might all be labeled
"impaired" in a dichotomous analysis. Suburban Swamp Creek sites near Seattle,
for example, have B-IBIs of 26 to 34,  which is clearly better than urban Thornton
Creek's range of 10 to 18 but not nearly as good as rural Rock Creek's 44 to 46.
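
The contrast between a dichotomous label and a continuous ranking can be made concrete with the three B-IBI ranges quoted above. In the Python sketch below, the B-IBI ranges come from the text; the pass/fail threshold of 40, the rule that a site passes only if its whole range clears the threshold, and the use of range midpoints for ranking are hypothetical choices made for illustration.

```python
# B-IBI ranges quoted in the text, as (minimum, maximum).
bibi_ranges = {
    "Thornton Creek (urban)": (10, 18),
    "Swamp Creek (suburban)": (26, 34),
    "Rock Creek (rural)": (44, 46),
}

THRESHOLD = 40  # hypothetical pass/fail cutoff, not from the source

def dichotomous_label(low, high, threshold=THRESHOLD):
    """Two-way label: a site passes only if its whole range clears the threshold."""
    return "unimpaired" if low >= threshold else "impaired"

def midpoint(low, high):
    """Midpoint of a B-IBI range, used here only as a simple ranking key."""
    return (low + high) / 2

labels = {site: dichotomous_label(*rng) for site, rng in bibi_ranges.items()}
ranking = sorted(bibi_ranges, key=lambda site: midpoint(*bibi_ranges[site]),
                 reverse=True)

for site in ranking:
    print(f"{site:25s} B-IBI {bibi_ranges[site]}  label: {labels[site]}")
```

Under this assumed threshold, Swamp Creek and Thornton Creek receive the same "impaired" label, while the ranking still separates them.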

Dichotomous methods for evaluating biological condition lead to a variety of
analytical and even regulatory problems. What is or is not an "acceptable"  thresh-
old in some biological (or chemical) factor depends on a site's context. Thresholds
considered acceptable in an urban stream may be totally unacceptable in a rural or
wildland stream. In addition, threshold definitions change over time as science and
human values change, people learn more, and measurement techniques become
more sophisticated. Through the years, the  regulated community as well as regula-
tors and other citizens have become frustrated by what they perceive as arbitrary
moving targets in the form of "minimum detectable" thresholds.

In contrast, measuring biological condition with a continuous yardstick such as IBI
puts a site  along a gradient of condition in comparison with other sites or other
times, allowing thresholds to be reset  according to context. It also permits a
ranking of many sites—which might all be labeled "degraded" in a dichotomous
scheme—so that priorities may be set for budget-constrained protection or restora-
tion efforts.

-------
PREMISE 30
REFERENCE CONDITION MUST BE DEFINED PROPERLY
              The goal of biological assessment is to detect and understand change in biological
              systems that results from the actions of human society. But change with respect to
              what? Just as economic analyses define a standard (e.g., 1950 dollars) against which
              economic activity can be judged, biological assessment must have a standard
              against which the conditions at one or more sites of interest can be evaluated. This
              standard, or reference condition, provides the baseline for site evaluation.

              In multimetric biological assessment, reference condition equates with biological
              integrity—defined as the condition at sites able to support and maintain a bal-
              anced, integrated, and adaptive biological system having the full range of elements
              and processes expected for a region. Biological integrity is the product of ecological
              and evolutionary processes at a site in the relative absence of human influence
              (Karr 1996); IBI thus explicitly incorporates biogeographic variation. Protecting
              biological integrity is a primary objective of the Clean Water Act. The value of IBI
              is that it enables us to detect and measure divergence from biological integrity.
              When divergence is detected, society has a choice: to accept divergence from
              integrity at that place and time, or to restore the site.

              Programs that measure biological and geophysical conditions in near-pristine
              environments provide much information about biotas and geophysical contexts  in
              different areas. They inform managers about natural ranges of variability and allow
              comparisons across watersheds and landscapes among streams of similar elevation,
              size, or channel type; they provide ecologists with needed information about the
              interplay of physical processes and biological responses. But reference condition is
              only half the picture. If the goal of water resource management is to halt degrada-
              tion of living aquatic systems, then managers must stop focusing exclusively on
              natural processes and responses, as they have for many years in trying to imple-
              ment biological criteria. Reference information is not enough.

              Furthermore, measuring pristine conditions in one ecoregion or subecoregion after
              another, year after year, will not slow the degradation of aquatic resources. Sam-
              pling pristine environments from every ecoregion or subecoregion does not
              necessarily add insight about which biological attributes provide reliable signals
              about resource condition. Putting as much effort into quantifying and evaluating
              human influence as into collecting biogeographical information is the only way  to
              discern biological signal from the background of natural variability. Sampling sites
              across a range of human influence provides the means to  detect that signal.

-------
The message here is clear. Agency biologists would do well to devote as much
effort to understanding how to detect human influence as to collecting biogeo-
graphical "reference" information. Until state and federal agencies understand the
importance of sampling across a gradient, both time and money will be wasted.

One major challenge is that there are few, if any, places left that have not been
influenced by human actions. Thus, defining and selecting reference sites, and
measuring conditions at those sites, requires a careful sampling and analysis plan.
Common pitfalls include using local sites that are degraded rather than looking
over a wider area for minimally disturbed sites; arbitrarily defining reference sites
without adequate screening or site evaluation; and classifying sites inaccurately so
that degraded sites are put into reference sets, especially when arbitrary statistical
rules (e.g., a site is considered "impaired" if it is 25% of reference condition) are
used to guide regulatory or other management decisions (e.g., Barbour et al.
1996a). Definition of reference condition in biological assessment may use modern
or historical data, or theoretical models (Hughes 1995). Some of these sources are
better than others.

The Wyoming Department of Environmental Quality, for example, requested
nominations for reference streams from water resource personnel in the state.
Analysis of biological data from 14 nominated sites (Patterson 1996) indicated that
three sites had IBI values substantially below reference condition; sources of
degradation could easily be identified even though the sites had been judged as
reference sites. Six additional sites also had low scores, suggesting some human-
induced degradation. The remaining five Wyoming reference sites were not likely
affected to any significant degree by human activity. In  this case, even professionals
erred in judging sites as unimpaired. Because defining  reference condition properly
is critical to the success of multimetric indexes, reference sites must actually be
minimally influenced by people.

To begin making biological monitoring more effective—that is, to get information
in the most cost-effective manner that can begin to protect water resources
immediately—biologists need to document and understand dose-response relation-
ships between particular biological attributes and human influence (see Premise 7,
page 30). They need to identify metrics that respond to human disturbance and
not just to geographical differences among ecoregions. They must shift their focus
from exhaustively characterizing ecoregions or defining reference condition to
sampling sites that have been subject to different intensities and types of human
influence. Finally, they must choose  a small set of metrics that provide reliable
signals about the effects of human activities in the region. Metrics must be chosen
according to their ability to distinguish between different types and intensities of
human actions. By integrating those metrics into a multimetric index, we have a
scientifically sound and policy-relevant tool to improve management of water
resources.
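
One simple way to screen a candidate metric for a dose-response relationship is to rank-correlate it with a measure of human influence across sites. The Python sketch below is an illustration only: the site values are invented, percent urban land cover is an assumed stand-in for human influence, and Spearman rank correlation is one of several checks that could be used (plotting the data remains essential).

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Undefined (division by zero) if either variable is constant.
    """
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical data for six sites along a gradient of human influence.
percent_urban = [2, 10, 25, 40, 60, 85]
intolerant_taxa = [9, 8, 6, 4, 2, 0]                # declines steadily
total_abundance = [500, 900, 300, 800, 400, 700]    # no consistent trend

print(round(spearman(percent_urban, intolerant_taxa), 2))
print(round(spearman(percent_urban, total_abundance), 2))
```

In this invented data set, the first attribute tracks the gradient closely and would be worth testing further as a metric; the second shows no consistent response and would not.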

-------
PREMISE 31
STATISTICAL DECISION RULES ARE NO SUBSTITUTE
FOR BIOLOGICAL JUDGMENT

Statistical significance is not the same as biological importance

The objective of biological monitoring is to detect human-caused deviations from
baseline biological integrity (see Premise 5, page 21, and Figure 3, page 23) and to
evaluate the biological—not statistical—significance of those deviations and their
consequences (Stewart-Oaten et al. 1986, 1992; Stewart-Oaten 1996). In other
words, biological change, not p-value, is the endpoint of concern. A statistically
significant result (small p-value) may not equate with a large, important effect, as
researchers often assume; similarly, a statistically insignificant effect (large p-value)
may well be biologically important (Yoccoz 1991; Stewart-Oaten 1996). Without
some statement about the probability of detecting an effect of given magnitude, it
is almost impossible for anyone to know for certain from, say, a t-test whether a
biological effect is present. It is too simplistic, and potentially misleading, to
assume that lack of statistical significance necessarily means that differences
between places do not exist. Only power analysis can define the precision of a
finding that two things do not differ.
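
The gap between statistical and biological significance can be illustrated with an approximate power calculation. The Python sketch below uses a normal approximation for a two-sided, two-sample comparison of means with a known common standard deviation; it is a simplification standing in for a full t-test power analysis, and the effect sizes, standard deviation, and sample sizes are hypothetical index values chosen only to show the pattern.

```python
from statistics import NormalDist

def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect: true difference between group means
    sd: common standard deviation within groups (assumed known)
    n_per_group: number of samples in each group
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sd * (2 / n_per_group) ** 0.5)
    # Probability that the test statistic falls in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical cases (index points; sd of 4 points within each group).
large_effect_few_samples = approx_power(effect=6, sd=4, n_per_group=3)
small_effect_many_samples = approx_power(effect=1, sd=4, n_per_group=500)

print(round(large_effect_few_samples, 2))
print(round(small_effect_many_samples, 2))
```

Under these assumptions, a 6-point difference sampled three times per group is detected less than half the time, whereas a 1-point difference sampled 500 times per group is almost always "significant." Neither outcome says anything about which difference matters biologically.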

Ecologists tend to overuse tests of significance (Yoccoz 1991). It is not enough to
detect differences in lieu of determining an impact's magnitude and cause or of
understanding its consequences (Stewart-Oaten 1996). It would be wiser to decide
first what is biologically relevant and then use hypothesis testing to look for
biologically relevant effects, not merely run a general "search for significance."
Overreliance on statistical correlation, t-tests, or other statistical models can
short-circuit the process of looking at data and asking whether they make sense and
what they show. Dependence on p-values can divert scientists and managers from
exploring the biology responsible for the patterns in data, no matter when or by
whom they were collected.

To evaluate alternative decisions, scientists and managers should balance hypoth-
esis testing with other statistical tools, such as decision theory (Hilborn 1997);  they
should explore thoroughly the causes and consequences of differences in biological
condition. When a study is based on tested biological metrics, of course hypothesis
testing can be appropriate, as when sites upstream and downstream of a point
source need to be compared for setting regulations. But when a biologist or
statistician reports a significant difference based on a p-value, the key next questions are,
How different? In what way? What is the effect in biological systems?

-------
By providing a biological yardstick for ranking sites according to their condition,
multimetric indexes can answer these questions. Because their statistical properties
are known and their statistical power can be calculated (see Premise 15, page 63;
Peterman 1990; Fore et al. 1994), they can also be used to compare sites statisti-
cally. But a ranking according to biological condition is more appropriate than
statistical comparisons for setting site-specific restoration or conservation priorities.

-------
PREMISE 32
MULTIVARIATE STATISTICAL ANALYSES OFTEN OVERLOOK
BIOLOGICAL KNOWLEDGE

Multivariate analyses were developed for finding patterns, not assessing impacts

To many field biologists, "statistics" means "multivariate statistics" because field
data are complex and multidimensional. Despite the availability of numerous
statistical techniques, monitoring studies have used the same multivariate techniques
since the 1960s (Potvin and Travis 1993). These multivariate approaches, which
include cluster analysis, factor analysis, and widely used ordination techniques
such as principal components analysis (PCA; James and McCulloch 1990), extract
the maximum statistical variance in variance-covariance matrices, usually across
species or sites (Ludwig and Reynolds 1988). Unfortunately, the contexts in which
multivariate methods have been applied have often precluded detecting, under-
standing, and basing decisions on some of the most important signals from bio-
logical systems.

The fault lies not with multivariate statistics themselves, which can provide impor-
tant insights about the structure of data sets, but rather with how they are used.
Multivariate analyses were developed for pattern analysis, not impact assessment.
Failure to understand the difference, or to keep it in mind when interpreting
biological data, can lead to errors. We believe that misinterpretation is more
common with multivariate techniques than with the multimetric approach. Cer-
tainly it is easier for people without statistical training to understand the results of
a multimetric analysis. Many authors have covered the use of multivariate  methods
(Wright et al. 1993; Davies et al. 1995; Davies and Tsomides  1997; Walsh 1997), so
we focus on some of the problems associated with their misuse in biological
monitoring.

First, some ordination techniques, including PCA, assume that the data follow a
multivariate normal distribution (Tabachnick and Fidell 1989), which is in fact a rare
pattern in data from biological monitoring. These methods assume smooth con-
tinuous relationships, either linear or simple polynomial, but relationships among
environmental variables are often nonlinear. In multivariate analysis, the numerous
zeros and frequent high abundances typical of biomonitoring data are outliers with
a potentially strong influence on the statistical solution (Gauch 1982; Tabachnick
and Fidell 1989), so the data are often transformed to "fix" departures from nor-
mality, usually without success (Ter Braak 1986). Second, data are often edited (e.g.,
rare taxa are deleted), which may result in omitting important biological informa-
tion (Walsh 1997).

-------
               Third, depending on which variables an analysis includes, multivariate techniques
               may fail to discriminate among important sources of variation, such as natural and
               human-induced variation or variation caused by sampling, subsampling, and error.
               Most multivariate data matrices contain a mix of sites, some with little influence
               from humans, others subject to different degrees of human influence. The matrices
               often mix data from different seasons or from, for example, different stream sizes
               or lake types. Although variables may be similarly confounded in multimetric
               analyses, it is usually easier to recognize and avoid this pitfall because multimetric
               analyses do not rely on computers to "discover" the relevant pattern.

               Finally, multivariate approaches assume that statistically describing maximum
               variation will identify the most meaningful signal about biological condition. But
               because  multivariate methods reduce the dimensionality of the original data by
               extracting or "loading" the maximum amount of variation on successive axes, they
               lose biological information at each step. This problem is compounded if the initial
               choice of biological variables was made without considering whether the variables
               responded across degrees of human influence.

               The most common applications of multivariate statistics rely on lists of taxa and
               their abundances to detect differences among sampled sites or times (Reynoldson
               and Metcalfe-Smith 1992; Norris and Georges 1993; Norris 1995; Pan et al. 1996;
               Reynoldson and Zarull 1993). PCA, for instance, uses mathematical algorithms to
               extract variance from a matrix of species abundances, one of the most variable
               aspects of biology, rather than examining how the animals feed, reproduce, use
               their habitat, or respond to human activities. When species-abundance matrices are
               the focus, important ecological attributes  never even make it into the analysis. The
               combined loss of signal, because major components of biology are
               ignored and because the statistical procedure cannot apportion variation to defin-
               able causes, limits the ability of the most common multivariate applications to
               discern complex patterns and to help investigators understand them.

               In one telling example of the pitfalls of multivariate analyses of species abundances,9
               two investigators advocated excluding rare species, saying that they simply add
               "noise to the community structure signal and . . . little information to the data
               analysis. ... We  recommend excluding all taxa that contribute less than 1% of the
               total number or occur at less than 10% of the sites" (Reynoldson and Rosenberg
               1996: 5; see also Marchant 1989; Norris 1995). Yet the presence of rare taxa indi-
               cates ecological conditions capable of supporting such often sensitive taxa, thereby
               offering  special clues about a site's environmental quality (Karr 1991;
               Courtemanch 1996; Fore et al. 1996).

               9  From the Ninth Annual Technical Information Workshop on study design and data analysis in benthic
                  macroinvertebrate assessments (North American Benthological Society meeting, June 1996).

               Furthermore, comparing the results of PCA using real data with PCA using matrices
               of random numbers shows that the percentage of variation described may be
               similar for both, especially for the second and subsequent principal components;
               that loadings of original variables on principal axes are often as high for random
               numbers as for real data; and that matrix size is an important determinant of the
              amount of variation extracted (Karr and Martin 1981). Multivariate techniques
              were unable to discern known deterministic relationships in one study (Armstrong
              1967), and in another, they manufactured relationships in data sets containing no
              such relationships (Rexstad et al. 1988).
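
The random-number comparison is easy to try on a small scale. The Python sketch below is not the procedure of Karr and Martin (1981); it is a minimal, standard-library illustration that builds matrices of independent random numbers, computes their correlation matrices, and uses power iteration to estimate the share of total variance carried by the first principal component. The matrix sizes, the seed, and all function names are hypothetical.

```python
import random

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows` (a list of lists)."""
    n, p = len(rows), len(rows[0])
    cols = [[row[j] for row in rows] for j in range(p)]
    means = [sum(c) / n for c in cols]
    centered = [[v - m for v in c] for c, m in zip(cols, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(p)]
        for i in range(p)
    ]

def first_pc_fraction(corr, iterations=1000):
    """Share of total variance on the first principal component (power iteration)."""
    p = len(corr)
    vec = [1.0 + 0.01 * i for i in range(p)]  # arbitrary starting vector
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        length = sum(v * v for v in nxt) ** 0.5
        vec = [v / length for v in nxt]
    eigenvalue = sum(
        vec[i] * sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)
    )
    return eigenvalue / p  # the trace of a correlation matrix equals p

def random_matrix(n_sites, n_vars, rng):
    """Matrix of independent standard normal numbers: no real structure at all."""
    return [[rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_sites)]

rng = random.Random(42)
n_vars = 10
for n_sites in (15, 60, 240):
    corr = correlation_matrix(random_matrix(n_sites, n_vars, rng))
    print(n_sites, "sites:", round(first_pc_fraction(corr), 2),
          "vs. even share", 1 / n_vars)
```

Because the columns are independent by construction, any share above the even split of 1/10 is produced by chance correlations alone, and rerunning the sketch with different numbers of sites shows how that share depends on matrix size.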

              PCA reflects the underlying linear correlation (or covariance) among all the
              variables in the matrix. If correlations are absent or weak, then PCA can manufacture
              relationships. The problem can be avoided with a careful examination of the
              correlation matrix before applying PCA. Without careful choice of variables
              conveying reliable signals about biological condition or, as Gotelli and Graves
              (1996) argue, without a comparison of the data against a null model showing
              pattern(s) that would occur in the absence of any effect, multivariate statistics can
              misguide resource assessment efforts. General uses of PCA seldom give results that
              go beyond common sense (Karr and Martin 1981; Fore et al. 1996; Stewart-Oaten
              1996). Gotelli  and Graves (1996: 137) go so far as to  suggest that "multivariate
              analysis has been greatly abused by ecologists. . . . [D]rawing polygons (or amoe-
              bas) around groups of species [or points],  and interpreting the results often
              amounts to ecological palmistry. Ad hoc 'explanations' often are based on the
              original untransformed variables, so that the multivariate transformation offers no
              more insight than the original variables did."

              The key danger of overreliance on multivariate analyses is that management
              decisions may be based on statistical properties of data—on the  structure of a
              covariance matrix—rather than on biological knowledge and understanding. In
              fact, when multivariate analyses examine the same biological attributes used in
              multimetric indexes, they yield essentially identical results (Hughes et al., in press).
              The key message, then, is to use procedures to account for biological impacts, not
              just to describe pattern. Avoid analytical "shortcuts"  that are not easily understood
              or that must be done idiosyncratically for every data set. There is simply no
              substitute, either in multivariate statistics or in multimetric indexes, for careful
              application of biological and ecological knowledge, regardless of analytical tool.
              Careful design of sampling, thoughtful analysis of data, and careful description of
              biological condition can eliminate the need for general approaches that merely
              extract variation.

-------
PREMISE 33
ASSESSING HABITAT CANNOT REPLACE
ASSESSING THE BIOTA

Don't assume that if you build "habitat," the inhabitants will come

In its broadest sense, habitat means the place where an organism lives, including all
its physical, chemical, and biological dimensions; an oak-hickory forest or a cold-
water stream is a habitat. Habitat also refers more narrowly to the physical struc-
ture of an environment. In streams, habitat structure generally means the physical
structure of the channel and near-channel environment. Stream biologists see
habitat structure as a critical component of environmental condition; they view
habitat assessment, which involves measuring physical habitat structure, as a way to
compare present structure with some idealized habitat.

Increasingly, scientists and managers have come to equate the presence of such
idealized habitat with  the presence of an organism; measuring habitat can even
take the place of looking for the living inhabitants. But the presence of a given
habitat structure does  not guarantee the presence of desired biological inhabitants,
any more than chemically clean water guarantees a biologically healthy stream.

Stream habitat features include channel width and stability, water depth, streambed
particle size, current velocity, and flow volume (Gorman and Karr 1978; Rankin
1995). These factors interact to define the mix of pools and  riffles, pattern of
meanders, or braiding characterizing  a stream channel. Width of the riparian area
and floodplain, riparian canopy cover, bank condition, and  woody debris are also
important components of habitat structure.

Habitat assessments focus on such physical features to determine the suitability of
a physical environment for an aquatic biota. In a habitat assessment, managers
may measure the physical habitat directly, as  in the habitat evaluation procedures
developed by the US Fish and Wildlife  Service (USFWS), or they may infer habitat
condition from mathematical models, such as USFWS's in-stream flow incremen-
tal method. Unfortunately, some have used these models to justify spending
millions of dollars  on  "in-stream structure" without assessing biological responses
or even the persistence of those structures in dynamic channels.

But habitat structure, like water quality, is only one of the five factors affected by
human activities in a watershed (see Table 9, page 67).  Severe physical damage to a
stream channel is easy to see and document, but subtle degradation invisible to
human observers may be biologically just as destructive. When resource agencies
measure habitat variables in lieu of testing the response of biological systems to

-------
              human disturbance, they effectively assume that disturbance affects only physical
              habitat and that only visible damage harms the biota.

              Yet measuring habitat structure may not reflect past sediment torrents or debris
              flows from upstream or from a road built along the channel. Habitat assessments
              do not reliably account for how floods or droughts are exacerbated by changes in
              the extent of impervious area in a watershed or the effects of water withdrawals.
              Hyporheic connections, too, are difficult to measure and poorly understood, yet
              the hyporheic zone is a critical refuge for organisms during floods or drought.
              When groundwater flow patterns are altered by water withdrawal, these connec-
              tions are broken; the consequences can be judged only by measuring the condition
              of the  biota. Although simple biotic measures may not detect specific changes in
              the hyporheic zone, a biological change can lead to further investigations to
              identify the cause.

              Measuring physical habitat cannot determine the effects on  resident organisms of
              introduced and alien species, chemical contaminants, changes in temperature, or
              dissolved oxygen. Measuring habitat structure in a stream where an invisible or
              unmeasurable form of water pollution is impairing the biota, for example, could
              lead one to conclude that the biota is healthy when it is not. Measures of stream
              habitat convey an incomplete picture of a stream's biological condition. Sampling
              water quality or habitat structure can aid in interpreting data on biological condi-
              tion; it cannot and should not be used to define biological condition.

              Fishery managers once neglected the physical structure of stream environments or
              considered it unimportant. But simply reversing that view is equally misguided.
              Habitat assessment alone does not capture all the ways that humans influence
              water resources. Using habitat surrogates to draw inferences about biological
              condition does not account for interactions between predators and prey, timing of
              peak or low flows, competition, alien species, or harvesting.

              Worse, to talk of protecting "fish habitat" (or, more extreme, "fishery habitat")
              implies that we know what fish need; it implies that we can "fix" biological condi-
              tion by fixing the habitat—by adding woody debris, building spawning channels, or
              bulldozing to create pools. Yet anadromous fish populations continue to decline in
              the Pacific Northwest despite expensive projects to restore stream channels and
              construct "spawning channels." A stream is more than a collection of habitat types.
              Physical habitat criteria are necessary, but entirely insufficient, to ensure commod-
              ity production of wild salmon, let alone biological integrity.

-------
                                  SECTION V
           MANY CRITICISMS OF MULTIMETRIC INDEXES ARE MYTHS

        The multimetric approach has come under fire from toxicologists,
     ecologists, and water managers on several grounds (Calow 1992; Suter
    1993; Wicklum and Davies 1995). Yet numerous successful applications
        of multimetric biological monitoring and assessment (Yoder 1991a;
       Davis and Simon 1995; Lyons et al. 1995, 1996; Davis et al. 1996),
       explicit responses to the critics (Karr 1993; Simon and Lyons 1995;
       Hughes et al., in press), and the work on which this report is based
   suggest that biological criteria and multimetric indexes constitute robust
    tools for monitoring rivers and streams, especially when compared with
                  the virtual lack of biological monitoring in the past.
                          We explore some of the criticisms here.

-------
           MYTH 1
"BIOLOGY IS TOO VARIABLE TO MONITOR"
              The success of biological monitoring rests on our ability to select good indicators,
              indicators that are sensitive to the underlying conditions of interest (i.e., human
              influence) but insensitive to extraneous factors (Patil 1991). The belief that biology
              is too variable to monitor comes not from a lack of good indicators but from past
              failures to find the right indicators.

              Because studies of naturally variable attributes such as population size, density, and
              abundance have dominated ecology for the better part of a century, resource
              managers as well as ecologists tend to regard biological assessments as less consis-
              tent than chemical assessments. But not all biological attributes vary as much as
              population size, density, and abundance; indeed, attributes such as taxa richness
              yield clear, consistent patterns in response to human actions. The issue, then, is
              not "biology vs. consistency" but, rather, which attributes of biology make sense to
              monitor: Which attributes respond predictably to gradients of human influence?
              Measuring biological attributes that do respond consistently gives important
              insights about the condition of water bodies.

              The sources of variability in data—whether chemical, physical, or biological—must
              be controlled in field sampling protocols and laboratory procedures. Standardized
              lab procedures helped reduce the variability of chemical data but did not eliminate
              it.  In the past decade, major advances have been made to standardize field biologi-
              cal sampling—in particular, to identify those biological attributes whose signal-to-
              noise ratio is high and that respond predictably to human impact.
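
One way to make the idea of a signal-to-noise ratio concrete is to compare the variance among sites with the variance among replicate samples taken within a site. The sketch below is only an illustration under stated assumptions: the replicate values are invented, and the function name `signal_to_noise` is hypothetical rather than part of any monitoring protocol cited in this report.

```python
from statistics import mean, variance

def signal_to_noise(replicates_by_site):
    """Among-site variance of site means divided by mean within-site variance."""
    site_means = [mean(reps) for reps in replicates_by_site.values()]
    within = mean([variance(reps) for reps in replicates_by_site.values()])
    among = variance(site_means)
    return among / within if within > 0 else float("inf")

# Invented replicate samples (three per site) along a disturbance gradient.
taxa_richness = {"least": [30, 32, 31], "moderate": [18, 20, 19], "most": [8, 9, 10]}
density = {"least": [400, 900, 650], "moderate": [300, 850, 700], "most": [500, 200, 750]}

for name, data in (("taxa richness", taxa_richness), ("density", density)):
    print(f"{name}: signal-to-noise = {signal_to_noise(data):.2f}")
```

In this invented example the richness values separate the sites far more clearly than the density values do, which is the property a candidate metric should show before it is adopted.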

              Patterns in biological variability also offer some unexpected insights into  human
              impact. Several studies have observed a correlation between mean and variance in
              IBI (see Premise 14, page 56): as IBI decreases, its variance  increases (Karr et al.
              1987;  Steedman 1988; Rankin and Yoder 1990; Yoder 1991b). This association
              could reflect real changes in the resident biota at degraded  sites, it could be a
              statistical artifact, or it may not be a general phenomenon. Hugueny et al. (1996),
              for example, reported lower variation in IBI at a disturbed  site than at an  upstream
              site. In the Willamette River, Oregon, standard deviations of IBI were highest at
              intermediate values (Hughes et al., in press). Using the bootstrap algorithm, Fore et
              al. (1994) demonstrated that the increased variance of IBI values at degraded sites
              did reflect biological changes in the resident assemblage; this conclusion supports
              the observation that biological systems subjected to high human disturbance are
              less resilient to environmental change. A thoughtful exploration of the specific
              circumstances in each of these cases might clarify these relationships.
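
The bootstrap idea mentioned above can be sketched in a few lines. The example resamples the individuals in one sample, with replacement, to estimate how much a metric would vary under repeated sampling. It is not the procedure of Fore et al. (1994); the sample contents, the choice of taxa richness as the metric, and the function name `bootstrap_sd` are all assumptions made for illustration.

```python
import random
from statistics import stdev

def bootstrap_sd(individuals, metric, n_boot=1000, seed=0):
    """Standard deviation of a metric across bootstrap resamples of one sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(individuals)
    values = []
    for _ in range(n_boot):
        resample = [rng.choice(individuals) for _ in range(n)]
        values.append(metric(resample))
    return stdev(values)

def taxa_richness(sample):
    """Number of distinct taxa present in a sample."""
    return len(set(sample))

# Invented sample: each list entry is one individual, labeled by taxon.
sample = (["mayfly_a"] * 40 + ["caddisfly_a"] * 25 + ["stonefly_a"] * 10
          + ["midge_a"] * 20 + ["worm_a"] * 2)

print("observed richness:", taxa_richness(sample))
print("bootstrap SD of richness:", round(bootstrap_sd(sample, taxa_richness), 3))
```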

-------
Of course, natural variability cannot be separated entirely from human-induced
variability, for human disturbance often exacerbates the effects of natural events
(Schlosser 1990); floods or low flows are often more extreme in damaged water-
sheds, for example (Poff et al. 1997). The higher variability of IBI values observed
at degraded sites (Karr et al.  1987; Steedman 1988; Fore et al. 1994; Yoder and
Rankin 1995b) does point to effects on the sites' biological systems that mirror
physical signs of degradation and suggests that highly variable IBIs may be an
early-warning sign of excessive human impact.

-------
           MYTH 2
"BIOLOGICAL ASSESSMENT IS CIRCULAR"
              Some have complained that IBI development is circular because biologists look at
              a site, decide whether it is degraded or pristine, and then develop metrics and an
              index that show the site to be degraded or pristine as first observed. This view is
              flawed on two levels. On a concrete level, comparison of site condition with a
              regionally defined reference condition and assemblage (not one's own first
              observations) is built into metric testing and index development.

              On a second, more abstract level, index development may appear circular because
              of the interplay of observation and experimentation that lies at the heart of sci-
              ence. Assessing water resources rarely allows replicated experiments; only one
              Puget Sound is available, for example, and controlled experiments at that scale are
              unlikely. Yet the links between certain human activities in watersheds and the
              biological health of the rivers running through those watersheds are clearly visible.
              As knowledge accumulates from repeated observation of real-world patterns, our
              confidence in the generality of those patterns increases.

              Circularity can be avoided through repeated  rigorous documentation of biological
              responses to a wide range of human actions (development of ecological dose-
              response curves) in a wide range of geographic areas. Ecological dose-response
              curves depict patterns that are both qualitative and quantitative, as well as consis-
              tent across a broad range of circumstances. For river fishes, for example, the same
              metrics (see Table 8, page 59) respond to human influence in studies in many
              habitats, under many human impacts, and for many regional assemblages (Miller
              et al. 1988; Lyons 1992a; Lyons et  al. 1995, 1996; Oberdorff and Hughes  1992;
              Hughes et al., in press). The same holds true for invertebrates (see Table 6, page 57;
              Table 7, page 58; and Table 11, page 103). Indeed, many of the same attributes are
              consistent indicators for a variety of faunas (see Table  5, page 52, and Table 11,
              page 103).
              In her study of 115 streams in west-central Japan, Rossano (1995, 1996) convinc-
              ingly demonstrated that IBI development is not circular; her work also verified
              dose-response patterns previously described for North America. Rossano first
              classified all 115 streams according to the type and magnitude of human activity
              within their watersheds (see Figure 4, page 31). After selecting a few streams that
              appeared the best and the worst, she randomly chose half the streams and plotted
              the quantitative values for biological attributes expected to change in those streams
              across her gradient of human influence (see Figure 5, top, page 32). She found
              distinct dose-response curves for some of the plotted metrics, including total taxa

-------
richness, number of intolerant taxa, number of clinger taxa, and relative abundance
of tolerants (see Figure 14, page 42); these attributes also respond to human impact
in North America. Rossano then scored these metrics (see Premise 14, page 56),
summed the scores to yield a B-IBI for each site, and plotted the B-IBI values
against human influence (see Figure 5, top, page 32). Finally, she applied the same
metrics and scoring criteria from the first half of the data set to the other half of
the  115 streams; B-IBIs from both sets of streams followed nearly identical patterns
(see Figure 5, bottom, page 32; Rossano 1995,  1996).
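
The split-sample logic described above can be sketched as follows. This is a minimal illustration, not Rossano's actual analysis: the sites and metric values are invented and deliberately noise-free, the tercile-based 1, 3, 5 scoring is an assumed stand-in for the scoring rules of Premise 14, and every function name is hypothetical. The point is only that scoring criteria are derived from one randomly chosen half of the sites and then applied unchanged to the other half.

```python
import random

def tercile_cutoffs(values):
    """Two cut points that divide the sorted calibration values into thirds."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]

def score(value, cutoffs, declines_with_disturbance):
    """Score a metric 1, 3, or 5; 5 always means closest to least disturbed."""
    low, high = cutoffs
    raw = 1 if value < low else 3 if value < high else 5
    return raw if declines_with_disturbance else 6 - raw

def calibrate(sites, directions):
    """Derive scoring cutoffs for each metric from the calibration sites only."""
    return {m: tercile_cutoffs([s[m] for s in sites]) for m in directions}

def index(site, cutoffs, directions):
    """Sum the metric scores for one site."""
    return sum(score(site[m], cutoffs[m], directions[m]) for m in directions)

# Invented sites ordered from least (0) to most (19) human influence.
sites = [{"influence": i, "total_taxa": 40 - 1.5 * i, "pct_tolerant": 10 + 3 * i}
         for i in range(20)]
directions = {"total_taxa": True, "pct_tolerant": False}  # True = declines with disturbance

rng = random.Random(1)
calibration = rng.sample(sites, len(sites) // 2)        # random half used to set criteria
validation = [s for s in sites if s not in calibration]  # other half, never used for criteria

cutoffs = calibrate(calibration, directions)
for s in sorted(validation, key=lambda s: s["influence"]):
    print(s["influence"], index(s, cutoffs, directions))
```

If the index values of the validation half fall along the influence gradient in the same way as those of the calibration half, the pattern cannot be an artifact of having looked at the sites first.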

Such systematic documentation and testing of metrics in many places and with
many human influences reinforces the validity of those metrics and the resulting
IBIs as accurate yardsticks of human impact.

-------
          MYTH 3
"WE  CAN'T PROVE THAT HUMANS  DEGRADE  LIVING
SYSTEMS WITHOUT KNOWING THE MECHANISM"
             This comment implies that we must understand the means by which something
             happens, not just that it happens, before we can act. We hear this comment from
             two rather different groups. The first is basic natural scientists, who focus on
             process and cause and effect and subscribe to the mantra of α = 0.05 and the null
             hypothesis of no effect (Shrader-Frechette 1996). Rarely have these scientists been
             faced with day-to-day environmental decision making. The second group embraces
             this view as a stalling tactic for overusing ecological systems, sidestepping their
             own responsibility while blaming "science" for knowing too little.

             But where would medicine be now if doctors had to understand how diseases
             worked before treating them or how drugs worked before using them? For centuries,
             people have prevented or cured diseases and alleviated symptoms with drugs, such
             as aspirin, even though they did not know the physiological mechanism by which
             the drugs acted. Modern medicine recognizes and combats viral and bacterial
             diseases without fully understanding how each virus or bacterium does its damage.
             Humans routinely act on the basis of what they see without knowing every mecha-
             nism behind it.
             Of course, we want to know how observed changes come about in biological
             systems altered  by humans. But those mechanistic explanations are not essential
             for using biological monitoring to indicate degradation and find likely causes. The
             number of clinger taxa declines very reliably along gradients of human influence
             (Figure 43),  regardless of what we do or do not know about the  specific mecha-
             nisms responsible. Perhaps fine sediments fill the spaces among cobbles, destroying
             the clinger's physical habitat. Perhaps clingers are more exposed to predators as
             they move out of the sediment-laden spaces. Perhaps upwelling from hyporheic
             zones no longer supplies cool oxygenated water. Perhaps the diverse foods of many
             clinger species are no longer available. Perhaps all these factors are operating.
             Perhaps some other mechanism is responsible. But although the mechanism is not
             documented, the empirical pattern is clear. We would be foolish not to use it to
             detect degradation and to take actions to protect water resources.

-------
FIGURE 43. Number of clinger taxa plotted against a human influence gradient for
Japanese streams. (From Rossano 1995.)
[Scatterplot not reproduced. Vertical axis: number of clinger taxa, with tick marks
at 10 and 20. Horizontal axis: human influence, from low to high.]

-------
          MYTH  4
"INDEXES COMBINE AND  THUS LOSE INFORMATION"
              Because a multimetric index like IBI is a single numeric value, critics have
              assumed that the information associated with the metrics is somehow lost in
              calculating the index itself (USEPA 1985; Suter 1993). Not at all.
              Multimetric indexes condense, integrate, and summarize—not lose—information.
              They comprise the summed response signatures of individual metrics, which
              individually point to likely causes of degradation at different sites (Karr et al. 1986;
              Yoder 1991b; Yoder and Rankin 1995b). Although a single number, the index, is
              used to rank the condition of sites within a region, details about each site—ex-
              pressed in the values of the component metrics—remain (Simon and Lyons  1995).
              It is straightforward to translate these numeric values into words describing the
              precise nature of each component in a multimetric evaluation. These descriptions,
              together with their numeric values, are available for making site-specific assess-
              ments, such as pinpointing sources of degradation (Yoder and Rankin 1995a) or
              identifying which attributes of a biotic assemblage are affected by human activities
              (see Figure 17, page 43).
              At a site in urban Thornton Creek in Seattle, for example, total taxa richness is
              25% of that in a reference stream minimally affected by human activity, Rock Creek in
              rural King County. Thornton Creek has only one mayfly taxon and no caddisflies
              or stoneflies, compared with five, six, and seven taxa of mayflies, caddisflies, and
              stoneflies, respectively, in Rock Creek.  Individuals belonging to tolerant taxa make
              up more than 50% of the individuals in Thornton Creek samples and only 26% in
              Rock Creek samples. Thornton Creek has no long-lived or intolerant taxa, while
              Rock Creek supports four intolerant and two long-lived taxa. Rock Creek has a
              benthic IBI of 44 (maximum 50), whereas Thornton Creek's IBI is only 10 (mini-
              mum 10). Narrative descriptions of the sites as  well as the numeric values for each
              metric and the B-IBI tell us a great  deal about these two streams.
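
The Thornton Creek and Rock Creek comparison can be written as a small record that keeps the component values beside the index and turns them into words. This is a sketch only: the class `SiteAssessment` and its `narrative` method are hypothetical names, and only the taxa counts and B-IBI values quoted in the paragraph above are used.

```python
from dataclasses import dataclass

@dataclass
class SiteAssessment:
    """An index value stored together with the metric values behind it."""
    name: str
    b_ibi: int
    taxa_counts: dict

    def narrative(self):
        """Translate the stored metric values into a short description."""
        parts = []
        for group, count in self.taxa_counts.items():
            if count == 0:
                parts.append(f"no {group} taxa")
            elif count == 1:
                parts.append(f"1 {group} taxon")
            else:
                parts.append(f"{count} {group} taxa")
        return f"{self.name} (B-IBI {self.b_ibi}): " + ", ".join(parts)

# Values quoted in the text for the two streams.
thornton = SiteAssessment("Thornton Creek", 10, {
    "mayfly": 1, "caddisfly": 0, "stonefly": 0, "intolerant": 0, "long-lived": 0})
rock = SiteAssessment("Rock Creek", 44, {
    "mayfly": 5, "caddisfly": 6, "stonefly": 7, "intolerant": 4, "long-lived": 2})

for site in (thornton, rock):
    print(site.narrative())
```

Ranking the two sites needs only `b_ibi`; diagnosing what differs between them uses `taxa_counts`, which the summary number never discards.
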
              Those who advocate multivariate statistical analyses for biological monitoring
              insist that multimetric indexes lose  information selectively. In their view, multivari-
              ate statistics extract biological patterns  from the whole data set. Yet many multi-
              variate analyses exclude rare taxa (see Premise 32, page 112) or examine only
              species lists and abundances, an approach that  overlooks organisms' natural history
              and ecology or the known responses of specific taxa to human actions. Multivari-
              ate statistical algorithms are based on the structure of variance-covariance matrices,
              not on specific knowledge of how organisms develop, find food, reproduce, and
              interact with one another and their physical and chemical surroundings.



-------
Although management decisions can be, and have been, based on multivariate
statistical analyses of biological data (Reynoldson and Zarull 1993; Wright et al.
1993; Davies et al. 1995), the decision process is hardly transparent to anyone who
does not understand the mathematical algorithms or the models' underlying
assumptions. In our view, multivariate statistics' inherent complexity distracts
biologists from making clear, testable statements to one another and to nonscien-
tists about how the biota of a place responds to human influence.

-------
          MYTH  5
"MULTIMETRIC  INDEXES AREN'T EFFECTIVE  BECAUSE THEIR
STATISTICAL PROPERTIES  ARE  UNCERTAIN"
              Although there may have been a basis for this statement in years past, recent work
              on the statistical properties of biological data and of the multimetric index suggests
              that, as for any other procedure, careful program design—from sampling and field
              work to data analysis—can yield data and conclusions that are both biologically
              useful and statistically robust. More important, perhaps, recent work also shows
              that the problems associated with biological data of all kinds can be reduced by
              systematic planning, data collection, and analytical procedures. Conversely, when
              sampling design and data quality are not rigorously controlled, no procedure or
              approach can have known statistical properties.

              In particular, bootstrap analysis of real data has demonstrated that the fish IBI
              approximates a normally distributed random variable (Fore et al.  1994; see Premise
              15, page 63). In this study, the statistical precision of the fish IBI  agrees with data
              collected over periods of two to eight years for both fish and invertebrates
              (Angermeier and Karr 1986; Karr et al. 1987). For example, 13 lowland Puget
              Sound streams were sampled at the same sites in successive years  (1994-95) to
              evaluate between-year variation in the streams when human activities had not
              changed. B-IBI for these streams changed by no more than 4 during that two-year
              study: two sites increased by 2, four decreased by 2, three decreased by 4, and four
              were unchanged. All changed by 10% or less of the range of B-IBI, exceptional
              stability compared with most biological analyses. Similar concordance among years
              was detected in studies in Oregon (R. M. Hughes, pers. commun.).
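
The arithmetic behind the 10% figure can be checked directly. The short sketch below uses the site counts given in the paragraph above and the B-IBI bounds quoted for Myth 4 (minimum 10, maximum 50); the variable names are ours.

```python
# Between-year changes in B-IBI at the 13 Puget Sound lowland streams.
changes = [2] * 2 + [-2] * 4 + [-4] * 3 + [0] * 4

b_ibi_min, b_ibi_max = 10, 50          # bounds quoted for the benthic IBI
index_range = b_ibi_max - b_ibi_min    # 40 points

largest_change = max(abs(c) for c in changes)
percent_of_range = 100 * largest_change / index_range

print(len(changes), "streams; largest change", largest_change,
      f"= {percent_of_range:.0f}% of the index range")
```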

              Statistical properties of multimetric indexes are known (see Premise 15, page 63), as
              are the sources of variation (see Premise 19, page 80). When one knows the sources
              of variation, one can construct studies to limit their influence. Too often biologists
              seek to incorporate all sources of variation rather than design a study to focus on
              the kinds of variation relevant to program goals.

              Biological monitoring has come a long  way since the early 1980s  in identifying the
              biological attributes to measure and in integrating these measures statistically in
              ways precise enough to describe the status and trends of biological systems. The
              declines in living aquatic systems tell us that we cannot afford not to use the tools
              we have or to stop seeking still better ones.

-------
                                                MYTH  6
"A  NONTRIVIAL EFFORT IS  REQUIRED TO  CALIBRATE
                                     THE  INDEX REGIONALLY"
This criticism hinges on the assumption that developing and using a multimetric
biological index costs lots of time and money. True, the required effort is non-
trivial, but how trivial is it to count permits issued, accumulate fines, collect
samples, or produce meaningless "305(b) reports" that are not representative of
regional or national conditions? How much money do agencies spend on these
activities?
In fact, the cost of biological monitoring is often less than that of more conven-
tional approaches (Yoder 1989; Table 12). Most important, the long-term cost of
not doing effective biological monitoring is highest of all—the continued degrada-
tion and ultimate loss of the most valued components of life in our waters. "The
specter of millions of dollars being misspent on environmental controls, without
strong evidence of the efficacy of the treatment, indicates that money spent on
high-quality monitoring programs is money well spent" (Rankin 1995).

Over the past three years, Karr and several graduate students have developed and
implemented region-specific biological standards in small streams and shown that
biological responses to human actions can be documented and generally under-
stood from studies lasting months, not years. Two master's students at the Univer-
sity of Washington each sampled about 30 sites in one year and one season (four
weeks of field work). Each study yielded enough data to define and calibrate a B-
IBI for the Puget Sound lowlands (Kleindl 1995) or Grand Teton National Park
(Patterson  1996). Kleindl and Patterson also required approximately three months
of laboratory time for counting and identifying three replicate benthic invertebrate
samples for each study site.  Thus, geographic calibration can be accomplished
within the  time frame and budget of a master's project. Surely each region's water
resources are worth that level of commitment.

-------
TABLE 12. Comparative costs (in US dollars) of collecting, processing, and analyzing samples to evaluate the
quality of a water resource. (Data from Ohio EPA provided by C. O. Yoder.)

                                                                  Per sample [a]   Per evaluation [b]

Chemical and physical water quality
  4 samples per site                                                      1436              8616
  6 samples per site                                                      2154            12,924

Bioassay
  Screening (acute 48-hour exposure)                                      1191              3573
  Definitive (LC50 [c] and EC50 [d], 48- and 96-hour)                     1848              5544
  Seven-day (acute and chronic effects, 7-day exposure, single sample)    3052              9156
  Seven-day (as above but with composite sample collected daily)          6106            18,318

Macroinvertebrate community                                                824              4120
Fish community                                                             740              3700
Fish and macroinvertebrates combined                                      1564              7820

[a] Cost to sample one location or one effluent; standard evaluation protocols specify multiple samples per location.
[b] Cost to evaluate the impact of an entity; this example assumes sampling five stream sites and one effluent discharge.
[c] Dose of toxicant that is lethal to 50% of the organisms in the test conditions at a specified time.
[d] Concentration at which specified effect (e.g., hemorrhaging, pupil dilation, swimming cessation) is observed in 50% of
    tested organisms.
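
The two cost columns of Table 12 can be related with simple arithmetic. The sketch below transcribes the table and computes the ratio of per-evaluation to per-sample cost for each row. The resulting multipliers (6, 3, and 5) are an observation from the numbers themselves, not something the table states; only the five-sites-plus-one-effluent assumption is given in the table notes. The shortened row labels are ours.

```python
# (per-sample cost, per-evaluation cost) in US dollars, transcribed from Table 12.
costs = {
    "Chemical/physical, 4 samples per site": (1436, 8616),
    "Chemical/physical, 6 samples per site": (2154, 12924),
    "Bioassay, screening": (1191, 3573),
    "Bioassay, definitive": (1848, 5544),
    "Bioassay, seven-day single sample": (3052, 9156),
    "Bioassay, seven-day composite": (6106, 18318),
    "Macroinvertebrate community": (824, 4120),
    "Fish community": (740, 3700),
    "Fish and macroinvertebrates combined": (1564, 7820),
}

ratios = {label: per_eval / per_sample
          for label, (per_sample, per_eval) in costs.items()}

for label, ratio in ratios.items():
    print(f"{label}: evaluation = {ratio:g} x per-sample cost")

# Compare the combined biological evaluation with the cheaper chemical evaluation.
biology = costs["Fish and macroinvertebrates combined"][1]
chemistry = costs["Chemical/physical, 4 samples per site"][1]
print("biological evaluation costs", chemistry - biology, "dollars less")
```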

-------
                                                           MYTH  7
"THE SENSITIVITY OF MULTIMETRIC INDEXES IS  UNKNOWN"
             This statement implies that multimetric indexes cannot discern and separate
            patterns of biological consequence from the noise of variation (natural, sampling,
            crew, seasonal, and so on). But the many examples we cite from scientists and
            managers show that a modest effort by a few people can systematically document
            biological patterns that are useful in research, management, and regulatory con-
            texts. The key is to define ecological dose-response curves for a range of geographic
            areas and diverse human influences (logging, agriculture, recreation, and urbaniza-
            tion). We must connect human actions to biological change.

-------
                                            SECTION VI
                               THE FUTURE IS NOW
   Twenty-five years after passage of the Clean Water Act, we can be thankful that
our rivers no longer catch fire. But the science of biological monitoring is still way
    ahead of the regulatory and policy framework used to manage water resources.
The problem lies not in the letter or spirit of our laws but in a pervasive reluctance
        to shift from a narrow pollution-control mentality to a broader regard for
                                     the biological condition of our waters.
                    Humans tend to fiddle while Rome burns—not deliberately
     but because we react ineptly to complex situations.  Faced with problems that
   exceed our grasp, we pile small error upon small error to arrive at spectacularly
wrong conclusions (Dorner 1996). We did this when we built Egypt's Aswan Dam,
     disrupting a cycle of flooding and Nile Valley fertilization that had sustained
     farmers for millennia; we did it in the series of events leading up to the 1986
            explosion of Reactor 4 at Chernobyl. Are we doomed to do it while
                 our rivers, lakes, wetlands, and oceans get deeper into trouble?

-------
         PREMISE 34
 WE CAN  AND  MUST  TRANSLATE  BIOLOGICAL CONDITION
 INTO  REGULATORY STANDARDS
  We have the knowledge and the know-how to use biological criteria;
  let's stop arguing and use them
 When the 1972 amendments to the Water Pollution Control Act were being
debated in Congress, then-EPA Administrator William Ruckelshaus testified in the
House of Representatives against the House bill. Referring to its general objective
to "restore and maintain . . . chemical, physical, and biological integrity,"
Ruckelshaus stated, "We do not support the new purpose or 'general objective' that
would be provided. The pursuit of natural integrity for its own sake without regard
to the various beneficial uses of water is unnecessary" (Committee on Public Works
1973). Later, after President Nixon had vetoed the amendments, the Senate Committee
on Environment and Public Works worked through 33 days of hearings, 171
witnesses, 470 statements, 6400 pages of testimony, and 45 subcommittee and
full-committee markup sessions, and concluded that "chronic adverse biological
impact may be a greater problem than the acute  results  of discharge of raw sewage
or large toxic spills" (Muskie  1992). The 1972 Water Pollution Control Act amend-
ments finally passed, over the presidential veto, setting the restoration and mainte-
nance of the biological integrity of water as the first of three broad goals.

For Ruckelshaus at the time, apparently, water "use" by  humans was the whole
story, and consumptive uses of water were legitimate while nonconsumptive uses,
such as keeping fish and wildlife alive, recreation, or aesthetics, were not suffi-
ciently "beneficial." Like so many water resource managers before and since, the
EPA administrator saw water as a fluid,  a commodity to be bought and sold, not as
a complex biological system that provides diverse goods and services to society. For
him and his agency, clean water was enough.

Clean water still seems to be enough for many in agency circles. Water resource
managers schooled in the language and dogma of chemical pollution have been
slow to adopt a broader view of resource degradation. Decision makers stay safely
with existing rules and standards, most often interpreting them more narrowly than
even the letter of the law suggests they should be interpreted. The federal and state
agencies responsible for writing regulations, tracking water resource condition, and
creating water-protecting incentives are  reluctant to embrace biological integrity as
a primary goal.

At present, water quality standards—the formalized rules regulators use to protect
water resources—contain three components: designated  uses, criteria, and the
principle of antidegradation.  (The antidegradation goal entered the regulatory

-------
agenda in the 1980s under the broad reasoning that water resource decisions
should allow no further degradation. In theory, the antidegradation philosophy
was supposed to end past acceptance of "dilution is the solution to pollution.")
Under these rules, each state must define designated uses, or goals, for all water
bodies within its boundaries. Criteria—generally numeric and chemical but some-
times narrative and biological (e.g., that conditions be "fishable and swimmable"
or adequate to "protect aquatic life")—are then established on the assumption that
preventing violations of the criteria will protect the designated uses.

Chemical water quality measures, permits issued, and fines levied are still the
primary currencies in most state water quality programs for protecting designated
uses. The lion's share of water resource funding still goes  to controlling point-
source pollution, despite widespread knowledge that nonpoint pollution and
nonchemical factors damage more miles of streams and acres of lakes  than do
point sources (see Table 9, page 67)—and this despite advances in biological moni-
toring that have laid a strong foundation for setting numeric biological criteria. It is
past time to incorporate biological monitoring, and the scientific assessment of
resource condition it produces, into decision making. Biological criteria, and the
regulations to implement them, would be better able to address society's present
values and more appropriate for targeting expenditures to protect the quality of life
in our waters and our communities.

As we have tried to show in this report, when supported by classification to
minimize the heterogeneity of samples, an appropriate number of metrics proven to
vary along a gradient of human influence, and standardized scoring procedures,
multimetric biological monitoring and assessment can give decision makers clear
signals about the condition of water resources—knowledge that is the essential first
step toward wise targeting of expenditures to protect or restore those resources. So
why have only two states incorporated biological monitoring and numeric biologi-
cal criteria into water quality standards? Why have only 15 more begun to develop
such criteria (Davis et al. 1996)—despite calls to do so in the law, the scientific
literature (Karr and Dudley 1981; Davis and Simon 1995), and the government's
own documents (USEPA 1988, 1990, 1996b)?

One may regard the glass as half full or half empty. Virtually no state had biologi-
cal criteria in 1981 when the first multimetric fish IBI appeared (Karr 1981). And
although adoption of numeric biological criteria has been slow (Davis et al. 1996),
the last decade has brought progress: 29 more states now  have narrative biological
water  quality standards, and 11 are developing them. Ohio, for example, has used
the fish IBI and ICI, an invertebrate derivative of the fish IBI, to define two levels
of biocriteria, excellent warm-water habitat and warm-water habitat, expressed as
numeric standards. The criterion for excellent warm-water habitat was initially set
at IBI = 50 for most of Ohio, to protect the state's highest-quality  waters from
additional degradation. Warm-water habitat (IBI > 40) applies to moderately
degraded areas; this criterion is intended to prevent further degradation and
provides an attainable benchmark for restoration of streams in watersheds that
humans have heavily influenced.
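
As a rough illustration of how numeric biocriteria of this kind might be applied to site scores, the sketch below sorts fish IBI values against the two Ohio thresholds quoted above. It assumes that the excellent warm-water criterion acts as a minimum (an IBI of 50 or more) and that the warm-water criterion is a strict lower bound (IBI > 40). The function name, tier labels, and sample scores are hypothetical, and the value of 50 is described above as applying to most, not all, of Ohio.

```python
# Illustrative only: the two thresholds are the values quoted in the text.
# Treating 50 as a minimum score is an assumption of this sketch.
EXCELLENT_WARM_WATER_MIN = 50  # assumed: IBI of 50 or more
WARM_WATER_FLOOR = 40          # from the text: IBI > 40


def classify_ibi(score):
    """Return a hypothetical tier label for a fish IBI score."""
    if score >= EXCELLENT_WARM_WATER_MIN:
        return "excellent warm-water habitat"
    if score > WARM_WATER_FLOOR:
        return "warm-water habitat"
    return "below warm-water criterion"


if __name__ == "__main__":
    # Hypothetical site scores, not data from the report.
    for site, score in [("Site A", 54), ("Site B", 44), ("Site C", 32)]:
        print(f"{site}: IBI {score} -> {classify_ibi(score)}")
```

A fixed cutoff like this says only whether a site meets a criterion; deciding what caused a low score still depends on the individual metrics behind the index.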


              Thus it is hardly farfetched to imagine use of biological criteria in all states. We
              have broad national objectives, reasonable criteria, and multimetric indexes that
              are biologically sound and statistically robust. Isn't it time for researchers and
              policymakers to stop arguing about whether we know enough to act definitively?
              Of course we don't know everything; of course water bodies, like forests, are more
              complicated than we can know. But we know a great deal. Perhaps we would make
              more progress in protecting our waters if researchers all agreed not to ask for
              further funding until regulatory agencies used the knowledge already piled up in
              their archives. Can we look forward to a lull in our research programs?

-------
PREMISE 35
CITIZEN GROUPS ARE CHANGING THEIR THINKING FASTER
THAN BUREAUCRACIES ARE
           Polls and a fast-rising number of grassroots watershed activities clearly show that
           the American people are aware of and concerned about the nation's rivers, lakes,
           wetlands, and oceans. Citizens are more informed scientifically than they were a
           couple of generations ago, and they are increasingly alarmed by what they see
           being lost from our waterways. People across the country identify water pollution
           as the most important environmental issue (e.g., in the Pacific Northwest; Harris
           and Associates 1995). US coastal county and city managers have ranked safe, clean
           drinking water as number one among critical national issues (NOAA press release,
           May  1997, http://www.noaa.gov/public-affairs); indeed, 58% of these managers
           ranked clean water as equal to or more important than health care. In a survey
           conducted for American Rivers,  94% of respondents identified contamination of
           drinking water by sewage and industrial waste as a primary concern.

           Such concerns have sparked thousands of citizen initiatives to monitor water
           quality and river health. The 1996-97 River and Watershed Conservation Directory
           (River Network 1996) lists some  3000 organizations and agencies in the United
           States whose missions directly address river or watershed protection. Mainstream
           organizations from the Izaak Walton League to Trout Unlimited have also ex-
           panded their view of rivers and river health. Local chapters of both these groups
           have  begun to emphasize broader understanding of the causes and treatment of
           river  degradation. New national  organizations are developing as well. These in-
           clude Project GREEN, Adopt-a-Stream Foundation, River Network, and River
           Watch Network (Karr et al. 1998).

           River monitoring done through  the schools has become one of the fastest growing
           elements of volunteer monitoring (USEPA 1994c). Colorado Waterwatch, for
           example, is a partnership of the State Division of Wildlife and teachers and  stu-
           dents at more than 250 schools;  students monitor some 500 stations throughout
           the state of Colorado. In Seattle, Washington, the Thornton Creek Alliance ties
           together the teachers and students in 28 elementary through high schools in a
           network, centered on rivers, with local business and political leaders. Rivers
           provide the theme for interdisciplinary education, and everyone gains a better
           understanding of local landscapes and a stronger sense of community.

           We need not be trapped by our old ways of thinking; rather, we can learn from them

           At the same time, individual scientists and historically conservative scientific
           groups such as the American Fisheries Society, the Ecological Society of America,
              and the North American Benthological Society have expanded their efforts to
              reach governments and citizen groups. The Ecological Society, for example, has
              started a new series of publications, Issues in Ecology, targeted to the press,
              policymakers, and the public. The Benthological Society is establishing liaisons
              with major North American conservation organizations, developing a database of
              professionals willing to share their expertise widely, and selling slides and slide sets
              for use in educational programs.

              A curious, and telling, element in many citizen initiatives is that they are funded in
              part by local, state, and federal governments. King County, Washington, supports
              numerous citizen alliances seeking to learn more about their watersheds. A state-
              wide Governor's Watershed Enhancement Board in Oregon makes substantial
              amounts of money available for local watershed initiatives. EPA has also funded
              numerous local groups  to monitor and restore the condition of rivers. Why, we
              ask, are these agencies not doing more to broaden perspectives in their own ranks?
              Why are they not strengthening their own programs to track biological condition,
              as required under section  301(b) of the Clean Water Act?

              If, as Dorner (1996) argues, failure has its own logic, that logic is seldom more
              obvious than in the workings of our bureaucracies. Humans long ago developed
              the tendency to deal with problems on an  ad hoc basis.  We defined and solved
              problems one at a time; we didn't need to  see a situation embedded in the context
              of other situations; we thought in straight, cause-and-effect lines about one dimen-
              sion at a time. Contemporary decision makers still (Dorner 1996: 18)

              ■  Act without first analyzing the situation.
              ■  Fail to anticipate side effects and long-term repercussions.
              ■  Assume that the absence of immediately obvious negative effects means that
                 correct measures have been taken.
              ■  Let over-involvement in "projects" blind them to emerging needs and changes
                 in the situation.
              ■  Are prone to cynical reactions.

              The inappropriateness of these reactions for solving modern problems is only
              made worse by the difficulty of separating good information from bad when we are
              overloaded with information;  our reluctance to accept new knowledge even when
              we see that it's good; and  defense of the status quo by bureaucracies and other
              vested economic, scientific, and social interests. This kind of approach worked fine
              in simpler, slower times; it doesn't work now in this complex, increasingly high-
              speed world. We need to respond quickly, and correctly, to our present environ-
              mental problems, but bureaucracies seem incapable of fast responses.

              Still, there are no magic solutions for overcoming our plodding ways of dealing
              with complex problems. But it helps to know how we think—that we sometimes
              think badly, that we often become stuck in old  ways when new ways would be far
              better. It helps to realize that facing up to the next century's challenges does not
              necessarily require us to tap into some hitherto fallow 90% of our brain potential;
              rather, it requires the development of our common sense, our flexibility, our ability
to anticipate consequences (Dorner 1996). Albert Einstein put it this way: "You
cannot solve a problem by applying the same conceptual framework that created
the problem." Environmental educator David Orr (1994) says simply, "Think at
right angles."

-------
PREMISE 36
CAN WE AFFORD HEALTHY WATERS? WE CAN AFFORD
NOTHING LESS
              Until all states see protecting biological condition as a central responsibility of
              water resource management, until they see biological monitoring as essential to
              track attainment of that goal and biological criteria as enforceable standards
              mandated by the Clean Water Act, life in the nation's waters will continue to
              decline.

              We are all responsible, and we all need to do better. We must take a broader view
              of the problems we face if we hope to devise effective solutions;  we must also
              explicitly recognize the nature of modern organizational systems and hold them
              accountable (Bella 1997). Citizens need  to increase their understanding of science
              and continue to put pressure on governments to act. Scientists need to strengthen
              their biological monitoring approaches,  talk with neighbors and  relatives, write
              outside of technical publications, and dare to speak up in the realm of day-to-day
              decision making. Managers need to reexamine "the way it's always been done" and
              do what works to keep waters alive. Agency administrators need to allocate funding
              inside their own agencies to programs that actually protect water resources. They
              should refocus their own professional energies on activities they  are funding citizen
              watershed groups to do.

              "Can we afford rivers and lakes  and streams and oceans, which continue to make
              life possible on this planet?" We must answer Edmund Muskie's  question with a
              resounding yes.

-------
   SECTION VII
LITERATURE CITED

-------
Adamus, P. R.  1996. Bioindicators for assessing ecological
   integrity of prairie wetlands. EPA/600/R-96/082. US En-
   vironmental Protection Agency, National Health and En-
   vironmental Effects Research Laboratory, Western Ecol-
   ogy Division, Corvallis, OR.
Allan, J. D., and A. S. Flecker. 1993. Biodiversity conserva-
   tion in running waters. Bioscience 43: 32-43.
Allan, J. D., D. L. Erickson, and J. Fay. 1997. The influence
   of catchment land use on stream integrity across multiple
   spatial scales. Freshwater Biol. 37: 149-161.
Angermeier, P. L., and J. R. Karr. 1986. Applying an index of
   biotic integrity based on stream fish communities: Con-
   siderations in sampling and interpretation. N. Am. J. Fish.
   Manage. 6: 418-429.
Angermeier, P. L., and J. R. Karr. 1994. Biological integrity
   versus biological diversity as policy directives. Bioscience
   44: 690-697.
Angermeier, P. L.,  and I. J.  Schlosser. 1995. Conserving
   aquatic biodiversity. Am. Fish.  Soc. Symp. 17: 402-414.
Angermeier, P. L., and R. A. Smogor. 1995. Estimating num-
   ber of species and relative abundances in stream-fish com-
   munities: Effects of sampling effort and discontinuous
   spatial distribution. Can. J. Fish. Aquat. Sci. 52: 936-949.
Armitage, P. D., D. Moss, J. F. Wright, and M. T. Furse. 1983.
   The performance of a new biological water quality score
   system based on macroinvertebrates over a wide range of
   unpolluted running-water sites. Water Res. 17: 333-347.
Armstrong, J. S. 1967. Derivation of theory by means of fac-
   tor analysis, or Tom Swift and his electric factor analysis
   machine. Am. Stat. 21: 17-21.
Auerbach, A. J. 1982. The index of leading indicators: "Mea-
   surement without theory," thirty-five years later. Rev. Econ.
   Stat. 64: 589-595.
Augspurger, C.  1996. Editor's note. Ecology 77: 1698.
Bahls,  L. L. 1993. Periphyton bioassessment methods for
   Montana streams. Water Quality Bureau, Department of
   Health and Environmental Sciences, Helena, MT.
Ballentine, R. K., and L. J. Guarraia, eds. 1977. The Integrity of
   Water: A Symposium. US Environmental Protection
   Agency, Washington, DC.
Barbour, M. T., and J. Gerritsen. 1996. Subsampling of benthic
   samples: A defense of the fixed-count method. J. N. Am.
   Benthol. Soc. 15: 386-391.
Barbour, M. T., J. L. Plafkin, B. P. Bradley, C. G. Graves, and
   R. W. Wisseman. 1992. Evaluation of EPA's rapid
   bioassessment benthic metrics: Metric redundancy and
   variability among reference stream sites. Environ. Toxicol.
   Chem. 11: 437-449.
Barbour, M. T., J. B. Stribling, and J. R. Karr. 1995.
   Multimetric approach for establishing biocriteria and
   measuring biological condition. Pages 63-77 in W. S. Davis
   and T. P. Simon, eds. Biological Assessment and Criteria:
   Tools for Water Resource Planning and Decision Making. Lewis,
   Boca Raton, FL.
Barbour, M. T., J. Gerritsen, G. E. Griffith, R. Frydenborg,
   E. McCarron, and J. S. White. 1996a. A framework for
   biological criteria for Florida streams using benthic
   macroinvertebrates. J. N. Am. Benthol. Soc. 15: 185-211.
Barbour, M. T., J. B. Stribling, J. Gerritsen, and J. R. Karr.
   1996b. Biological criteria: Technical guidance for streams
   and small rivers. EPA 822-B-96-001. US Environmental
   Protection Agency, Washington, DC.
Barbour, M. T., J. Gerritsen, B. D. Snyder, and J. B. Stribling.
   In press. Revision to Rapid bioassessment protocols for
   use in streams and rivers: Periphyton, benthic
   macroinvertebrates, and fish. EPA 841-D-97-002. US En-
   vironmental Protection Agency, Washington, DC.
Beals, E. W. 1973. Ordination: Mathematical elegance and
   ecological naivete. J. Ecol. 61: 23-35.
Bella, D. E. 1997. Organizational systems and the burden of
   proof. Pages 617-638 in D. J. Stouder, P. A. Bisson,  and R.
   J. Naiman, eds. Pacific Salmon and Their Ecosystems: Status
   and Future Options. Chapman and  Hall, New York.
Bisson, P. A., T. P. Quinn, G. H. Reeves, and S. V. Gregory.
   1992. Best management practices, cumulative effects, and
   long-term trends in  fish abundance in Pacific Northwest
   river systems. Pages  189-232 in R. J. Naiman, ed.  Water-
   shed Management: Balancing Sustainability and Environmen-
   tal Change. Springer-Verlag, New York.
Botkin, D.  B. 1990. Discordant Harmonies. Oxford Univer-
   sity Press, New York.
Bottom, D. L. 1997. To till the water: A history of ideas in
   fisheries conservation. Pages 569-597 in D. J. Stouder, P.
   A. Bisson, and R. J. Naiman, eds. Pacific Salmon and Their
   Ecosystems: Status and Future Options. Chapman and Hall,
   New York.
Boyle, T. P., G. M. Smillie, J. C. Anderson, and D. P. Beeson.
   1990. A sensitivity analysis of nine diversity and seven
   similarity indices. J. Water Pollut. Control Fed. 62: 749-762.
Bradford, D. F., S. E. Franson, A. C. Neale, D. T. Heggem,
   G. R. Miller, and G. E. Canterbury. In press. Bird species
   assemblages as indicators of biological integrity in Great
   Basin rangeland. Environ. Manage. Assess.
Brooks, R. P., and R. M. Hughes. 1988. Guidelines for assess-
   ing the biotic communities of freshwater wetlands. Pages
   276-282 in J. A. Kusler, M. L. Quammen, and G. Brooks,
   eds. Proceedings of the National Wetland Symposium: Mitiga-
   tion of Impacts and Losses. Association of State Wetland
   Managers, Berne, NY.
Calow, P. 1992. Can ecosystems be healthy? Critical consid-
   erations of concepts. J. Aquat. Ecosyst. Health 1: 1-5.
Carlson, C. A., and R. T. Muth. 1989. The Colorado  River:
   Lifeline of the American Southwest. Can. Spec. Publ. Fish.
   Aquat. Sci. 106: 220-239.
Casella, G.,  and R. L. Berger. 1990. Statistical Inference.
   Wadsworth, Belmont, CA.
Chu, E. W. 1997. Why assess ecological risk? Environ. Health
   News, winter: 3,9. Department of Environmental Health,
   University of Washington, Seattle.
Chutter, F. M. 1972. An empirical biotic index of the quality
   of water in South African streams and rivers. Water Resour.
   6: 19-30.
Colborn, T. E., and C. Clement, eds. 1992. Chemically in-
   duced alterations in sexual and functional development:
   The wildlife-human connection. Advances in Modern En-
   vironmental Toxicology 21. Princeton Scientific, Princeton.
Colborn, T. E., A. Davidson, S. N. Green, R. A. Hodge, C. I.
   Jackson, and R. A. Liroff. 1990. Great Lakes, Great Legacy?
   Conservation Foundation, Washington, DC.
Colborn, T. E., D. Dumanoski, and J. P. Myers. 1996. Our
   Stolen Future: Are We Threatening Our Fertility, Intelligence,
   and Survival? A Scientific Detective Story. Dutton, New York.
Committee on Public Works. 1973. A legislative history of
   the Water  Pollution Control Act Amendments  of 1972
   together with a section-by-section index, vol. 1, serial no.
   93-1. Environmental  Policy Division, Congressional Re-
   search Service, Library of Congress. US Government Print-
   ing Office, Washington, DC.
Costanza, R., and 12 others. 1997. The value of the world's
   ecosystem  services and natural capital. Nature 387: 253-
   260.
Courtemanch, D. L. 1996. Commentary on the subsampling
   procedure used for rapid bioassessments. J. N. Am. Benthol.
   Soc. 15: 381-385.
CRESP (Consortium for Risk Evaluation with Stakeholder
   Participation).  1996. CRESP at one year: March 1995-
   1996. Department of Environmental Health, University
   of Washington, Seattle.
Croonquist, M. J., and R. P. Brooks. 1991. Use of avian and
   mammalian guilds as indicators of cumulative impacts in
   riparian-wetland areas. Environ. Manage. 15: 701-704.
Cuffney, T. F., M. E. Gurtz, and M. R. Meador. 1993. Meth-
   ods for collecting benthic invertebrate samples as part of
   the national water-quality assessment program. US Geol.
   Surv. Open File Rep. 93-406.
Cummins, K. W. 1974. Structure and function of stream eco-
   systems. Bioscience 24: 631-641.
Cummins, K. W., M. A. Wilzbach, D. M. Gates, J. B. Perry,
   and W. B. Taliaferro. 1989. Shredders and riparian veg-
   etation. Bioscience 39: 24-30.
Cummins, K. W., C. E. Cushing, and G. W. Minshall. 1995.
   Introduction: An overview of stream ecosystems. Pages
   1-10 in C. E. Cushing, K. W. Cummins, and G. W.
   Minshall, eds. River and Stream Ecosystems. Elsevier, New
   York.
Cushman, R. M. 1984. Chironomid deformities as indica-
   tors of pollution from a synthetic coal-derived oil. Fresh-
   water Biol. 14: 179-182.
Daily, G. C., ed. 1997. Nature's Services: Societal Dependence on
   Natural Ecosystems. Island Press, Washington, DC.
Daubenmire, R. 1970. Steppe vegetation of Washington.
   Wash. Agric. Exp. Stn. Tech. Bull. 63.
Davies, S. P., and L. Tsomides. 1997. Methods for biological
   sampling and analysis of Maine's inland waters. DEP-
   LW107-A97. Maine Department  of Environmental Pro-
   tection, Augusta.
Davies, S. P., L. Tsomides, D. L.  Courtemanch, and F.
   Drummond. 1995.  Maine biological monitoring and
   biocriteria development program. Maine Department of
   Environmental Protection, Bureau of Land and Water
   Quality, Division of Environmental Assessment, Augusta.
Davis, W. S. 1995. Biological assessment and criteria: Building
   on the past. Pages 15-29 in W. S. Davis and T. P. Simon,
   eds. Biological Assessment and Criteria: Tools for Water Resource
   Planning and Decision Making. Lewis, Boca Raton, FL.
Davis, W. S., and T. P. Simon, eds. 1995. Biological Assessment
   and Criteria: Tools for Water Resource Planning and Decision
   Making. Lewis, Boca Raton, FL.
Davis, W. S., B. D. Snyder, J. B. Stribling, and C. Stoughton.
   1996. Summary of state biological assessment programs
   for streams and rivers. EPA 230-R-96-007. Office of Policy,
   Planning, and Evaluation, US Environmental Protection
   Agency, Washington, DC.
Deegan, L. A., J. T. Finn, S. G. Ayvazian, and C. Ryder. 1993.
   Feasibility and Application of the Index of Biotic Integrity to
   Massachusetts Estuaries (EBI). Massachusetts Executive Of-
   fice of Environmental Affairs, Department of Environ-
   mental Protection, North Grafton.
Deegan, L. A., J. T. Finn, S. G. Ayvazian, C. A. Ryder-Kieffer,
   and J. Buonaccorsi. 1997. Development and validation of
   an estuarine biotic integrity index. Estuaries 20: 601-617.
DeShon, J. E. 1995. Development and application of the
   invertebrate community index (ICI). Pages 217-244 in
   W. S. Davis and T. P. Simon, eds. Biological Assessment and
   Criteria: Tools for Water Resource Planning and Decision Mak-
   ing. Lewis, Boca Raton, FL.
Dorner, D. 1996. The Logic of Failure: Why Things Go Wrong
   and What We Can Do to Make Them Right. Holt, New York.
Dufrene, M., and P. Legendre. 1997. Species assemblages and
   indicator species: The  need for a flexible asymmetrical
   approach. Ecol. Monogr. 67:  345-366.
Ebel, W. J., C. D. Becker, J. W. Mullan, and H. L. Raymond.
   1989. The Columbia River: Toward a holistic understand-
   ing. Can. Spec. Publ. Fish. Aquat. Sci. 106: 205-219.
Ellis, J. I., and D. C. Schneider. 1997. Evaluation of a gradi-
   ent sampling design for environmental impact assessment.
   Environ. Monit. Assess. 48: 157-172.
Engle, V. D., J. K.  Summers, and G. R. Gaston. 1994. A
   benthic index of environmental condition of Gulf of
   Mexico estuaries. Estuaries 17: 372-384.
Fausch, K. D., J. R.  Karr, and P. R. Yant. 1984. Regional ap-
   plication of an index of biotic integrity based on stream
   fish communities. Trans. Am. Fish. Soc. 113: 39-55.
Fausch, K. D.,  J. Lyons, J. R. Karr, and P. L. Angermeier.
   1990. Fish communities as indicators of environmental
   degradation. Am. Fish. Soc. Symp. 8: 123-144.
Fauth, J. E., J. Bernardo, M. Camara, W. J. Resetarits, Jr., J.
   Van Buskirk, and S. A. McCollum. 1996. Simplifying the
   jargon of community ecology: A conceptual approach.
   Am. Nat. 147: 282-286.
Florida  DEP (Department of Environmental Protection).
   1996. Standard Operating Procedures for Biological Assessment.
   Florida Department of Environmental Protection, Talla-
   hassee.
Ford, J.  1989. The effects of chemical stress on aquatic  spe-
   cies composition and community structure. Pages 99-144
   in S.  A.  Levin, M.  A.  Harwell, J. R. Kelly, and  K. D.
   Kimball, eds. Ecotoxicology: Problems and Approaches.
   Springer-Verlag, New York.
Fore, L. S., J. R. Karr, and L. L. Conquest. 1994. Statistical
   properties of an index of biotic integrity used to evaluate
   water resources. Can. J. Fish. Aquat. Sci. 51: 1077-1087.
Fore, L. S., J. R. Karr, and R. W. Wisseman. 1996. Assessing
   invertebrate responses to human activities: Evaluating
   alternative approaches. J. N. Am. Benthol. Soc. 15: 212-
   231.
Frey, D. G. 1977. Biological integrity of water: An historical
   approach. Pages 127-140 in R. K. Ballentine and L. J.
   Guarraia, eds. The Integrity of Water: A Symposium. US
   Environmental Protection Agency, Washington, DC.
Frissell,  C. A. 1993. Topology of extinction and endanger-
   ment  of native fishes in the Pacific Northwest and Cali-
   fornia (USA). Conserv. Biol. 7: 342-354.
Gammon, J. R. 1976. The fish populations of the middle
   340 km of the Wabash River. Purdue Univ. Wat. Resour.
   Ctr. Tech. Rep. 86.
Gammon, J. R., A. Spacie, J. L. Hamelink, and R. L. Kaesler.
   1981. Role of electrofishing in assessing environmental
   quality of the Wabash River. Pages 307-324 in J. M. Bates
   and C. I. Weber, eds. Ecological Assessments of Effluent Im-
   pacts on Communities of Indigenous Aquatic Organisms. STP
   730. American Society of Testing and Materials, Philadel-
   phia.
Gauch, H. G. 1982. Multivariate Analysis in Community Ecol-
   ogy. Cambridge University Press, Cambridge, UK.
Gerritsen, J. 1995. Additive biological indices for resource
   management. J. N. Am. Benthol. Soc. 14: 451-457.
Goodall, D. W. 1954. Objective methods for the classifica-
   tion of vegetation. III. An essay in the use of factor analy-
   sis. Aust. J. Bot. 2: 304-324.
Gorman, O. T., and J. R. Karr. 1978. Habitat structure and
   stream fish communities. Ecology 59: 507-515.
Gotelli, N. J., and G. R. Graves. 1996. Null Models in Ecology.
   Smithsonian Institution Press, Washington, DC.
Green, R. H. 1979. Sampling Design and Statistical Methods for
   Environmental Biologists. Wiley, New York.
Greenfield, D. W., F. Abdel-Hameed, G. D. Deckert, and R. R.
   Flinn. 1973. Hybridization between Chrosomus erythrogaster
   and Notropis cornutus (Pisces: Cyprinidae). Copeia 1973:
   54-60.
Gregory, S. V., and P. A. Bisson. 1997. Degradation and loss
   of anadromous salmonid habitat in the Pacific North-
   west. Pages 277-314 in D. J. Stouder, P. A. Bisson, and R.
   J. Naiman, eds. Pacific Salmon and Their Ecosystems: Status
   and Future Options. Chapman and Hall, New York.
Hager, M., and L. Reibstein. 1997. The cell from hell: Pfiesteria
   strikes again—in the Chesapeake Bay. Newsweek, 25 Au-
   gust: 63.
Hamilton, A. L., and O. A. Saether. 1971. The occurrence of
   characteristic deformities in the chironomid larvae of sev-
   eral Canadian lakes. Can. Entomol. 103: 363-368.
Hannah, L., D. Lohse, C. Hutchinson, J. L. Carr, and A.
   Lankerani. 1994. A preliminary inventory of human dis-
   turbance of world ecosystems. Ambio 23: 246-250.
Harris, L., and Associates. 1995. A survey on environmental
   issues in the Northwest. Bellingham Herald, 23 April: A-1.
Hartwell, S. I., C. E. Dawson, E. Q. Durell, R. W. Alden, P.
   C. Adolphson, D. A. Wright, G. M. Coelho, J. A. Magee,
   S. Ailstock, and M. Norman. 1997. Correlation of mea-
   sures of ambient toxicity and fish community diversity
   in Chesapeake Bay, USA, tributaries: urbanizing water-
   sheds. Environ. Toxicol.  Chem.  16: 2556-2567.
Hesse, L. W., J. C. Schmulbach, J. M. Carr, K. D. Keenlyne,
   D. G. Unkenholz, J. W. Robinson, and G. E. Mestl. 1989.
   Missouri River fishery resources in relation to past, present,
   and future status. Can. Spec. Publ. Fish. Aquat. Sci. 106:
   352-371.
Hilborn, R. 1997. Statistical hypothesis testing and decision
   theory in fisheries science. Fisheries 22(10): 19-20.
Hilborn, R., and M. Mangel. 1997. The Ecological Detective:
   Confronting Models with Data. Princeton University Press,
   Princeton.
Hilsenhoff, W. L. 1982. Using a biotic index to evaluate water
   quality in streams. Wis. Dep. Nat. Res. Tech. Bull. 132.
Howarth, R. W. 1991. Comparative responses of aquatic eco-
   systems to toxic chemical stress. Pages 169-195 in J. Cole,
   G. Lovett, and S. Findlay, eds. Comparative Analyses of Eco-
   systems: Patterns, Mechanisms, and Theories. Springer-Verlag,
   New York.
Hubbs,  C. L.  1961. Isolating mechanisms in the speciation
   of fishes. Pages 5-23 in W. F. Blair, ed. Vertebrate Specia-
   tion. University of Texas Press, Austin.
 Hughes, R. M. 1985. Use of watershed characteristics to se-
   lect control streams for estimating effects of metal min-
   ing wastes on extensively disturbed streams. Environ.
   Manage. 9: 253-262.
 Hughes, R. M. 1995. Defining acceptable biological status
   by comparing with reference conditions. Pages 31-48 in
   W. S. Davis and T. P. Simon, eds. Biological Assessment and
   Criteria: Tools for Water Resource Planning and Decision Mak-
   ing. Lewis, Boca Raton, FL.
 Hughes, R. M., and J. R. Gammon. 1987. Longitudinal changes
   in fish assemblages and water quality in the Willamette River,
   Oregon. Trans. Am. Fish. Soc. 116: 196-209.
 Hughes, R. M., and R. F. Noss. 1992. Biological diversity
   and biological integrity:  current concerns for lakes and
   streams. Fisheries 17(3): 11-19.
 Hughes, R. M., and 15 others. 1993. Development of lake
   condition indicators for EMAP: 1991  pilot. Pages 7-90
   in D. P. Larsen and S. J. Christie, eds. EMAP: Surface
   Waters 1991 Pilot Report. EPA-620-R-93-003. Office of Re-
   search and Development, US Environmental Protection
   Agency, Corvallis, OR.
 Hughes, R. M., L. Reynolds, P. R. Kaufmann, A. T. Herlihy,
   T. Kincaid, and D. P. Larsen. In press. Development and
   application of an index of fish assemblage integrity for
   wadeable streams in the Willamette Valley, Oregon, USA.
   Can. J. Fish. Aquat. Sci.
 Hugueny, B., S. Camara, B. Samoura, and M. Magassouba.
   1996. Applying an index of biotic integrity based on fish
   assemblages in  a  West African river. Hydrobiologia 331:
   71-78.
 Hurlbert, S. H. 1971. The nonconcept of species diversity.
   Ecology 52: 577-586.
 Huston, M. A. 1994. Biological Diversity: The Coexistence of
   Species on Changing Landscapes. Cambridge University
   Press, New York.
Jacobson, J. L., and S. W. Jacobson. 1996. Intellectual impairment
   in children exposed to polychlorinated biphenyls
   in utero. N. Engl. J. Med. 335: 783-789.
Jacobson, J. L., S. W. Jacobson, and H. E. B.  Humphrey.
   1990. Effects of in utero exposure to polychlorinated biphenyls
   and related contaminants on cognitive functioning
   in young children. J. Pediatrics 116: 38-45.
James, F. C., and C. E. McCulloch. 1990. Multivariate analysis
   in ecology and systematics: Panacea or Pandora's box?
   Annu. Rev. Ecol. Syst. 21: 129-166.
Jenkins, R. E., and N. M. Burkhead. 1994. The Freshwater
   Fishes of Virginia.  American Fisheries Society, Bethesda,
   MD.
Jennings, M. J., L. S. Fore, and J. R. Karr. 1995. Biological
   monitoring of fish assemblages in Tennessee Valley reservoirs.
   Regul. Rivers Res. Manage. 11: 263-274.
 Karr, J. R.  1981. Assessment of biotic integrity using  fish
   communities. Fisheries 6(6): 21-27.
Karr, J. R. 1987. Biological monitoring and environmental
   assessment: A conceptual framework. Environ. Manage.
   11:249-256.
Karr, J. R. 1991. Biological integrity: A long-neglected as-
   pect of water resource management. Ecol. Appl. 1: 66-84.
Karr, J. R. 1993. Measuring biological integrity: Lessons from
   streams. Pages 83-104 in S. Woodley, J. Kay, and G.
   Francis, eds. Ecological Integrity and the Management of Ecosystems.
   St. Lucie Press, Delray Beach, FL.
Karr, J. R. 1994. Thinking about salmon landscapes. Pages
   2-12 in M. Keefe, ed. Salmon Ecosystem Restoration: Myth
   and Reality. American Fisheries Society, Corvallis, OR.
Karr, J. R. 1995a. Risk assessment: We need more  than an
   ecological veneer. Hum. Ecol. Risk Assess. 1: 436-442.
Karr, J. R. 1995b. Clean water is not enough. Illahee 11: 51-59.
Karr, J. R. 1996. Ecological integrity and ecological health
   are not the same. Pages 100-113 in P. Schulze, ed. Engi-
   neering within Ecological Constraints. National Academy
   Press, Washington, DC.
Karr, J. R. 1997. Seeking Suitable Endpoints: Biological Monitoring
   and Biological Criteria for Wetland Assessment. US Environmental
   Protection Agency, Seattle.
Karr, J. R. 1998 (in press). Rivers as sentinels: Using the biology
   of rivers to guide landscape management. In R. J.
   Naiman and R. E. Bilby, eds. The Ecology and Management
   of Streams and Rivers in the Pacific Northwest  Coastal
   Ecoregion. Springer-Verlag, New York.
Karr, J. R., and D. R. Dudley. 1981. Ecological perspective
   on water quality goals. Environ. Manage. 5: 55-68.
Karr, J. R., and F. C. James. 1975. Eco-morphological configurations
   and convergent evolution in species and communities.
   Pages 258-291 in M. L. Cody and J. M. Diamond,
   eds. Ecology and Evolution of Communities. Harvard
   University Press, Cambridge, MA.
Karr, J. R., and B. L. Kerans. 1992. Components of biological
   integrity: Their definition and use in development of
   an invertebrate IBI. Pages 1-16 in T. P. Simon and W. S.
   Davis, eds. Environmental Indicators: Measurement and As-
   sessment Endpoints. EPA 905/R-92/003. US Environmen-
   tal Protection Agency, Chicago.
Karr, J. R., and T. E. Martin. 1981. Random numbers and
   principal components: Further searches for the unicorn.
   Pages 20-24 in D. Capen, ed. The use of multivariate
   statistics in studies of wildlife habitat. US For. Serv. Gen.
   Tech. Rep. RM-87.
Karr, J. R., R. C. Heidinger, and E. H. Helmer. 1985a. Sensitivity
   of the index of biotic integrity to changes in chlorine
   and ammonia levels from wastewater treatment facilities.
   J. Water Pollut. Control Fed. 57: 912-915.
Karr, J. R., L. A. Toth, and D. R. Dudley. 1985b. Fish communities
   of midwestern rivers: A history of degradation.
   Bioscience 35: 90-95.
Karr, J. R., K. D. Fausch, P. L. Angermeier, P. R. Yant, and I.
   J. Schlosser. 1986. Assessment of biological integrity in
   running waters: A method and its rationale. Illinois Nat.
   Hist. Surv. Spec. Publ. 5.
Karr, J. R., P. R. Yant, and K. D. Fausch. 1987. Spatial and
   temporal variability of the index of biotic integrity in three
   midwestern streams. Trans. Am. Fish. Soc. 116: 1-11.
Karr, J. R., D. N. Kimberling, and M. A. Hawke. 1997. Mea-
   suring ecological health, assessing ecological risks: Using
   the index  of biological integrity  at Hanford (a prelimi-
   nary report). Ecological Health Task Group, Consortium
   for Risk Evaluation with Stakeholder Participation, University
   of Washington, Seattle.
Karr, J. R., J. D. Allan, and A. C. Benke.  1998 (in press).
   River conservation in the United States and Canada. In
   P. J. Boon, B. R. Davies, and G. E. Petts, eds. Global Per-
   spectives on River Conservation. Wiley, London, UK.
Keeler, A. G., and D. McLemore. 1996. The value of incor-
   porating bioindicators in economic approaches to water
   pollution control. Ecol. Econ. 19: 237-245.
Kentucky DEP (Department of Environmental  Protection).
   1993. Methods for assessing biological integrity of sur-
   face waters. Kentucky Department of Environmental Pro-
   tection, Division of Water, Frankfort.
Kerans, B. L., and J. R. Karr. 1994. A  benthic index of biotic
   integrity (B-IBI) for rivers of the  Tennessee Valley. Ecol.
   Appl. 4: 768-785.
Kerans, B. L., J. R. Karr, and S. A. Ahlstedt. 1992. Aquatic
   invertebrate assemblages: Spatial and temporal differences
   among sampling protocols. J. N. Am. Benthol. Soc. 11: 377-390.
Kiffney, P. M., and W. H. Clements. 1994. Effects of heavy
   metals on a macroinvertebrate assemblage  from a Rocky
   Mountain stream in experimental microcosms. J. N. Am.
   Benthol. Soc. 13: 511-523.
Kleindl, W. J.  1995. A benthic index of biotic integrity for
   Puget Sound lowland streams, Washington, USA.  MS
   thesis, University of Washington,  Seattle.
Klemm, D. J., P. A. Lewis, F. Fulk, and J. M. Lazorchak. 1990.
   Macroinvertebrate field and laboratory methods for evalu-
   ating the biological integrity of surface waters. EPA-600-
   4-90-030. Environmental Monitoring and Support Labo-
   ratory, US Environmental Protection Agency, Cincinnati.
Klemm, D. J., P. A. Lewis, F. Fulk, and J. M. Lazorchak. 1993.
   Fish field and laboratory methods for evaluating the bio-
   logical integrity of surface waters. EPA-600-R-92-111. US
   Environmental Protection Agency, Environmental Moni-
   toring and Support Laboratory, Cincinnati.
Knopman, D. S., and R. A. Smith. 1993. Twenty years of the
   Clean Water Act. Environment 35(1): 16-20, 34-41.
Kolkwitz, R., and M. Marsson. 1908. Ökologie der pflanzlichen
   Saprobien. Ber. Dtsch. Bot. Ges. 26a: 505-519. (Translated
   1967. Ecology of plant saprobia. Pages 47-52 in L.
   E. Keup, W. M. Ingram, and K. M. Mackenthun, eds.
   Biology of Water Pollution. Federal Water Pollution Control
   Administration, Washington, DC.)
Larsen, D. P. 1995. The role of ecological sample surveys in
   the implementation of biocriteria. Pages 287-300 in W.
   S. Davis and T. P. Simon, eds. Biological Assessment and
   Criteria: Tools for Water Resource Planning and Decision Mak-
   ing. Lewis Publishing, Boca Raton,  FL.
Larsen, D. P., J. M. Omernik, R. M. Hughes, C. M. Rohm,
   T. R. Whittier, A. J. Kinney, A. L. Gallant, and  D.  R.
   Dudley. 1986. The correspondence between spatial pat-
   terns in fish assemblages in Ohio streams  and aquatic
   ecoregions. Environ. Manage. 10: 815-828.
Lenat, D. R. 1988. Water quality assessment of streams using
   a qualitative collection method for benthic macroinvertebrates.
   J. N. Am. Benthol. Soc. 7: 222-233.
Lenat, D. R. 1993. A biotic index for the southeastern United
   States: Derivation and list of tolerance values, with criteria
   for assigning water quality ratings. J. N. Am. Benthol.
   Soc. 12: 279-290.
Lenat, D. R., and D. L. Penrose. 1996. History of the EPT
   taxa richness metric. J. N. Am. Benthol. Soc. 13: 305-307.
Ludwig, J. A., and J. F. Reynolds. 1988. Statistical Ecology.
   Wiley, New York.
Lyons, J. 1992a. Using the index of biotic integrity (IBI) to
   measure environmental quality in warmwater streams of
   Wisconsin.  US For. Serv. Gen. Tech. Rep. NC-149.
Lyons, J. 1992b. The length of stream to sample with a towed
   electrofishing unit when fish species richness is estimated.
   N. Am. J. Fish. Manage. 12: 198-203.
Lyons, J., S. Navarro-Perez, P. A. Cochran, E. Santana C.,
   and M. Guzman-Arroyo. 1995. Index of biotic integrity
   based on fish assemblages for the conservation of streams
   and rivers in west-central  Mexico. Cons. Biol. 9: 569-584.
Lyons, J., L. Wang, and T. D. Simonson. 1996. Development
   and validation of an index of biotic integrity for
   coldwater streams in Wisconsin. N. Am. J. Fish. Manage.
   16: 241-256.
MacDonald, L. H., A. Smart, and R. C. Wissmar.  1991.
   Monitoring guidelines  to evaluate effects of forestry ac-
   tivities on streams in the Pacific Northwest  and Alaska.
   EPA/910/9-91-001. US  Environmental Protection Agency,
   Seattle.
Magurran, A. E. 1988. Ecological Diversity and Its Measure-
   ment. Princeton University Press, Princeton.
Marchant, R. 1989. A subsampler for samples of benthic
   invertebrates. Bull. Aust. Soc. Limnol. 12: 49-52.
Master, L. 1990. The imperiled  status of North American
   aquatic animals. Biodiversity Network News (Nature Con-
   servancy) 3(3): 1-2, 7-8.
McAllister, D. E., A. L. Hamilton, and B. Harvey. 1997. Glo-
   bal freshwater biodiversity: Striving for the integrity of
   freshwater ecosystems.  Sea Wind 11(3): 1-140.
McFarland, B. H., B. H. Hill, and W. T. Willingham. 1997.
   Abnormal Fragilaria spp. (Bacillariophyceae) in streams
   impacted by mine drainage. J. Freshwater Ecol. 12: 141-149.
Meador, M. R., R. F. Cuffney, and M. E. Gurtz. 1993. Methods
   for sampling fish communities as part of the national
   water-quality assessment program. US Geol. Surv. Open File
   Rep. 93-104.
Meffe, G. K. 1992. Techno-arrogance and halfway technolo-
   gies: Salmon hatcheries on the Pacific coast of North
   America. Conserv. Biol. 6: 350-354.
Megahan, W.  F., J. P. Potyondy, and K. A. Seyedbagheri.
   1992. Best management practices and cumulative effects
   from sedimentation in the South Fork Salmon River: An
   Idaho case study. Pages 401-441 in R. J. Naiman, ed.
   Watershed Management: Balancing Sustainability and Envi-
   ronmental Change. Springer-Verlag, New York.
Miller, D. L., and 13 others. 1988. Regional applications of
   an index of biotic integrity for use in water resource man-
   agement. Fisheries 13(5): 12-20.
Miller, R. R., J. D. Williams, and J. E. Williams. 1989. Ex-
   tinctions of North American fishes during the past cen-
   tury. Fisheries 14(6): 22-38.
Minns, C. K., V. W. Cairns, R. G. Randall, and J. E. Moore.
   1994. An index of biotic integrity (IBI) for fish assemblages
   in the littoral zone of Great Lakes areas of concern.
   Can. J. Fish. Aquat. Sci. 51: 1804-1822.
Minshall, G. W., R. C. Petersen, K. W. Cummins, T. L. Bott,
   J. R. Sedell, C. E. Cushing, and R. L. Vannote. 1983.
   Interbiome comparison of stream ecosystem dynamics.
   Ecol. Monogr. 51: 1-25.
Mitchell, W. C., and A. F. Burns. 1938. Statistical Indicators of
   Cyclical Revivals. National Bureau of Economic Research,
   New York.
Mosteller, F., and J. W. Tukey. 1977. Data Analysis and Regression.
   Addison-Wesley, Reading, MA.
Moyle, P. B., and R. A. Leidy. 1992. Loss of aquatic ecosystems:
   Evidence from fish faunas. Pages 127-169 in P. L.
   Fiedler and S. K. Jain, eds. Conservation Biology: The Theory
   and Practice of Nature Conservation, Preservation, and Management.
   Chapman and Hall, New York.
Moyle, P. B., and J. E. Williams. 1990. Biodiversity loss in
   the temperate zone: Decline of the  native fish fauna of
   California. Conserv. Biol. 4: 275-284.
Murtaugh, P. A. 1996. The statistical evaluation of ecologi-
   cal indicators. Ecol. Appl. 6: 132-139.
Muskie, E. S. 1972. Senate consideration of the report of the
   Conference Committee, October 4, 1972. Amendment
   of the Federal Water Pollution Control Act. US Govern-
   ment Printing Office, Washington, DC.
Muskie, E. S. 1992. Testimony of Edmund S. Muskie before
   the Committee  on Environment and Public Works, on
   the Twentieth Anniversary of Passage of the Clean Water
   Act. September 22, 1992. Reprinted as S. Doc. 104-17;
   Memorial Tribute Delivered in Congress, Edmund S.
   Muskie, 1914-1996. US Government Printing Office,
   Washington, DC.
Nehlsen, W., J. E. Williams, and J. A. Lichatowich. 1991.
   Pacific salmon at the crossroads: Stocks at risk from Cali-
   fornia, Oregon, Idaho, and Washington. Fisheries 16(2):
   4-21.
Norris, R. H. 1995. Biological monitoring: The dilemma of
   data analysis. J. N. Am. Benthol. Soc. 14: 440-450.
Norris, R. H., and A. Georges. 1993. Analysis and interpre-
   tation of benthic  surveys. Pages 234-286 in D. M.
   Rosenberg and V. H. Resh, eds. Freshwater Biomonitoring
   and Benthic Macroinvertebrates. Chapman and Hall, New
   York.
NRC (National Research Council). 1983. Risk Assessment in
   the Federal Government: Managing the Process. National Acad-
   emy Press, Washington, DC.
NRC (National Research  Council). 1994. Science and Judg-
   ment in Risk Assessment. National Academy Press, Wash-
   ington, DC.
NRC (National Research Council). 1996. Understanding Risk.
   National Academy Press, Washington, DC.
Oberdorff, T., and R. M. Hughes. 1992. Modification of an
   index of biotic integrity based on fish assemblages to char-
   acterize rivers of the Seine-Normandie basin,  France.
   Hydrobiologia 228: 117-130.
Ohio EPA (Environmental Protection Agency). 1988. Biological
   Criteria for the Protection of Aquatic Life, volumes 1-3.
   Ecological Assessment Section, Division of Water Quality
   Monitoring and Assessment, Ohio Environmental
   Protection Agency, Columbus.
Olsen, A. R., J. Sedransk, D. Edwards, C. A. Gotway, W.
   Leggett, S. Rathbun, K. H. Reckhow, and L. J. Young. In
   press. Statistical issues for monitoring ecological and natu-
   ral resources in the United States. Environ. Monit. Assess.
Omernik, J. M. 1995. Ecoregions: A spatial framework for
   environmental management. Pages 49-62 in W. S. Davis
   and T. P.  Simon, eds. Biological Assessment and Criteria:
   Tools for Water Resource Planning and Decision Making. Lewis,
   Boca Raton, FL.
Omernik, J. M., and R. G. Bailey. 1997. Distinguishing between
   watersheds and ecoregions. J. Am. Wat. Res. Assoc.
   33: 935-949.
Orr, D. W. 1994. Earth in Mind: On Education, Environment,
   and the Human Prospect.  Island Press, Washington, DC.
Osenberg, C. W., R.  J. Schmitt, S. J. Holbrook, K. E.
   Abu-Saba, and A. R. Flegal. 1994. Detection of environ-
   mental impacts: Natural variability, effect size, and power
   analysis. Ecol. Appl. 4: 16-30.
Pacific Rivers Council. 1995. A call for a  comprehensive
   watershed and wild fish conservation program in eastern
   Oregon and Washington, 2d ed. Pacific Rivers Council,
   Eugene, OR.
Paller, M. H. 1995a. Relationships among number of fish
   species sampled, reach length surveyed, and sampling ef-
   fort in South Carolina coastal plain streams. N. Am. J.
   Fish. Manage. 15: 110-120.
Paller, M. H. 1995b. Interreplicate variance and statistical
   power of electrofishing data from low-gradient streams in
   the southeastern United States. N. Am. J. Fish. Manage.
   15: 542-550.
Pan, Y., R. J. Stevenson, B. H. Hill, A. T. Herlihy, and C. B.
   Collins. 1996. Using diatoms as indicators of ecological
   conditions in lotic systems: A regional assessment. J. N.
   Am. Benthol. Soc. 15: 481-494.
Patil, G. P. 1991. Encountered data, statistical ecology, envi-
   ronmental statistics, and weighted distribution methods.
   Environmetrics 2: 377-423.
Patrick, R. 1992. Surface Water Quality: Have the Laws Been
   Successful? Princeton University Press, Princeton, NJ.
Patterson, A. J. 1996. The effect of recreation on biotic in-
   tegrity of small streams in Grand Teton National Park.
   MS thesis, University of Washington, Seattle.
Peterman, R. M. 1990. Statistical power analysis can improve
   fisheries research and management. Can. J. Fish. Aquat.
   Sci. 47: 2-15.
Pielou, E. C.  1975. Ecological Diversity. Wiley, New  York.
Pimentel, D., C. Wilson, C. McCullum, R. Huang, P. Dwen,
   J. Flack, Q. Tran, T. Saltman, and B. Cliff. 1997. Economic
   and environmental benefits of biodiversity. Bioscience 47:
   747-757.
Pimm, S. L. 1991. The Balance of Nature: Ecological Issues in the
   Conservation of Species and Communities. University of Chi-
   cago Press, Chicago.
Pinel-Alloul, B., G. Methot, L. Lapierre, and A. Willsie. 1996.
   Macroinvertebrate community as  a biological indicator
   of ecological and toxicological factors in Lake Saint-
   Francois (Quebec). Environ. Poll. 91: 65-87.
Plafkin, J. L., M. T. Barbour, K. D. Porter, S. K. Gross, and R.
   M. Hughes. 1989. Rapid bioassessment protocols for use
   in streams and rivers: Benthic macroinvertebrates  and fish.
   EPA/440/4-89-001. Assessment and Water Protection Di-
   vision, US Environmental Protection Agency, Washing-
   ton, DC.
Poff, N. L., J. D. Allan, M. B. Bain, J. R. Karr, K. L. Prestegaard,
   B. D. Richter, R. E. Sparks, and J. C. Stromberg. 1997.
   The natural flow regime: A paradigm for river conservation
   and restoration. Bioscience 47: 769-784.
Potvin, C., and J. Travis, eds. 1993. Statistical methods: An
   upgrade for biologists. Ecology 74: 1614-1676.
Preston, F. W. 1962. The canonical distribution of common-
   ness and rarity, part I. Ecology 43: 185-215.
Rankin, E. T.  1995. Habitat indices in water resource quality
   assessments. Pages 181-208 in W. S. Davis and T. P. Simon,
   eds. Biological Assessment and Criteria: Tools for Water Resource
   Planning and Decision Making. Lewis, Boca Raton, FL.
Rankin, E. T., and C. O. Yoder. 1990. The nature of sampling
   variability in the index of biotic integrity in Ohio
   streams. Pages 9-18 in W. S. Davis, ed. Proceedings of the
   1990 Midwest Pollution Control Biologists Meeting. EPA 905-9-90-005.
   Environmental Sciences Division, US Environmental
   Protection Agency, Chicago.
Rexstad, E. A., D. D. Miller, C. H. Flather, E. M. Anderson,
   J. H. Hupp, and  D. R. Anderson. 1988.  Questionable
   multivariate statistical inference in wildlife habitat and
   community studies. J. Wildl. Manage. 52: 794-798.
Reynoldson, T. B., and J. L. Metcalfe-Smith. 1992. An overview
   of the assessment of aquatic ecosystem health using
   benthic invertebrates. J. Aquat. Ecosyst. Health 1: 295-308.
Reynoldson, T. B., and D. M.  Rosenberg.  1996. Sampling
   strategies and practical considerations in building refer-
   ence data bases for the prediction of invertebrate com-
   munity structure. Pages 1-31 in R. C. Bailey, R. H. Norris,
   and B. Reynoldson, eds. Study Design and Data Analysis in
   Benthic Macroinvertebrate Assessments of Freshwater Ecosys-
   tems Using a Reference Site Approach. Technical Information
   Workshop, North American Benthological Society,
   Kalispell, MT.
Reynoldson, T. B., and M. A. Zarull. 1993.  An approach to
   the development of biological sediment guidelines. Pages
   177-200 in S. Woodley, J. Kay, and G. Francis, eds. Eco-
   logical Integrity and the Management of Ecosystems. St. Lucie
   Press, Delray Beach, FL.
Richards, C. L., L. B. Johnson, and G. E. Host. 1996. Landscape-scale
   influences on stream habitats and biota. Can.
   J. Fish. Aquat. Sci. 53(suppl. 1): 295-311.
Richards, C. L., R. J. Haro, L. B. Johnson, and G. E. Host.
   1997. Catchment and reach-scale properties as indicators
   of macroinvertebrate species traits. Freshwater Biol. 37:
   219-230.
Rickard, W. H., and R. H. Sauer. 1982. Self-revegetation of
   disturbed ground in the deserts of Nevada and Washing-
   ton. Northwest Sci. 56: 41-47.
Risk Commission (Presidential/Congressional Commission
   on Risk Assessment and Risk Management). 1997. Frame-
   work for Environmental Health Risk Management.  Presiden-
   tial/Congressional Commission on Risk Assessment and
   Risk Management, Washington, DC.
River Network. 1996.  1996-1997 River and Water Conserva-
   tion Directory. To-the-Point Publications, Portland, OR.
Rivera, M., and C. Marrero. 1994. Determinacion de la
   calidad de las aguas en las cuencas hidrograficas, mediante
   la utilizacion del indice de integridad biotica (IIB).
   Biollania 11: 127-148.
Rodriguez-Olarte, D., and D. C. Taphorn.  1994. Los peces
   como indicadores biologicos: Aplicacion del indice de
   integridad biotica en ambientes acuaticos de los llanos
   occidentales de Venezuela. Biollania 11: 27-56.
Rogers, L. E., R. E. Fitzner, L. L. Cadwell, and B. E. Vaughan.
   1988. Terrestrial animal habitats and population responses.
   Pages 182-250 in W. H. Rickard, L. E. Rogers, B. E.
   Vaughan, and S. F. Liebetrau, eds. Balance and Change in a
   Semi-arid Terrestrial Ecosystem. Elsevier, New York.
Rossano, E. M. 1995. Development of an index of biologi-
   cal integrity for Japanese streams (IBI-J). MS thesis, Uni-
   versity of Washington, Seattle.
Rossano, E. M. 1996. Diagnosis of Stream Environments with
   Index of Biological Integrity (in Japanese and English).  Mu-
   seum of Streams and Lakes, Sankaido Publishers, Tokyo.
Roth, N. E., J. D. Allan, and D. E. Erickson. 1996. Land-
   scape influences on stream biotic integrity assessed at
   multiple  spatial scales. Landscape Ecol. 11: 141-156.
Rowe, C. L., O. M. Kinney, A. P. Fiori, and J. D. Congdon.
   1996. Oral deformities in tadpoles (Rana catesbeiana) associated
   with coal ash deposition: Effects on grazing ability
   and growth. Freshwater Biol. 36: 723-730.
SAB  (Science Advisory Board). 1990. Reducing Risk: Setting
   Priorities and Strategies for Environmental Protection.
   SAB-EC-90-021. US Environmental Protection Agency,
   Washington, DC.
Schelske, C. L. 1984. In situ and natural phytoplankton as-
   semblage bioassays. Pages 15-47 in L. E. Shubert, ed. Al-
   gae As Ecological Indicators. Academic Press, London.
Schindler, D. W. 1987. Determining ecosystem responses to
   anthropogenic stress. Can. J. Fish. Aquat. Sci. 44(suppl. 1):
   6-25.
Schindler, D. W. 1990. Experimental perturbations of whole
   lakes as tests of hypotheses concerning ecosystem struc-
   ture and function. Oikos 57: 25-41.
Schlosser, I. J. 1990. Environmental  variation, life history
   attributes, and community structure in stream fishes:
   implications for environmental management and assess-
   ment. Environ. Manage. 14: 621-628.
Schmitt, R. J., and C. W. Osenberg, eds. 1996. Detecting Eco-
   logical Impacts:  Concepts and Applications in Coastal Habi-
   tats. Academic Press, San Diego, CA.
Seattle Times. 1996. Surface water getting dirtier: Uphill battle
   cleaning rivers, streams, lakes, says state. 10 July: B3.
Shrader-Frechette, K. 1996. Methodological rules for four
   classes of scientific uncertainty. Pages 12-39 in J. Lemons,
   ed. Scientific Uncertainty and Environmental Problem
   Solving. Blackwell Science, Cambridge, MA.
Simberloff, D., D. C. Schmitz, and T. C. Brown, eds. 1997.
   Strangers  in Paradise: Impact and Management of Non-
   indigenous Species in Florida. Island Press, Washington, DC.
Simon, T. P., ed. In press. Assessing the Sustainability and Bio-
   logical Integrity of Water Resource Quality Using Fish Assem-
   blages. CRC Press, Boca Raton, FL.
Simon, T. P., and J. Lyons. 1995. Application of the index of biotic
   integrity to evaluate water resource integrity in freshwater ecosystems.
   Pages 245-262 in W. S. Davis and T. P. Simon, eds.
   Biological Assessment and Criteria: Tools for Water Resource Planning
   and Decision Making. Lewis, Boca Raton, FL.
 Sokal, R. R., and F. J. Rohlf. 1981. Biometry, 2d ed. Freeman,
   New York.
 Statzner, B., H. Capra, L. W. G. Higler, and A. L. Roux.
   1997. Focusing environmental management budgets on
   non-linear system responses: potential for significant im-
   provements to freshwater ecosystems. Freshwater Biol. 37:
   463-472.
 Steedman, R. J. 1988.  Modification and assessment of an
   index of biotic integrity to quantify stream quality in
   southern Ontario. Can. J. Fish. Aquat. Sci. 45: 492-501.
 Stemberger, R. S., and J. M. Lazorchak. 1994. Zooplankton
   assemblage responses to disturbance gradients. Can. J. Fish.
   Aquat. Sci. 51: 2435-2447.
 Stemberger, R. S., A. T. Herlihy, D. L. Kugler, and S. G.
   Paulsen. 1996. Climatic forcing on zooplankton richness
   in lakes of the  northeastern United  States. Limnol.
   Oceanogr. 41: 1093-1101.
 Stewart-Oaten, A.  1996. Goals in environmental monitor-
   ing. Pages 17-28 in R. J. Schmitt and C. W. Osenberg,
   eds. Detecting Ecological Impacts: Concepts and Applications
   in Coastal Habitats. Academic Press, San Diego, CA.
 Stewart-Oaten, A., W. W. Murdoch, and K. R. Parker. 1986.
   Environmental impact assessment:  "Pseudoreplication"
   in time? Ecology 67: 929-940.
 Stewart-Oaten, A., J. R. Bence, and C. W. Osenberg. 1992.
   Assessing effects of unreplicated perturbations: No simple
   solutions. Ecology 73: 1396-1404.
 Summers, J. K., and V. Engle.  1993. Evaluation of sampling
   strategies to characterize dissolved oxygen conditions in
   Gulf of Mexico estuaries. Environ. Monit. Assess. 24: 219-
   229.
 Summers, K., L. Folmar, and  M. RodonNaveira. 1997.  De-
   velopment  and testing of bioindicators for monitoring
   the condition of estuarine  ecosystems. Environ. Monit.
   Assess. 47: 275-301.
 Suter, G. W. 1993. A critique of ecosystem health concepts
   and indexes. Environ. Toxicol. Chem.  12: 1533-1539.
 Swift, B. L. 1984. Status of riparian ecosystems in the  United
   States. Water Resour. Bull. 20: 233-238.
Tabachnick, B. G., and L. S. Fidell.  1989. Using Multivariate
   Statistics, 2d ed. HarperCollins, New York.
Tait, C. K., J. L. Li, G. A. Lamberti, T. N. Pearsons, and H.
   W. Li. 1994. Relationships between riparian cover and
   the community structure of high desert streams. J. N. Am.
   Benthol. Soc. 13: 45-56.
Ter Braak, C. J. F. 1986. Canonical correspondence analysis:
   A new eigenvector technique for multivariate direct  gra-
   dient analysis. Ecology 67: 1167-1179.
Thoma, R. F. 1990. A preliminary assessment of Ohio's Lake
   Erie estuarine fish communities.  Division of Water Qual-
   ity Planning and Assessment, Ecological Assessment Sec-
   tion, Ohio Environmental Protection Agency, Columbus.
 Thompson, B. A., and G. R. Fitzhugh. 1986. A use attainability
   study: An evaluation of fish and macroinvertebrate
   assemblages of the Lower Calcasieu River, Louisiana. LSU-
   CFI-29. Center for Wetland Resources, Coastal Fisheries
   Institute, Louisiana State University, Baton Rouge. (See
   Miller et al. 1988 for a synopsis of this study.)
 Thompson, P. B. 1995. The Spirit of the Soil: Agriculture and
   Environmental Ethics. Routledge, London.
 Thomson, J. D., G. Weiblen, B. A. Thomson, S. Alfaro, and
   P. Legendre. 1996. Untangling multiple factors in spatial
   distributions: lilies, gophers, and rocks. Ecology 77:1698-
   1715.
 Thorne, R. St. J., and W. P. Williams. 1997. The response of
   benthic invertebrates to pollution in developing coun-
   tries: A multimetric system of bioassessment. Freshwater
   Biol. 37: 671-686.
 Tufte, E. R. 1983. The Visual Display of Quantitative Information.
   Graphics Press, Cheshire, CT.
 Tufte,  E. R. 1990. Envisioning Information. Graphics Press,
   Cheshire, CT.
 Tufte, E. R. 1997. Visual Explanations. Graphics Press,
   Cheshire, CT.
 Underwood, A. J. 1991. Beyond BACI: Experimental designs
   for detecting human environmental impacts on temporal
   variations in natural populations. Aust. J. Mar. Freshwater
   Res. 42: 569-587.
 Underwood, A. J.  1994. On beyond BACI: Sampling de-
   signs that might reliably detect environmental distur-
   bances. Ecol. Appl. 4: 3-15.
 USEPA. 1985. Technical Support Document for Conducting Use
   Attainability Studies. Office of Water Regulations and Stan-
   dards, Office  of Water, US Environmental Protection
   Agency, Washington, DC.
 USEPA. 1988. WQS Draft Framework for the Water Quality Standards
   Program. Draft 11-8-88. Office of Water, US Environmental
   Protection Agency, Washington, DC.
 USEPA. 1990. Biological Criteria: National Program Guidance
  for Surface Waters. EPA 440-5-90-004. Office of Water Regu-
   lations and Standards, US  Environmental Protection
   Agency, Washington, DC.
 USEPA. 1992a. National Water Quality Inventory: 1990 Report
   to Congress.  EPA-503/9-92/006. US Environmental Pro-
   tection Agency,  Washington, DC.
 USEPA. 1992b. Framework for Ecological Risk Assessment. EPA/630/R-92/001.
   Risk Assessment Forum, US Environmental
   Protection Agency, Washington, DC.
USEPA. 1994a. Ecological Risk Assessment Issue Papers. EPA/
   630/R-94/009. Risk Assessment Forum, Office of Research
   and Development, US Environmental Protection Agency,
   Washington, DC.
USEPA. 1994b. Peer Review Workshop Report on Ecological Risk
   Assessment Issue Papers. EPA/630/R-94/008. Risk Assess-
   ment Forum, Office of Research and Development, US
   Environmental Protection Agency, Washington, DC.
 USEPA. 1994c. National Directory of Volunteer Environmental
   Monitoring Programs. EPA 841-B-94-001. Office of Water,
   US Environmental Protection Agency, Washington, DC.
 USEPA. 1995. National Water Quality Inventory: 1994 Report
   to  the Congress. US Environmental Protection Agency,
   Washington, DC.
 USEPA. 1996a. National listing of fish and wildlife consumption
   advisories. EPA-823-F-96-006 (four-page fact sheet),
   EPA-823-C-96-001 (five PC diskettes). US Environmen-
   tal Protection Agency, Washington, DC.
 USEPA. 1996b. Liquid Assets: A Summertime Perspective on the
   Importance of Clean Water to the Nation's Economy. EPA 800-
   R-96-002. Office of Water, US Environmental Protection
   Agency, Washington, DC.
 USEPA. 1996c. Environmental Indicators of Water Quality in
   the United States. EPA 841-R-96-002. US Environmental
   Protection Agency, Washington, DC.
 USEPA. 1996d. Proposed guidelines for ecological risk as-
   sessment: Notice. FRL-5605-9. Federal Register 61:47552-
   47631.
 van Belle, G., G. S. Omenn, E. M. Faustman, C. W. Powers,
   J. A.  Moore, and B. D. Goldstein. 1996. Dealing with
   Hanford's  legacy. Wash. Publ. Health 14:  16-21.
 Vannote, R. L., G. W. Minshall, K. W. Cummins, J. R. Sedell,
   and C. E. Cushing. 1980. The river continuum concept.
   Can. J. Fish. Aquat. Sci. 37: 130-137.
 Vinson, M. R., and C. P. Hawkins. 1996. Effects of sampling
   area and subsampling procedure on comparisons of taxa
   richness among streams. J. N. Am. Benthol. Soc. 15: 392-399.
 Walsh, C. J. 1997. A multivariate method  for determining
   optimal subsample size in the analysis of macro-
   invertebrate samples. Mar. Freshwater Res. 48: 241-248.
 Wang, L., J. Lyons, P. Kanehl, and R. Gatti.  1997. Influences
   of watershed land use on habitat quality and biotic integ-
   rity in Wisconsin streams. Fisheries 22(6): 6-12.
 Ward, R. C., and J. C. Loftis.  1989. Monitoring systems for
   water quality. Crit. Rev. Environ. Control 19:  101-118.
Warwick, W. F., and N. A. Tisdale. 1988. Morphological deformities
   in Chironomus, Cryptochironomus, and Procladius
   (Diptera: Chironomidae) from two differentially stressed
   sites in Tobin Lake, Saskatchewan. Can. J. Fish. Aquat. Sci.
   45: 1123-1144.
Warwick, W. F., J. Fitchko, P. M. McKee, D. R. Hart, and A.
   J. Bunt. 1987. The incidence of deformities in Chironomus
   spp. from Port Hope Harbour, Lake Ontario. J. Great Lakes
   Res. 13: 88-92.
Washington, H. G. 1984. Diversity, biotic and similarity in-
   dices: A review with special relevance to aquatic ecosys-
   tems. Water Res. 18: 653-694.

-------
Water Quality 2000. 1991. Challenges for the Future: Interim Re-
   port. Water Pollution Control Federation, Alexandria, VA.
Weaver, M. J., and L. A. Deegan. 1996. Extension of the
   estuarine biotic integrity index across biogeographic re-
   gions (abstract). Bull. Ecol. Soc. Am. (suppl.) 77(3): 472.
Weaver, M. J., J. J. Magnuson, and M. D. Clayton. 1993.
   Analyses for differentiating littoral fish assemblages with
   catch data from multiple sampling gears.  Trans. Am. Fish.
   Soc. 122: 1111-1119.
Weisberg, S. B., J. A. Ranasinghe, L. C. Schaffner, R. J. Diaz,
   D. M. Dauer, and J. B. Frithsen. 1997. An estuarine benthic
   index of biotic integrity (B-IBI) for Chesapeake Bay. Es-
   tuaries 20: 149-158.
Whittier, T. R. 1998. Development of IBI metrics for lakes
   in southern New England. In T. P. Simon, ed. Assessing the
   Sustainability and Biological Integrity of Water Resource Qual-
   ity Using Fish Assemblages. CRC Press,  Boca Raton, FL.
Whittier, T. R., R. M. Hughes, and D. P. Larsen. 1988. The
   correspondence between ecoregions and  spatial patterns
   in stream ecosystems in Oregon. Can. J. Fish. Aquat. Sci.
   45: 1264-1278.
Whittier, T., D. B. Halliwell, and S. G. Paulsen. 1997a. Cyp-
   rinid distributions  in northeast USA  lakes: Evidence of
   regional-scale minnow biodiversity losses. Can. J. Fish.
   Aquat. Sci. 54: 1593-1607.
Whittier, T. R., P. Vaux, G. D. Merritt, and R. B. Yeardley, Jr.
   1997b. Fish sampling. In J. R. Baker,  D. V. Peck, and D.
   W. Sutton, eds. Environmental Monitoring and Assessment
   Program, Surface Waters: Field Operations Manual for Lakes.
   EPA/620/R-97/001.  US Environmental  Protection
   Agency, Washington, DC.
White, R. J., J. R. Karr, and W. Nehlsen. 1995. Better roles
   for fish  stocking in aquatic resource  management. Am.
   Fish. Soc. Symp. 15: 527-547.
Wicklum, D., and R. W. Davies. 1995. Ecosystem health and
   integrity? Can. J. Bot. 73: 997-1000.
Wilcove, D. S., and M. J. Bean, eds. 1994. The Big Kill: Declining
   Biodiversity in America's Lakes and Rivers. Environmental
   Defense Fund, Washington, DC.
Wilhm, J. L., and T. C. Dorris. 1968. Biological parameters
   for water quality criteria. Bioscience 18: 477-481.
Williams, J. D.,  M. L. Warren, Jr., K. S. Cummings, J. L.
   Harris, and R. J. Neves. 1993. Conservation status of fresh-
   water mussels of the United States and Canada. Fisheries
   18(9): 6-22.
Williams, J. E., and R. R. Miller. 1990. Conservation status
   of the North American fish fauna in fresh water. J. Fish
   Biol. 37(suppl. A): 79-85.
Williams, J. E., and R. J. Neves. 1992. Biological diversity in
   aquatic management. Trans. N. Am. Wildl. Nat. Res. Conf.
   57: 343-432.
Williams, J. E., J. E. Johnson,  D. A. Hendrickson, S.
   Contreras-Balderas, J. D. Williams, M. Navarro-Mendoza,
   D. E. McAllister, and J. E. Deacon. 1989. Fishes of North
   America endangered, threatened, or of special concern:
   1989. Fisheries 14(6): 2-20.
Williams, J. E., C. A. Wood, and M. P. Dombeck, eds.  1997.
   Watershed Restoration: Principles and Practices. American Fish-
   eries Society, Bethesda, MD.
Williamson, M. H. 1981. Island Populations. Oxford Univer-
   sity Press, Oxford, UK.
Winterbourn,  M. J., J. S. Rounick, and B. Cowie. 1981. Are
   New Zealand stream ecosystems really different? N. Z. J.
   Mar. Freshwater Res. 15: 321-328.
Wolda, H. 1981. Similarity indices, sample size, and diver-
   sity. Oecologia 50: 296-302.
Wright, J.  F., M. T.  Furse, and P. D. Armitage. 1993.
   RIVPACS: A technique for evaluating the biological quality
   of rivers in the UK. Eur. Water Pollut. Control 3: 15-25.
Yoccoz, N. G. 1991. Use, overuse, and misuse of significance
   tests in evolutionary biology and ecology. Bull. Ecol. Soc.
   Am. 71: 106-111.
Yoder, C. O. 1989. The development and use  of biological
   criteria for Ohio surface waters. Pages 139-146 in G. H.
   Flock, ed. Water Quality Standards for the 21st Century. Of-
   fice of Water,  US  Environmental Protection Agency,
   Washington, DC.
Yoder, C. O. 1991a. Answering some questions  about bio-
   logical criteria based on experiences in Ohio. Pages 95-
   104 in Water Quality Standards for the 21st Century. US En-
   vironmental Protection Agency, Washington,  DC.
Yoder, C. O. 1991b. The integrated biosurvey as a tool for
   evaluation of aquatic life use attainment and impairment
   in Ohio surface waters. Pages 110-122 in Biological  Crite-
   ria: Research and Regulation. EPA-440-5-91-005. Office of
   Water, US Environmental Protection Agency, Washing-
   ton, DC.
Yoder, C. O.,  and E. T. Rankin. 1995a. Biological criteria
   program development and implementation in Ohio. Pages
   109-144 in W. S. Davis and T. P. Simon, eds. Biological
   Assessment and Criteria: Tools for Water Resource Planning and
   Decision Making. Lewis, Boca Raton, FL.
Yoder, C. O., and E. T. Rankin.  1995b. Biological response
   signatures and the area of degradation value:  New  tools
   for interpreting multimetric data. Pages 263-286 in W. S.
   Davis and T. P. Simon, eds. Biological Assessment and Criteria:
   Tools for Water Resource Planning and Decision Making.
   Lewis, Boca Raton, FL.
Zakaria-Ismail, M. 1994. Zoogeography and biodiversity of
   the freshwater fishes of Southeast Asia. Hydrobiologia 285:
   41-48.

-------