EPA
United States
Environmental Protection
Agency
Office of Research and
Development
Washington DC 20460
EPA/600/R-97/114
July 1999
A Review of Single
Species Toxicity Tests:
Are the Tests Reliable
Predictors of Aquatic
Ecosystem Community
Responses?
-------
By
Victor de Vlaming1
Teresa J. Norberg-King2
1 State Water Resources Control Board
901 P Street
PO Box 944213
Sacramento, California 94244-2130
2Mid-Continent Ecology Division
6201 Congdon Boulevard
Duluth, MN 55804-1636
Office of Research and Development
U.S. Environmental Protection Agency
Duluth, Minnesota 55804
Printed on Recycled Paper
-------
Notice
This document has been reviewed according to U.S. Environmental Protection Agency Policy and
approved for publication. Mention of trade names or commercial products does not constitute
endorsement or recommendation for use.
The views expressed in this document are those of the individual authors and do not necessarily
reflect the view and policies of the U.S. Environmental Protection Agency or the State Water
Resources Control Board.
-------
Abstract
This document provides a comprehensive review to evaluate the reliability of single species (also
referred to as indicator species) toxicity test results in predicting aquatic ecosystem impacts, also
known as the ecological relevance of laboratory single species toxicity tests. Since aquatic
ecosystem biological assessments have been performed to determine whether toxicity test results
are predictive of biological community impacts, the strengths and limitations of these validation tools
have been assessed. Ecological relevance has been analyzed in studies on ambient waters,
effluents, and other types of aqueous media. Furthermore, the effectiveness of laboratory single
species toxicity tests with individual chemicals in predicting biological community impacts and/or
environmental adverse effect concentrations is evaluated. Merits of published criticisms of the
predictive effectiveness of single species used in laboratory toxicity tests are analyzed. Also, the
question of whether single species used in laboratory toxicity tests are more sensitive than most
natural populations is discussed. Alternatives to single species toxicity tests are explored. A
preponderance of evidence reveals that laboratory single species toxicity test results are reliable
qualitative predictors of aquatic ecosystem community impacts.
-------
Foreword
The US Environmental Protection Agency (USEPA) has begun a long-term process aimed at
restoring and maintaining the chemical, physical, and biological integrity of the Nation's waters. One
major element in this effort was removing the discharge of toxic materials in toxic amounts to surface
waters. Through the policy designed to reduce or eliminate toxics discharges and to assist in
achieving objectives of the Clean Water Act (CWA), USEPA issued technical direction in the
Technical Support Document for Water Quality-Based Toxics (TSD) guidance (March 1984 Policy
for the Development of Water Quality-Based Permit Limitations for Toxic Pollutants; 49 FR 9016).
Through these directives, the Agency described its integrated toxics control program. The integrated
program consists of the application of both chemical-specific and biological methods to address the
discharge of toxic pollutants. USEPA continued with the development of the toxics control program
by developing effluent toxicity test methods, and these methods are being used to assess the quality
of surface waters, effluents, and stormwater, as well as other types of aqueous media. The use of toxicity
tests for biological monitoring provides tools that can be used to assess the combined effect of
mixtures and unknown constituents in a water sample to be evaluated, which in turn provides a direct
evaluation of the attainment of protection to the aquatic life.
Laboratory toxicity tests are often used to determine compliance with enforceable water quality
standards and effluent limits. The concept behind, and the intent of, the single species toxicity tests
(also referred to as indicator species tests) is to assess the probability of impacts on aquatic
ecosystems. To be effective water quality monitoring tools, toxicity test results should have a
predictive relationship with aquatic ecosystem impacts. USEPA (1991) reported that the results of
indicator species toxicity tests are effective predictors of aquatic ecosystem impacts. This
comprehensive literature review was undertaken to provide a critical examination of the relationships
among ambient water toxicity, effluent toxicity, and effects on organisms in ambient waters.
-------
Contents
Abstract iii
Foreword iv
List of Tables vii
List of Figures vii
Acknowledgments viii
Acronyms and Abbreviations ix
Definitions x
Section 1 1
1.0 Introduction 1
2.0 Intent of Single Species Toxicity Tests 2
3.0 Validation Procedures: Ecological Surveys/Bioassessments 2
3.1 Bioassessments 2
3.2 Can Laboratory Single Species Tests Be Validated? 3
3.3 To What Extent Should These Tests Be Validated? 4
4.0 False Positives and False Negatives 4
5.0 Field Studies 4
5.1 CETTP Studies 4
5.2 Associated Studies 5
5.2.1 South Elkhorn Creek Study 5
5.2.2 North Carolina Study 6
5.3 Review of CETTP Studies 6
5.3.1 Dickson et al. Analysis 7
5.3.2 Marcus and McDonald Analysis 9
5.4 Independent Evaluation of Statistical Analyses 10
5.5 Review of CETTP Studies in Which Significant Correlation Was Not Observed 10
5.5.1 Ottawa River Study 11
5.5.2 Five Mile Creek Study 12
5.5.3 Skeleton Creek 12
5.5.4 Ohio River 13
5.5.5 General Comments Regarding the Four CETTP Studies Summarized 13
6.0 Criticisms of CETTP and Associated Studies 13
6.1 CETTP Studies Compared Ambient Water Test Results with Bioassessment Variables 13
6.2 Nonrandom Selection of Study Areas and Sites 14
6.3 Use of the Most Sensitive Toxicity Test Results 14
6.4 Relationship Between Toxicity Test Results and Instream Biological Measurements Relied
Heavily on High Magnitude Toxicity 15
6.5 Temporal Repeatability of the Ambient Water Toxicity/Biological Response Was Not
Demonstrated 15
6.6 Confounding Factors Were Not Considered 15
6.7 Was the CETTP Classification System Mathematically Biased? 16
6.8 High Rate of False Positives 16
6.9 Miscellaneous Criticisms 16
6.10 Conclusions 16
-------
Contents (continued)
7.0 Single Species Tests with Effluent 17
8.0 Single Species Tests with Individual Chemicals or Small Groups of Chemicals 17
8.1 Organic Chemicals: Pesticides 17
8.2 Organic Chemicals: Nonpesticides 17
8.3 Metals 17
8.4 Other Data and Views of Predictiveness of Single Species Test Results 17
9.0 Comparison of Single Species and Multiple Species (Microcosm, Mesocosm) Toxicity Test
Results 18
9.1 Okkerman et al. (1993) 19
9.2 Emans et al. (1993) 19
9.3 Slooff (1985) 19
9.4 Persoone and Janssen (1994) 19
9.5 Phluger (1994) 19
9.6 Dorn (1996) 20
9.7 Crane (1995) 20
10.0 Alternatives to Single Indicator Species Tests 20
10.1 Tests with Single Indigenous Species 20
10.2 Tests with Multiple Indigenous Species 21
11.0 Studies in Ocean or Estuarine Settings 22
Section 2 23
1.0 Conclusions 23
2.0 Summary 26
Section 3
1.0 References 28
2.0 Bibliography 35
Appendices
Appendix A Single Species Tests with Effluents 36
Appendix B Single Species Tests with Individual Chemicals 42
Appendix C Single Species Tests with Ocean Water or Sediment 51
Appendix D Strengths and Limitations of Single Species Toxicity Tests 54
-------
List of Tables
Table 1. Toxicity testing summary for the Ottawa River site study (Mount et al., 1984) 11
Table 2. Equations showing relationships between laboratory (single species) and ecosystem
determined endpoints 18
Table 3. Summary of studies examining the relationship between laboratory single species
test results and aquatic ecosystem responses 18
List of Figures
Figure 1. Summary of Eagleson et al., 1990 analysis 7
Figure 2. Summary of Dickson et al., 1992 analysis 9
Figure 3. Summary of studies in which a cladoceran was used as a laboratory test organism
when comparing toxicity test results to ecological survey data and/or field test
concentrations 24
Figure 4. Summary of studies reviewed in this report in which the results of laboratory single
species toxicity tests were compared to biological community surveys and/or field
effect concentrations 26
-------
Acknowledgments
This document was peer reviewed by numerous individuals and this version of the document
incorporates reviewer recommendations. These review comments were of considerable value
in improving the quality, accuracy and clarity of the literature review.
The reviewers of the early drafts of this document who offered helpful suggestions are:
Larry Ausley (North Carolina DENR, Raleigh, NC),
Tom Dean (Coastal Resources Associates, Vista, CA),
Debra Denton (USEPA, Region 9, San Francisco, CA),
Regina Donohoe (California EPA, Sacramento, CA),
Chris Foe (Central Valley Regional Water Quality Control Board, Sacramento, CA),
Jeff Miller (Aqua-Science, Davis, CA),
Don Mount (ASCI Corporation, Duluth, MN), and
Michael Perrone (State Water Resources Control Board, Sacramento, CA).
The critiques and suggestions of the following individuals were particularly valuable:
Brian Anderson (University of CA-Santa Cruz, Monterey, CA),
Gordon Anderson (deceased, formerly of Santa Ana Regional Water Quality Control Board,
Riverside, CA),
Rodger Baird (City Sanitation Districts of Los Angeles, Whittier, CA),
Peter Chapman (EVS Consultants, Vancouver, BC),
JoAnne Cox (State Water Resources Control Board, Sacramento, CA),
Carol DiGiorgio (Department of Water Resources, Sacramento, CA), and
Mike Marcus (The Cadmus Group, Albuquerque, NM).
John Cairns, Jr. (Virginia Tech, Blacksburg, VA) provided an abundance of the relevant
literature for this review.
We appreciate the peer reviews conducted by Robert Spehar and Jo Thompson, USEPA,
Office of Research and Development, Mid-Continent Ecology Division, Duluth, MN.
Without the assistance and cooperation of Robert Holmes (California State University,
Humboldt, Arcata, CA) and M. Perrone (Water Resources Control Board) this document could
not have been produced.
-------
Acronyms and Abbreviations
A. punctulata      sea urchin, Arbacia punctulata
AEC                Acceptable Effluent Concentration
C. dubia           cladoceran, Ceriodaphnia dubia
C. variegatus      sheepshead minnow, Cyprinodon variegatus
C. parvula         red algae, Champia parvula
CETTP              Complex Effluent Toxicity Testing Program
CWA                Clean Water Act
EC                 Effect Concentration
FIFRA              Federal Insecticide, Fungicide and Rodenticide Act
IC                 Inhibition Concentration
IWC                Instream Waste Concentration
km                 kilometers
LOEC               Lowest Observed Effect Concentration
m                  meters
M. bahia           mysid shrimp, Mysidopsis bahia
M. berylina        inland silverside, Menidia berylina
NOEC               No Observed Effect Concentration
NPDES              National Pollutant Discharge Elimination System
P. promelas        fathead minnow, Pimephales promelas
POTW               Publicly Owned Treatment Works (wastewater treatment plant),
                   also referred to as WWTP
r                  Correlation coefficient
RWC                Receiving Water Concentration
S. capricornutum   green algae, Selenastrum capricornutum
STP                Sewage Treatment Plant
TSCA               Toxic Substances Control Act
TSD                Technical Support Document (cf., USEPA, 1991)
WET                Whole Effluent Toxicity
WWTP               Wastewater Treatment Plant, also referred to as a POTW
-------
Definitions
Accuracy is the degree of difference between observed
values and known or actual values. This is appropriate for
chemical and physical measurements, but not biological
systems. Toxicity is relative rather than absolute and the
organisms measure toxicity without a reference organism
in a reference toxicant solution.
Acute Toxicity is a test to determine the concentration of
effluent or receiving waters (or ambient waters) that produces
an adverse effect on a group of test organisms during a short-
term exposure (e.g., 24, 48, or 96 h). The endpoint is lethality.
Acute toxicity is measured using statistical procedures (e.g.,
point estimate techniques or a t-test). Acute toxicity is usually
defined as TUa = 100/LC50.
Acute-to-Chronic Ratio (ACR) is the ratio of the acute toxicity
of an effluent or a toxicant to its chronic toxicity. It is used
as a factor for estimating chronic toxicity on the basis of acute
toxicity data, or for estimating acute toxicity on the basis of
chronic toxicity data.
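The use of the ACR described above can be illustrated with a minimal sketch. It assumes the conventional form in which the acute endpoint (e.g., an LC50) is divided by the chronic endpoint (e.g., a NOEC), with both expressed in the same units; the function names and sample values are hypothetical and are not taken from this report.

```python
def acute_to_chronic_ratio(acute_value: float, chronic_value: float) -> float:
    """Return the ACR, assuming ACR = acute endpoint / chronic endpoint.

    Both endpoints must be positive and in the same units
    (e.g., percent effluent or ug/L). This form is an assumption.
    """
    if acute_value <= 0 or chronic_value <= 0:
        raise ValueError("endpoints must be positive")
    return acute_value / chronic_value


def estimate_chronic_value(acute_value: float, acr: float) -> float:
    """Estimate a chronic endpoint from an acute endpoint and a known ACR."""
    if acute_value <= 0 or acr <= 0:
        raise ValueError("inputs must be positive")
    return acute_value / acr


# Hypothetical example: LC50 of 50% effluent, chronic value of 5% effluent.
acr = acute_to_chronic_ratio(50.0, 5.0)
estimated_chronic = estimate_chronic_value(50.0, acr)
```

The reverse estimate (acute from chronic) would multiply the chronic value by the ACR under the same assumption.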
Additivity is the characteristic property of a mixture of
toxicants that exhibits a total toxic effect equal to the arithmetic
sum of the effects of the individual toxicants.
Ambient Toxicity is measured by a toxicity test on a sample
collected from a surface water.
Bioassay is a test used to evaluate the relative potency of
a chemical or a mixture of chemicals by comparing its effect
on a living organism with the effect of a standard preparation
on the same type of organism. Bioassays frequently are used
in the pharmaceutical industry to evaluate the potency of
vitamins and drugs.
Criteria Continuous Concentration (CCC) is the USEPA
national water quality criteria recommendation for the highest
instream concentration of a toxicant or an effluent to which
organisms can be exposed indefinitely without causing
unacceptable effect.
Criteria Maximum Concentration (CMC) is the USEPA
national water quality criteria recommendation for the highest
instream concentration of a toxicant or an effluent to which
organisms can be exposed for a brief period of time without
causing an acute effect.
Chronic Toxicity is defined as a long-term toxicity test in
which sublethal effects (e.g., reduced growth or reproduction)
are usually measured in addition to lethality. Chronic toxicity
is defined as TUc = 100/NOEC or TUc = 100/ECp (ICp).
The ICp and ECp values should be the approximate equivalent
of the NOEC calculated by hypothesis testing for each test
method.
Coefficient of Variation (CV) is a standard statistical measure
of the relative variation of a distribution or set of data, defined
as the standard deviation divided by the mean. Coefficient
of variation is a measure of precision within (intralaboratory)
and among (interlaboratory) laboratories.
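As a small illustration of the definition above, the coefficient of variation can be computed from replicate results with Python's standard library. The sketch assumes the sample standard deviation is intended; the function name and the replicate values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV (sample standard deviation divided by the mean).

    Assumes at least two values and a nonzero mean.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return stdev(values) / m


# Hypothetical reference toxicant results from repeated tests in one laboratory.
replicate_results = [10, 12, 14]
cv = coefficient_of_variation(replicate_results)
cv_percent = 100 * cv
```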
Critical Life Stage is the period of time in an organism's life
span in which it is the most susceptible to adverse effects
caused by exposure to toxicants, usually during early development
(egg, embryo, larvae). Chronic toxicity tests are often
run on critical life stages to replace long duration, life-cycle
tests since the most toxic effect usually occurs during the
critical life stage.
Effect Concentration (EC) is a point estimate of the toxicant
concentration that would cause an observable adverse effect
(e.g., survival or fertilization) in a given percent of the test
organisms, calculated from a continuous model (e.g., USEPA
Probit Model).
Hypothesis Testing is a technique (e.g., Dunnett's test) that
determines what concentration is statistically different from
the control. Endpoints determined from hypothesis testing
are NOEC and LOEC. Null hypothesis (Ho): The effluent
is not toxic; Alternative hypothesis (Ha): The effluent is toxic.
Inhibition Concentration (IC) is a point estimate of the
toxicant concentration that would cause a given percent
reduction in a non-quantal biological measurement (e.g.,
reproduction or growth) calculated from a continuous model.
Instream Waste Concentration (IWC) is the concentration
of a toxicant in a riverine system after mixing. Also referred
to as the receiving water concentration (RWC). The IWC
or RWC is the inverse of the dilution factor.
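The inverse relationship between the IWC and the dilution factor can be sketched as follows. The sketch assumes the IWC is expressed as percent effluent and that the dilution factor is the total flow divided by the effluent flow under complete mixing; those conventions, the function names, and the flow values are assumptions made for illustration only.

```python
def dilution_factor_from_flows(effluent_flow: float, upstream_flow: float) -> float:
    """Dilution factor under an assumed complete-mix condition.

    Defined here (as an assumption) as total flow / effluent flow.
    """
    if effluent_flow <= 0 or upstream_flow < 0:
        raise ValueError("effluent flow must be positive, upstream flow non-negative")
    return (effluent_flow + upstream_flow) / effluent_flow


def iwc_percent(dilution_factor: float) -> float:
    """IWC as percent effluent: the inverse of the dilution factor, times 100."""
    if dilution_factor < 1:
        raise ValueError("dilution factor must be at least 1")
    return 100.0 / dilution_factor


# Hypothetical example: 1 unit of effluent flow mixing with 9 units of stream flow.
df = dilution_factor_from_flows(1.0, 9.0)
iwc = iwc_percent(df)
```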
LC50 is the toxicant concentration that would cause death
to 50% of the test organisms.
Lowest Observed Effect Concentration (LOEC) is the
lowest concentration of toxicant to which organisms are
exposed in a test, which causes statistically significant adverse
effects on the test organisms (i.e., where the values for the
observed endpoints are statistically significantly different from
the control). The definitions of NOEC and LOEC assume
a strict dose-response relationship between toxicant
concentration and organism response. If this assumption
were always the case, there would be no issue concerning
the endpoint definitions because the NOEC would always
be a lower concentration level than the LOEC. However,
this strict dose-response relationship does not exist with all
toxicants. When this occurs the test must be repeated or
the lowest NOEC should be reported for compliance purposes.
Minimum Significant Difference (MSD) is the magnitude
of difference from control where the null hypothesis is rejected
in a statistical test comparing a treatment with a control. MSD
is based on the number of replicates, control performance
and power of the test.
Mixing Zone is an area where an effluent discharge under-
goes initial dilution and may be extended to cover the second-
ary mixing in the ambient waterbody. A mixing zone is an
allocated impact zone where water quality criteria can be
exceeded as long as acutely toxic conditions are prevented.
No Observed Adverse Effect Level (NOAEL) is a tested
dose of an effluent or a toxicant below which no adverse
biological effects are observed, as identified from chronic
or subchronic human epidemiology studies or animal exposure
studies.
No Observed Effect Concentration (NOEC) is the highest
tested concentration of toxicant to which organisms are
exposed in a full life-cycle or partial life-cycle (short-term)
test, that causes no observable adverse effect on the test
organism (i.e., the highest concentration of toxicant at which
the values for the observed responses are not statistically
significantly different from the controls). NOECs calculated
by hypothesis testing are dependent upon the concentrations
selected.
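The dependence of the NOEC and LOEC on the tested concentrations can be seen in a minimal sketch that derives both endpoints from the outcome of hypothesis testing. It assumes each tested concentration has already been flagged as significantly different (or not) from the control, and it adopts one conservative convention for non-monotonic responses: the LOEC is the lowest flagged concentration and the NOEC is the highest tested concentration below it. That convention, the function name, and the data are hypothetical.

```python
def noec_loec(results):
    """Derive (NOEC, LOEC) from hypothesis-test outcomes.

    results: iterable of (concentration, is_significant) pairs, where
    is_significant is True when the response differs significantly
    from the control. Returns None for an endpoint that cannot be set.
    """
    ordered = sorted(results)
    if not ordered:
        return None, None
    # LOEC: lowest tested concentration with a significant effect.
    loec = next((conc for conc, significant in ordered if significant), None)
    if loec is None:
        # No significant effect at any tested concentration.
        return ordered[-1][0], None
    # NOEC: highest tested concentration below the LOEC (assumed convention).
    below = [conc for conc, _ in ordered if conc < loec]
    noec = below[-1] if below else None
    return noec, loec


# Hypothetical effluent dilution series (percent effluent).
series = [(6.25, False), (12.5, False), (25.0, True), (50.0, True), (100.0, True)]
noec, loec = noec_loec(series)
```

Because both endpoints can only take values that were actually tested, a different dilution series applied to the same sample can yield a different NOEC.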
Point Estimation Techniques are used to determine the
effluent concentration at which adverse effects (e.g., fertiliza-
tion, growth or survival) occurred, such as Probit, Interpolation
Method, Spearman-Karber. For example, concentration at
which a 25% reduction in fertilization occurred.
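A point estimate such as the concentration causing a 25% reduction can be approximated by linear interpolation between tested concentrations. The following is a simplified sketch only: it assumes the mean responses are already non-increasing with concentration and that the first entry is the control, and it omits the response smoothing and confidence-interval steps that a full interpolation method would include. The function name and data are hypothetical.

```python
def interpolate_icp(concentrations, mean_responses, p=25.0):
    """Estimate the concentration giving a p% reduction from the control mean.

    concentrations: ascending tested concentrations, control (0) first.
    mean_responses: mean response at each concentration, assumed non-increasing.
    Returns None when the p% reduction is not reached within the tested range.
    """
    if len(concentrations) != len(mean_responses) or len(concentrations) < 2:
        raise ValueError("need matching lists with at least two entries")
    target = mean_responses[0] * (1.0 - p / 100.0)
    for i in range(len(concentrations) - 1):
        upper, lower = mean_responses[i], mean_responses[i + 1]
        if upper >= target > lower:
            # Linear interpolation between the two bracketing concentrations.
            fraction = (upper - target) / (upper - lower)
            return concentrations[i] + fraction * (
                concentrations[i + 1] - concentrations[i]
            )
    return None


# Hypothetical reproduction data (mean young per female) at percent effluent.
ic25 = interpolate_icp([0, 25, 50, 100], [20, 20, 10, 0], p=25.0)
```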
Precision is a measure of mutual agreement among
individual measurements or enumerated values of the
same property of the sample; can be described by the
mean, standard deviation and coefficient of variation. The
precision is usually discussed by test consistency or re-
peatability both with a laboratory (intralaboratory) and
among several laboratories (interlaboratory) using the
same test method and reference toxicant.
Receiving Water Concentration (RWC) is the concentration
of a toxicant or the parameter toxicity in the receiving
water (i.e., riverine, lake, reservoir, estuary or ocean) after
mixing. Isopleths of effluent concentration can be
established by dye studies or modeling techniques in
determining CMC and CCC.
Significant Difference is defined as statistically significant
difference (e.g., 95% confidence level) in the means of two
distributions of sampling results.
Test Acceptability Criteria (TAC) are the criteria that toxicity
test results must meet to be acceptable or valid for compliance;
the effluent and the concurrent reference toxicant controls
must meet specific criteria as defined in the test method
(e.g., for the Ceriodaphnia dubia survival and reproduction test,
the criteria are: the test must achieve at least 80% survival
and an average of 15 young/female in the controls).
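The Ceriodaphnia dubia example above can be expressed as a simple check. This sketch encodes only the two control criteria quoted in the definition (at least 80% survival and an average of at least 15 young per female); any other requirements of the test method are not represented, and the function name and values are hypothetical.

```python
def ceriodaphnia_controls_acceptable(control_survival_percent: float,
                                     control_mean_young_per_female: float) -> bool:
    """Check the two control criteria quoted for the C. dubia test.

    Only the survival (>= 80%) and reproduction (>= 15 young/female)
    criteria named in the definition are checked here.
    """
    return (control_survival_percent >= 80.0
            and control_mean_young_per_female >= 15.0)


# Hypothetical control performance from two tests.
test_a_valid = ceriodaphnia_controls_acceptable(90.0, 18.2)
test_b_valid = ceriodaphnia_controls_acceptable(90.0, 12.0)
```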
Toxicity Tests are laboratory experiments which employ
the use of standardized test organisms to measure the
adverse effect (e.g., growth, survival or reproduction) of
effluent or receiving waters.
Toxic Unit Acute (TUa) is the reciprocal of the effluent
concentration that causes 50% of the organisms to die by
the end of the acute exposure period (i.e.,
TUa = 100/LC50).
Toxic Unit Chronic (TUc) is the reciprocal of the effluent
concentration that causes no observable effect on the test
organisms by the end of the chronic exposure period (i.e.,
TUc = 100/NOEC).
Toxic Units (TUs) are a measure of toxicity in an effluent
as determined by the acute toxicity units or chronic
toxicity units. Higher TUs indicate greater toxicity.
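The toxic unit formulas given in these definitions (TUa = 100/LC50 and TUc = 100/NOEC) can be sketched as follows, assuming the LC50 and NOEC are expressed as percent effluent. The function names and sample values are hypothetical.

```python
def toxic_units_acute(lc50_percent: float) -> float:
    """TUa = 100 / LC50, with the LC50 in percent effluent."""
    if not 0 < lc50_percent <= 100:
        raise ValueError("LC50 must be in the range (0, 100] percent")
    return 100.0 / lc50_percent


def toxic_units_chronic(noec_percent: float) -> float:
    """TUc = 100 / NOEC, with the NOEC in percent effluent."""
    if not 0 < noec_percent <= 100:
        raise ValueError("NOEC must be in the range (0, 100] percent")
    return 100.0 / noec_percent


# Hypothetical effluent: LC50 of 25% effluent, NOEC of 12.5% effluent.
tua = toxic_units_acute(25.0)
tuc = toxic_units_chronic(12.5)
```

A lower LC50 or NOEC yields a larger number of toxic units, consistent with the statement that higher TUs indicate greater toxicity.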
Toxicity Identification Evaluation (TIE) is a set of
procedures to identify the specific chemical(s) responsible
for effluent toxicity. TIEs are a subset of the Toxicity
Reduction Evaluation (TRE).
Toxicity Reduction Evaluation (TRE) is a site-specific
study conducted in a stepwise process designed to iden-
tify the causative agents of effluent toxicity, isolate the
sources of toxicity, evaluate the effectiveness of toxicity
control options, and then confirm the reduction in effluent
toxicity.
Whole Effluent Toxicity (WET) is the total toxic effect of
an effluent or receiving water measured directly with a
toxicity test.
-------
-------
Section 1
1.0 Introduction
The Clean Water Act (CWA), Federal Insecticide, Fungicide,
and Rodenticide Act (FIFRA), and Toxic Substances
Control Act (TSCA) are the federal legislation mandating
that potential hazards of chemicals and wastewaters be
assessed. In particular, the CWA aims at preventing the
release of toxic concentrations of chemicals, regardless
of whether they originate from point or nonpoint sources,
into the nation's surface waters by stating "it is the national
policy that the discharge of toxic pollutants in toxic amounts
be prohibited."
As part of the effort to implement the above CWA policy,
the USEPA incorporated toxicity-based discharge limits
into National Pollutant Discharge Elimination System
(NPDES) permits. To support this approach, USEPA
published a Technical Support Document (TSD) (USEPA,
1991) and short-term toxicity test methodologies (USEPA,
1994a; 1994b; hereafter referred to as the USEPA toxicity
tests). The intent of these toxicity tests is to rapidly and
reliably estimate the potential chronic effects of toxic chem-
icals in ambient water and wastewater, stormwater and
other water matrices on aquatic life.
For freshwater ecosystems, USEPA has focused on three
species for short-term tests designed to estimate the de-
gree of chronic toxicity in a water sample (USEPA, 1994a).
These freshwater methods include a fish, larval fathead
minnow (Pimephales promelas), a zooplankton
(Ceriodaphnia dubia), and an alga (Selenastrum capricornutum).
The marine and estuarine short-term tests estimate
chronic toxicity (USEPA, 1994b) with two fish species,
sheepshead minnow (Cyprinodon variegatus) and the
inland silverside (Menidia berylina), a red alga (Champia
parvula), an east coast mysid (Mysidopsis bahia), and a
sea urchin (Arbacia punctulata).
USEPA states "whole effluent toxicity (WET) is a useful
parameter for assessing and protecting against impacts
upon water quality and designated uses caused by the
aggregate toxic effect of the discharge of pollutants" (in
the TSD; USEPA, 1991). Four data sets were the focus of
support for the reliability of the USEPA toxicity test results
in predicting aquatic ecosystem community responses:
USEPA's Complex Effluent Toxicity Testing Program
(CETTP) studies (USEPA, 1991), the South Elkhorn
Creek, Kentucky study (Birge et al., 1989), the Trinity
River, Texas study (Dickson et al., 1989), and the North
Carolina study performed by Eagleson et al. (1990).
The eight CETTP studies include: Scippo Creek, Ohio
(Mount and Norberg-King, 1985); Ottawa River, Ohio
(Mount et al., 1984); Five Mile Creek, Alabama (Mount et
al., 1985); Skeleton Creek, Oklahoma (Norberg-King and
Mount, 1986); Naugatuck River, Connecticut (Mount et al.,
1986a); Back River, Maryland (Mount et al., 1986b); Ohio
River, West Virginia (Mount et al., 1986c); and Kanawha
River, West Virginia (Mount and Norberg-King, 1986). In
these studies the 7-d Ceriodaphnia and/or early life stage
larval fathead minnow toxicity test results from surface
water and/or effluents were compared with data from
aquatic ecosystem community surveys (bioassessments)
to determine whether the toxicity test results were effective
predictors of instream biological responses. USEPA
concluded (USEPA, 1991) that the four data sets "com-
prise a large database specifically collected to determine
the validity of toxicity tests to predict receiving water
community impact. The results, when linked together,
clearly show that if toxicity is present (in discharges) after
considering dilution, impact will also be present."
Criticisms of the CETTP and associated studies, as well
as their conclusions, have been published (see Section 6
below). In a broader sense, there have been questions
regarding the reliability of single species (frequently
described as indicator species) toxicity test results in
predicting aquatic ecosystem responses (impairments).
Moreover, there are questions regarding the validity of, and
the uncertainty associated with, extrapolations from single
indicator species toxicity test results to aquatic ecosystem
responses. USEPA also bases their chemical-specific
water quality criteria on laboratory single species toxicity
test estimates of chronic toxicity, yet the validity and
reliability of these criteria are less frequently questioned.
The central aspect of the uncertainty appears to be
whether the indicator species toxicity test results, obtained
under controlled laboratory conditions, can be reliably
translated into responses by complex and multivariant
aquatic ecosystem communities. For example, laboratory
effluent toxicity test results could overestimate biological
community responses if aquatic ecosystem physi-
cal/chemical or biotic factors mitigated (e.g., altered
chemical bioavailability) effluent toxicity. On the other
hand, some aquatic ecosystem physical/chemical and
biotic factors could act as stressors which exacerbate the
effects of toxic chemicals such that laboratory toxicity test
results underestimate instream biological responses.
There is also the concern that indicator species toxicity test
results do not represent the range of sensitivities and the
different levels of biological organization which exist in
aquatic ecosystems. These, as well as other, concerns
regarding the reliability of single species toxicity test
results in predicting aquatic ecosystem biological re-
sponses will be considered in this document.
Regulatory agencies have tended to rely on single species
test results, especially those from the USEPA toxicity tests, on surface
water and wastewater samples to estimate potential
toxicity threats to aquatic ecosystem communities. Since
there have been criticisms of the predictive effectiveness
of single species tests, the intent in this review is to
evaluate and summarize the published literature, as well
as other available reports, on the ecological relevance of
laboratory single species toxicity test results. This review
examines, but is not limited to, the CETTP and associated
studies (e.g., Birge et al., 1989; Dickson et al., 1989;
Eagleson et al., 1990). Various aspects of the reliability of
single species toxicity test results as predictors of biologi-
cal community impacts have been reviewed by many
authors (as noted in Section 3). This report is a compre-
hensive review of the literature in this area.
The following sections address:
* the intent of single species toxicity tests,
* the procedures (bioassessments) used to "validate"
indicator species toxicity test results,
* the concepts of false positives and false negatives,
* the USEPA CETTP and associated studies,
* criticisms of the CETTP studies,
* single species tests with effluent,
* single species tests with individual chemicals or small
groups of chemicals,
* comparisons of single species and multiple species test
results, and
* alternatives to single species toxicity tests.
The vast majority of the literature in this area relates to
toxicity tests with freshwater species and ecosystems.
There is a paucity of studies which attempt to relate
laboratory toxicity test results with bay and estuary or
ocean impacts; nonetheless, the few relevant studies
(Section 11) are summarized in this review. The conclu-
sions (Section 2) of this report are weighted toward
freshwater toxicity test results as predictors of aquatic
ecosystem community responses.
2.0 Intent of Single Species Toxicity Tests
Before summarizing and discussing data which relate to
how reliably the USEPA toxicity tests (and other single
species toxicity test results) predict ecosystem responses,
a consideration of the intent of these tests seems war-
ranted.
A criticism of the single species tests has been that their
results are invalid predictors of aquatic community re-
sponses because only qualitative (i.e., statistically
significant toxicity test results indicate some degree of
biological community response/impairment) rather than
quantitative relationships are established between toxicity
test results and ecosystem community responses.
Quantitative in this context refers to a case in which some
level or percent response in toxicity test results can be
directly correlated with a specific level/percent response
in instream biological communities. However, these tests
were not designed to be quantitative predictors of
ecosystem responses. The USEPA toxicity tests and other
indicator single species tests were intended to be
screening tools (i.e., to indicate the potential for
wastewater or ambient water samples to cause biological
community impacts, characterizing relative ecosystem
effects) and "early warning" signals (a measurement which
indicates the potential for aquatic ecosystem impairment
prior to actual damage to biological communities) (USEPA,
1991; USEPA, 1994a). The toxicity tests are applicable to
ambient water samples regardless of the sources (i.e.,
point or nonpoint) of contaminants.
Because the USEPA toxicity tests were intended to be
early warning signals of biological community impacts, the
results of a single toxicity test should not constitute a
violation of a water quality standard, or of an effluent
limitation. Unfortunately, such misuses have occurred and
these cases may be major contributors to the criticisms
leveled at the USEPA toxicity tests.
3.0 Validation Procedures: Ecological
Surveys/Bioassessments
3.1 Bioassessments
The method generally used for "validating" the reliability of
single species toxicity test results in predicting aquatic
ecosystem impairments (and "safe" concentrations) has
been to perform ecological surveys (biological assess-
ments), and then compare these data to toxicity test results
with water samples from the same ecosystem sites or to
data from effluent toxicity tests. Bioassessments can
consist of estimates of species composition, diversity, and
density of aquatic organisms.
Because bioassessments play a crucial role in this
"validation" process, there are considerations regarding
these procedures which must be explored. From these
surveys, a judgement is made as to whether or not the
aquatic ecosystem or a part of it is impacted. Bioassessments
are not de facto better or easier to interpret than
other types of measurements.
Bioassessments are subject to most of the same pitfalls as
other biological and toxicological studies, including poor
design and careless performance. Sound experimental
design and careful conduct are crucial, requiring a
thorough understanding of the complexity of aquatic
ecosystems, as well as confounding factors (e.g., current
velocity, depth, light penetration, shading, temperature,
substrate, organic matter, nutrients) which can affect site
selection so that they can be "controlled" or accounted for.
Moreover, sites within a stream should be chosen to
minimize differences among them with respect to physical
and chemical parameters. The idea is to minimize factors
which can influence ecosystem parameters so that any
change can be ascribed to toxic chemicals.
To be effective in "validation" of toxicity test results,
ecological surveys must be able to clearly distinguish
between contaminant-caused effects and all other effects
on aquatic populations. Aquatic ecosystem biological
surveys are not, by themselves, sufficient to determine
toxic chemical impacts because biological community
structure and function are influenced by a host of other
factors (e.g., dissolved oxygen, temperature, physical
parameters, habitat conditions).
Limitations (LaPoint, 1994; 1995) in bioassessment studies
have included failure to consider seasonal variations
(frequently sampling is only a one or two time event), poor
selection of endpoints (endpoints should be reliable,
having ecological relevance), poor sampling procedures,
lack of sample replication, failure to consider nonchemical
stressors, failure to identify cause of change, use of
inappropriate procedures and statistics which are not
standardized, and failure to provide early warning of
impairment. Many ecological assessments have been
characterized by a high degree of variability (greater than
in chemical and toxicity measurements), imprecisions, and
lack of repeatability (e.g., LaPoint, 1994; 1995).
Many bioassessments provide qualitative (not quantitative)
data; for example, macroinvertebrate surveys with kick-
nets are qualitative and, usually, are not replicated. Most
of the ecological surveys associated with field "validation"
of single species test results have consisted of "simplistic
field designs" and "superficial study" of the natural system
(Neuhold, 1986; Chapman et al., 1987; Luoma, 1995).
Neuhold (1986) contends that measurements such as
biomass and population numbers, which are frequently the
basis of ecological surveys, are too insensitive as
endpoints because they take considerable time to change
enough to "clear" the background noise level. Interpreta-
tions of bioassessment data are frequently controversial.
For example, according to LaPoint et al. (1996), the
ecological significance of contaminant effects is more
difficult to interpret in biological assessments than in
laboratory single species toxicity tests because of the large
number of aquatic species potentially responding in the system.
Clements and Kiffney (1996) state, "Most importantly,
inability to establish a direct cause-and-effect relationship
between contaminants and selected endpoints greatly
limits instream biomonitoring."
There is yet to be agreement on meaningful ecological
endpoints or the amount of change in ecological
measurements which represent impairment. It has been
difficult to identify and measure subtle damage in aquatic
ecosystems. No procedures/protocols for performing
ecological surveys on large waterways, such as major
rivers, have been published. Developing scientifically valid
biological assessment methods for such systems is
needed, but it seems unlikely that regulatory agencies will
have the budgets to fund such large efforts. In relation to
these bioassessment concerns, the difficulties surrounding
"validation" of laboratory single species toxicity test results
have been reviewed (Cairns, 1983; 1988a; Livingstone and
Meeter, 1985; Chapman, 1995a,b). These considerations
should be remembered when using bioassessment
measurements in evaluating the predictive accuracy of the
single species or multiple species toxicity test results.
The intent here is not to malign bioassessments, but to
draw attention to the fact that they are not de facto
conclusive. On the other hand, well designed and
performed bioassessments are powerful tools, crucial to
environmental monitoring and assessment. The advan-
tages of using ecological surveys and, in particular,
macroinvertebrate surveys as water quality indicators have
been thoroughly discussed in an informative book edited
by Davis and Simon (1995).
3.2 Can Laboratory Single Species Tests Be
Validated?
Mount (1995) suggests that it is impossible to conclusively
establish that ecosystem impairments are caused by
ambient water or effluent toxicity. This is because there
are many stressors and other confounding factors at work
in natural ecosystems. Proving cause in complex, poorly
understood ecosystems will be difficult at best. Recently,
Chapman (1995a) wrote, "Basically, I consider the
perceived need for validation of the laboratory by field
studies to be incorrect dogma." A reactive toxicity test can
confirm ecosystem impairments, but proactive tests can
only be "validated" by waiting for ecosystem effects to
appear. Furthermore, absence of biological community
effects can never be fully proven. While recognizing and
addressing various short-comings in ecotoxicology,
Chapman (1995a) warns against perpetuating the "estab-
lished" validation dogma; he points out, as did Mount
(1995), that field studies can never validate laboratory
studies since there is no certainty that effects observed (or
not) in field studies were caused by effects measured in
the lab.
Another inherent problem in "validating" that single species
toxicity test results can be reliably extrapolated to
ecosystem responses is the status of many aquatic
ecosystems. Given the ever-expanding number of aquatic
population declines (e.g., Herbold et al., 1992; Obrebski
et al., 1992; Bailey et al., 1994) and number of extinct,
endangered, and threatened species, it is clear that many
aquatic ecosystems are partially to seriously impaired. If
single species test results are to be early warnings
(predictive of future events), proactive in function, they
cannot be "validated" in all circumstances with existing
ecological conditions. The point is not to discontinue study
of the relationship, but rather to understand the limitations
of the procedures used to "validate" the predictiveness of
single species tests.
3.3 To What Extent Should These Tests Be
Validated?
Without partial disturbance of healthy or relatively healthy
aquatic ecosystems, it may not be possible to "validate"
that extrapolations from laboratory toxicity tests reliably
predict aquatic ecosystem responses. Depending on
scale, biological surveys, especially if repeated through
time, may be destructive to aquatic ecosystems. The
question is, should toxic chemicals be released into
ecosystems to repeatedly establish a link between
laboratory toxicity test results and ecological impairments?
Since unequivocal demonstration that effluent or ambient
water toxicity is the sole cause of ecosystem impairments
may not be possible, it seems sensible to question how much
effort, time, and money should be expended to "validate"
a quantitatively accurate correlation between single
species toxicity test results and instream biological
responses.
USEPA's toxicity tests were designed as screening tools
to provide early warning of potential environmental
impacts. For this and other reasons mentioned above, it
has been difficult to establish a quantitative correlation
between the results of these tests and ecological
responses in all aquatic ecosystems. As Chapman
(1995b) suggests, we can never be sure that a proactive
prediction (based on laboratory toxicity test results) is
correct without allowing for potential environmental
degradation. Possibly, surrogate aquatic ecosystems will
allow us to establish a better link between laboratory test
results and ecosystem responses, while minimizing
impacts on natural aquatic systems.
4.0 False Positives and False Negatives
In this review, the concepts of "false positives" and "false
negatives" will emerge when comparing the results of
single species tests with ecological survey measurements.
Caution is essential in the application of such concepts.
There has been a tendency to label any one statistically
significant toxicity test result which does not match with an
ecological endpoint as a false positive. This may be an
inaccurate designation. A single effluent or ambient water
sample can contain toxic levels of chemicals but, due to
effluent or ambient water variability, the duration,
magnitude, and frequency of the toxicity are not sufficient
to elicit a measurable biological community response.
More sampling and testing could reveal this. On the other
hand, the toxic sample could be an early warning, signaling
toxicity of a magnitude, duration, and frequency to cause
adverse ecosystem responses. The false positive
designation is also based on the assumption that the
measure of ecosystem integrity is accurate.
A false negative designation has sometimes been applied
to cases in which statistically significant toxicity is absent
from an effluent or ambient water sample, but instream
impairment is indicated. Such a designation is not
necessarily true. For example, one sample may not be
representative of effluent or ambient water toxicity.
More frequent sampling and testing could reveal that
toxicity is of sufficient magnitude, duration, and frequency
to evoke biological community responses. On the other
hand, assuming that the bioassessment measurement
reliably demonstrated impairment, the impact could be a
consequence of nonchemical, non-wastewater related
causes. The presence of bioaccumulative toxic chemicals
in a water sample could lead to a false negative designa-
tion because short term toxicity tests are not designed to
detect such substances. In addition, there are biological
endpoints in aquatic ecosystems which are not repre-
sented in indicator species toxicity tests.
Following a systematic analysis, Luoma and Ho (1993)
concluded that false negative predictions (finding no
statistically significant toxicity in laboratory single species
tests when, in truth, there is biological community
degradation) are just as probable as false positive
predictions. Luoma and Ho contend that "false negatives
may be common in toxicity tests, but they are difficult to
detect. The main reason is that the ecological tests
included in many validation studies are insensitive.
Typically, validations are conducted only at one point in
time, make inadequate replication, consider ambiguous
community structure indices, or do a poor job of docu-
menting exposures." Caution should be exercised when
describing the relationship between a single toxicity test
result and an index (which represents the integration of
many types of stresses over time) of ecosystem integrity,
as a false positive or negative.
5.0 Field Studies
5.1 CETTP Studies
The eight CETTP and three related studies examined the
relationship between 7-d Ceriodaphnia and/or larval
fathead minnow early life stage toxicity test results on
surface water or wastewater and instream survey indices
for zooplankton, benthic macroinvertebrates, and/or fish
populations. The intent of these studies was to determine
how effectively toxicity test results on ambient waters or
effluents corresponded with ("predicted") estimates of
aquatic ecosystem community health. In the eight CETTP
studies there were 80 sites in eight different watersheds
where instream bioassessment indices were compared to
surface water toxicity test results.
The intent is not to summarize and evaluate each of these
CETTP studies separately since they have, as a group,
been the subject of recent analyses (Dickson et al., 1992;
Marcus and McDonald, 1992). The approach is to
summarize two of the studies (Birge et al., 1989; Eagleson
et al., 1990) which have been associated with the CETTP
and then the two analyses (Dickson et al., 1992; Marcus
and McDonald, 1992) of the CETTP studies. This
summary is followed by an evaluation of those two
analyses by an independent statistician. The final portion
of this section is a review of four CETTP studies which,
according to Marcus and McDonald (1992), do not
evidence a statistically significant canonical correlation
between toxicity test results and instream indices of
biological community health.
Marcus and McDonald (1992) make the interesting
observation that, "There is, unfortunately, an excess
emphasis by many investigators and reviewers on
significance in assessing statistical results. The question
of primary concern is not whether there is high or low
frequency of significant correlations, but what the degree
of correlation between pairings of each laboratory and field
variable is." Examination of the CETTP studies reveals a
distinct qualitative correspondence between ambient
water toxicity and ecosystem variables. In most of the
CETTP studies there appeared to be biological
impairments in a gradient below discharge points (which
showed toxic effluents) compared to upstream sampling
sites. In most cases, ambient water toxicity at a site was
associated with biological community impairments.
Because of small sample sizes in the CETTP studies,
routine correlative parametric statistics were not applied
to compare bioassessment and toxicity test data.
Statistics are frequently used to demonstrate "proof" of
effect, the threshold of effects being arbitrary. McBride et
al. (1993) conclude that routine application of significance
tests does not extract the maximum information from
environmental data. These authors discuss the advan-
tages of equivalence tests where the investigator must
state what degree of difference is considered a practical
difference. In an equivalence test the null hypothesis is
that the difference in means is greater than some
practically significant value which the tester must state in
advance. They recommend that environmental managers
and scientists focus attention on statistical power (the
probability of rejecting the null hypothesis of no difference
in the test groups when in fact it is false; ideally the level
of power should be high) and decide what is a practical
difference. This practical difference concept could apply
to both the bioassessment and toxicity data.
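The equivalence-test idea can be sketched in a few lines of Python. This is an illustration only, not the procedure of McBride et al. (1993): it uses two one-sided tests with a large-sample normal approximation, and the function name `equivalence_test`, the practical difference `delta`, and the neonate counts are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def equivalence_test(a, b, delta, alpha=0.05):
    """Two one-sided tests (TOST), large-sample normal approximation.

    Null hypothesis: |mean(a) - mean(b)| >= delta, i.e. a practical
    difference exists. Returns (p_value, equivalent); equivalent is True
    when the null hypothesis is rejected at level alpha.
    """
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist()
    p_lower = 1.0 - z.cdf((diff + delta) / se)  # tests H0: diff <= -delta
    p_upper = z.cdf((diff - delta) / se)        # tests H0: diff >= +delta
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Hypothetical Ceriodaphnia neonate counts (not data from any cited study).
upstream = [22, 25, 24, 23, 26, 24, 25, 23, 24, 25]
downstream = [23, 24, 25, 22, 25, 24, 26, 23, 25, 24]

for delta in (2.0, 0.5):
    p, equivalent = equivalence_test(upstream, downstream, delta)
    print(f"delta = {delta}: p = {p:.4f}, equivalent = {equivalent}")
```

The conclusion depends on the practical difference stated in advance: the same data can be declared equivalent for a wide `delta` and not for a narrow one. With samples this small a t-based version would be more appropriate than the normal approximation used here.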
In a book on ecological risk estimation Bartell et al. (1992)
write, "It might also be easier to design experiments or to
monitor natural systems for qualitative endpoints rather
than having to demonstrate statistical differences between
quantitative results. The large variances that typify
ecological experiments may argue for adopting more
qualitative endpoints." Statistics are a valuable tool in our
attempt to understand biological and ecosystem
operations. As we endeavor to comprehend the biological
world, it may be useful, however, to remember that
statistical significance does not guarantee biological
significance and biological significance does not always
equate with statistical significance.
5.2 Associated Studies
5.2.1 South Elkhorn Creek Study
Birge and associates (Birge et al., 1989; 1990) performed
ecological assessments on a stream that received a
point-source discharge. Results of single species toxicity
tests were compared to ecological endpoints. One
objective was to assess the reliability of the laboratory test
results in predicting ecological responses. Ecological
measurements included macroinvertebrate species
richness, abundance, diversity, and functional group
analysis. Toxicity in effluent and ambient water samples
from the different stream sites was assessed using a
fathead minnow embryo/larval 8-d test.
The point-source discharge was from a wastewater
treatment plant (WWTP) into Town Branch Creek. Town
Branch Creek entered into South Elkhorn Creek about 14
km below the WWTP outfall. There were three control
sites, one above the WWTP outfall on Town Branch Creek
and two on South Elkhorn Creek above the confluence
with Town Branch Creek. There were seven sampling
sites at various distances downstream of the discharge
point. The most distant station was 67.8 km downstream
of the WWTP outfall.
Embryo-larval survival in water samples from all three
control sites (upstream of the WWTP) was greater than 90%.
Toxicity tests with WWTP effluent generated data on effect
concentrations (expressed as percent effluent). Hydrology
of the creek was studied so that percent dilution of effluent
could be predicted at each sampling site. Toxicity at sites
downstream of the discharge point reflected the toxicity
predicted by the effluent toxicity test data considering
instream dilution. That is, instream toxicity was reliably
predicted by effluent dilution data. These data are
significant in that they demonstrate that the major
modification of effluent toxicity was stream dilution;
physical and chemical characteristics of the stream did not
appear to mitigate toxicity to any great extent.
Ambient water samples collected at the three sites
immediately below the point of discharge showed
statistically significant toxicity, whereas none of the
reference sites yielded significant toxicity. A decreasing
gradient of toxicity downstream of the discharge point was
evident. Both the fish and invertebrate data suggested
adverse impacts at the three sites immediately below the
discharge point. Below these heavily impacted sites there
tended to be a gradient of increasing diversity of both fish
and macroinvertebrates downstream of the discharge
point. The correlation coefficient (r) between embryo/larval
survival in water samples from the stream sites and
estimated percent effluent at those sites was -0.87 (i.e., the
greater the percent effluent at a site the lower the embryo-
larval survival). The number of fish species (r = -0.83) and
number of invertebrate taxa (r = -0.94) were also inversely
correlated with the estimated percent effluent at a site.
The correlation coefficient between embryo-larval survival
in water samples from the stream sites and number of
invertebrate taxa was r = 0.96, while the value for the
number of fish species was r = 0.92. All of these corre-
lation coefficients were statistically significant. Results of
this study illustrate that the laboratory toxicity test results were
very reliable predictors of instream biological community
responses. Data from this study were included in the
statistical analysis by Dickson et al. (1992).
5.2.2 North Carolina Study
Effluent toxicity test results were compared to indices of
aquatic ecosystem community health at 43 sites on rivers
and streams in North Carolina (Eagleson et al., 1990).
Toxicity tests were performed with both municipal waste
treatment and industrial facilities effluents. The 7-d
Ceriodaphnia test was used to estimate chronic toxicity in
effluents. Instream biological responses were gauged by
surveys of benthic macroinvertebrates above and below
points of discharge. Attempts were made to reduce habitat
type confounding factors, as well as other physical
confounding factors. Care was taken to compare the
results of toxicity tests and field responses at low and
average flow conditions; toxicity decay was also
incorporated into the comparisons.
Results of this study revealed that, if proper consideration
was given to effluent dilution, the USEPA toxicity test
results can be reliable predictors of ecological effects.
Comparisons of upstream and downstream sites with
regard to biological indices were made with the
nonparametric Wilcoxon signed-rank test. If a site
downstream of an effluent discharge point showed
a statistically significant response (degradation)
compared to the reference site above the discharge point,
the site was classified as "instream impact measured." If
there were no differences between upstream and
downstream site biological indices measurements, the site
was classified as "no instream impact measured." When
an effluent sample, diluted to the appropriate instream
waste concentration (IWC), resulted in a statistically
significant response compared to controls, the sample was
designated as "instream impact predicted." If an effluent
sample did not produce statistically significant toxicity, it
was designated as "instream effect not predicted."
The classification system described in the above
paragraph was combined into a contingency table which
is best illustrated in Figure 1.
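The four-way classification can be sketched as a simple tally. The snippet below is an illustration, not the authors' procedure: the function name `classify` and the outcome labels are hypothetical, and the site records are only expanded from the counts reported in Figure 1.

```python
from collections import Counter


def classify(impact_predicted, impact_measured):
    """Map one site's two yes/no results onto the four possible outcomes."""
    if impact_predicted and impact_measured:
        return "predicted and measured"
    if not impact_predicted and not impact_measured:
        return "neither predicted nor measured"
    if impact_predicted:
        return "predicted, not measured"
    return "measured, not predicted"


# (impact_predicted, impact_measured) pairs rebuilt from the Figure 1 counts.
sites = ([(True, True)] * 29 + [(False, False)] * 9
         + [(True, False)] * 3 + [(False, True)] * 2)

table = Counter(classify(p, m) for p, m in sites)
agreement = (table["predicted and measured"]
             + table["neither predicted nor measured"]) / len(sites)

for outcome, count in table.items():
    print(f"{outcome}: {count}/{len(sites)}")
print(f"agreement: {agreement:.0%}")
```

The two agreeing categories together give the overall prediction accuracy; the two disagreeing categories correspond to what are often labeled "false positives" and "false negatives."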
Toxicity test predictions were accurate in 88% of the cases.
If non-effluent anthropogenic factors contributed heavily to
instream biological impacts, one might expect a high
frequency of "false negatives." However, there were only
5% false negatives. If habitat differences or other physical
factors between the reference site and the sites
downstream contributed to those sites being classified as
impaired, one would expect a more equal distribution
between the two different categories with impacted sites.
However, the distribution in the two categories where
instream impact was measured was very unequal (i.e., two
tests showing no toxicity and 29 testing positive for toxicity,
cf., Figure 1).
Although these investigators did not apply statistical analy-
ses to this contingency table, results from Fisher Exact and
Chi-Square tests showed statistical significance (P<0.001).
Therefore, one must reject the null hypothesis that toxicity
test results do not predict biological responses. Even
though there is some potential that confounding factors
(see Section 6.6 below) influenced biological
measurements, it appears that the dominant impairments
were due to wastewater constituents. Assuming that the
ecological indicators were accurate, the results of this
study provide a strong case that the Ceriodaphnia (even
though not indigenous to these stream ecosystems)
toxicity test results were reliable qualitative predictors of
aquatic ecosystem impairments. A more powerful
statistical design would have included more non-impacted
sites, but prior to the ecological surveys the nature of sites
was unknown.
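The Fisher Exact result for the Figure 1 contingency table can be checked with a short sketch. The function below is a plain standard-library implementation of the two-sided test, written for illustration; the name `fisher_exact_two_sided` is hypothetical, and the source does not say which software was actually used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        # Probability that the top-left cell equals x with margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lowest = max(0, row1 + col1 - n)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * tolerance)


# Rows: toxicity predicted / not predicted.
# Columns: instream impact measured / not measured (counts from Figure 1).
p_value = fisher_exact_two_sided(29, 3, 2, 9)
print(f"Fisher exact two-sided p = {p_value:.2e}")
```

For these counts the p-value falls below the 0.001 level cited in the text.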
5.3 Review of CETTP Studies
5.3.1 Dickson et al. Analysis
In 1992 Dickson and colleagues published the results of
a study undertaken to statistically analyze data from the
eight CETTP studies, the South Elkhorn Creek, Kentucky
study (Birge et al., 1989), and a study on the Trinity River,
Texas (Dickson et al., 1989). The intent of the Dickson et
al. (1992) study was to apply a statistical method and
classification approach to all of the above mentioned data
to elucidate relationships between surface water toxicity
test results and ecosystem community responses.
After entering data from all of the studies listed above into
a database, a canonical correlation analysis was
performed to examine the relationship between ambient
Toxicity test predicts instream impact; instream survey measures impact [29/43 = 67%]
Toxicity test predicts no instream impact; instream survey measures no impact [9/43 = 21%]
Toxicity test predicts instream impact; instream survey measures no impact [3/43 = 7%]
Toxicity test predicts no instream impact; instream survey measures impact [2/43 = 5%]
Figure 1. Summary of Eagleson et al., 1990. The categories represent the four possible outcomes when comparing
laboratory effluent toxicity test results to ecological survey data collected at 43 stations.
water toxicity and estimates of biological community
condition. Canonical correlation tests for significant
relationships between two matrices of data. The
bioassessment metrics can be explored and meshed into
a variate for ecosystem condition which in turn is
compared to the toxicity variate composed of the toxicity
data (e.g., from both Ceriodaphnia and larval fathead
minnow tests). A goal of canonical correlation is to identify
a combination of the predictor variables (i.e., toxicological
responses) and response variables (biological community
indices) which have the strongest correlation among all
possible combinations. The output of canonical correlation
includes indicators of the relative importance (sometimes
defined as weights) of each variable to the overall
correlation.
There were two major goals in the Dickson et al. (1992)
study: 1) ascertain whether or not statistically significant
correlations existed between the surface water toxicity
variable and the biological community variable and 2) use
the results of the canonical correlations to identify
important variables. Using the toxicity test and biological
community indices variables, a classification system was
developed to determine the reliability of toxicity test results
as predictors of instream community responses.
A major aspect of the analysis was data collected in the
Trinity River study (Dickson et al., 1989). In that study the
relationship between ambient water toxicity test results and
biological community response was scrutinized at 11 sites
along the river. Reference sites were located above a
WWTP discharge point and the remaining sites were below
the outfall. The relationships between ambient water
toxicity and biological community indices were examined
through time, with sampling and testing in six separate
months. Assessments of ambient water toxicity consisted
of the Ceriodaphnia and larval fathead minnow short-term
test estimates of chronic toxicity. Instream biological
community assessments included fisheries data (richness,
evenness, and an index of biotic integrity) and benthic
macroinvertebrate data (richness and evenness).
Separate canonical correlations were performed with the
toxicity variable compared to the fisheries indices and to
the benthic macroinvertebrate indices; the toxicity variable
was not correlated with a consolidated bioassessment vari-
able consisting of combined fisheries and
macroinvertebrate indices. Statistically significant
(p<0.001) coefficients of determination (r² represents the
proportion of variation in one variable determined by the
variation of the other) were observed for both canonical
(range of r² was 0.38 to 0.59) and robust canonical (range
of r² was 0.38 to 0.94) analyses in all six months of the Trinity
River study for the fisheries and macroinvertebrate mea-
surements. These findings imply that the matrix of toxicity
test results was an effective predictor of instream biological
community responses.
Unfortunately, detailed information on canonical correlation
for the CETTP studies was not presented by Dickson et al.
(1992). In fact, r² values were presented for only the Five Mile
Creek (Mount et al., 1985) and Kanawha River (Mount and
Norberg-King, 1989) studies. Data showing the relative
contributions (i.e., weights) of each of the toxicological and
each of the biological community variables were not
presented for the two CETTP studies. Statistically
significant r² values from the robust canonical correlations were
noted for the Five Mile Creek (r² = 0.81, p = 0.0005) and
Kanawha River (r² = 0.81, p<0.00001) data. These
correlations suggest that toxicity test results from ambient
water samples were reliable predictors of instream
biological community responses.
Based on the canonical analyses, fish species richness
was shown to be an important aquatic ecosystem
response variable. Therefore, Dickson et al. (1992) se-
lected fish richness as the ecological response variable in
all studies where it was available for the next phase of the
analysis. However, two CETTP studies, Kanawha River
(Mount and Norberg-King, 1986) and Ohio River (Mount
et al., 1986c) did not include fish surveys, so benthic
macroinvertebrate richness was substituted as the
biological response variable.
The next step in the analysis was to develop a classifica-
tion system to judge whether or not a site was predicted to
be impacted based on ambient water toxicity data, and
whether or not a site was observed to be impacted based
on instream community metrics.
For the ambient water toxicity data, a low value for test
performance (e.g., low Ceriodaphnia neonate production
or low larval minnow growth) was used to classify a site as
"impact predicted" and a high value for test species perfor-
mance was classified as "impact not predicted." For
instream biological community variables a low value (e.g.,
low species richness) resulted in that site being classified
as "impact observed," whereas a high biological
community value classified the site as "impact not
observed." Rather than establish arbitrary thresholds
(cutoffs) for classification of ambient water toxicity results
into categories, the natural variability of the measured
parameters was incorporated into the system. Because
the measure of toxicity consisted of the sum of a subset of
the toxicity variables, with each of these variables
standardized, and with the assumption that the majority of
the observations were normally distributed, the authors
reasoned that the sum of a set of these variables should
have an approximately normal distribution. Assuming
independence of variables, the authors reasoned that the
sum divided by the square root of the number of variables
being summed should have a standard normal distribution.
For these reasons, Dickson et al. (1992) concluded that a
classification scheme could be defined such that a site
would be classified "impact predicted" if the normalized
toxicity measure fell below a threshold obtained from
percentiles of the standard normal distribution.
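The normalized toxicity measure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used by Dickson et al. (1992): each variable is assumed to be standardized as a z-score across sites (the exact standardization is not given here), the function names are hypothetical, and the endpoint values are invented for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def standardize(values):
    """Convert raw endpoint values to z-scores across sites."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def impact_predicted(endpoints, percentile):
    """Return one True/False flag per site.

    endpoints: one list per toxicity variable, aligned by site.
    A site is flagged when the sum of its standardized values, divided by
    the square root of the number of variables, falls below the chosen
    percentile of the standard normal distribution.
    """
    z_by_variable = [standardize(v) for v in endpoints]
    k = len(z_by_variable)
    threshold = NormalDist().inv_cdf(percentile)
    scores = [sum(site) / sqrt(k) for site in zip(*z_by_variable)]
    return [score < threshold for score in scores]


# Hypothetical values for six sites (not data from the cited studies).
neonates = [25, 24, 8, 12, 23, 26]             # Ceriodaphnia neonates per female
growth = [0.45, 0.44, 0.20, 0.30, 0.43, 0.46]  # larval fathead minnow weight, mg

print(impact_predicted([neonates, growth], 0.05))
print(impact_predicted([neonates, growth], 0.95))
```

Raising the percentile from 0.05 to 0.95 moves the threshold from about -1.64 to about +1.64, so many more sites are flagged as "impact predicted."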
Controversy surrounds the biological metrics and the
amount of change or difference in these metrics which
represents impairment. Therefore, as with the toxicity test
data, Dickson et al. (1992) used the biological community
data to determine a classification. Sites that revealed a
biological response below a corresponding Poisson distri-
bution percentile (representing counts of the number of fish
and invertebrates at a site) were classified as "impact ob-
served."
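A Poisson percentile cutoff of this kind can be sketched as below. The source does not state how the Poisson mean was chosen, so the mean `lam` used here (for example, an expected taxa count at unimpacted sites) is purely an assumption, as are the function names and the example counts.

```python
from math import exp


def poisson_quantile(lam, q):
    """Smallest integer k with P(X <= k) >= q for X ~ Poisson(lam)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    k = 0
    pmf = exp(-lam)   # P(X = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= lam / k  # recurrence: P(X = k) = P(X = k - 1) * lam / k
        cdf += pmf
    return k


def impact_observed(count, lam, q):
    """Flag a site whose count falls below the q-th Poisson percentile."""
    return count < poisson_quantile(lam, q)


# Hypothetical example: assumed mean of 20 taxa, 5th percentile cutoff.
cutoff = poisson_quantile(20.0, 0.05)
print("cutoff:", cutoff)
print("site with 5 taxa flagged:", impact_observed(5, 20.0, 0.05))
print("site with 20 taxa flagged:", impact_observed(20, 20.0, 0.05))
```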
Two misclassification errors were possible in this scheme:
1) misclassifying a nonimpacted site as
impacted or 2) misclassifying an impacted site as
nonimpacted. The percentile selected for the threshold de-
pends on which misclassification error is of greater con-
cern. If the desire is to keep the error rate of classifying an
impacted site as nonimpacted low, then one might select
a 95th percentile threshold. To keep the error rate of clas-
sifying a nonimpacted site as impacted low, the 5th
percentile could be the threshold.
The classification scheme described above produced two-
way contingency tables for predicted and observed
impacts at aquatic ecosystem sites. Fisher's test was used
to evaluate the accuracy of toxicity test predictions of
instream impacts. The classification scheme was applied
to the CETTP and Trinity River data sets, as well as to the
combined data sets. Using both the 95-95 and the 5-5
percentile cutoffs, strong, statistically significant qualitative
relationships were demonstrated between ambient water
toxicity and instream biological response (impairment).
The contingency table for the 95-95 percentile threshold
using the combined data sets is reproduced below.
Figure 2 shows the data from a contingency table
summarizing the data analyzed by Dickson et al. (1992).
The total percentage of sites in all of the CETTP and Trinity
River studies where toxicity test results reliably predicted
instream biological findings was 84.4%. Fisher's Exact test
revealed that toxicity test results effectively (p = 0.003)
predicted instream biological responses. The low
percentage (6.2%) of "false negatives" suggests that
factors other than toxicity were not major contributors to
biological community impacts.
These data can be grouped and examined in a different
manner. Grouping by whether sites were biologically
impacted or not yields totals of 138 and 22, respectively.
For a stronger statistical design, a much larger number of
potentially unimpacted sites would be necessary.
However, the condition of sites was unknown prior to the
biological surveys. Looking only at the impacted sites,
ambient water toxicity tests predicted impacts correctly in
93% of the cases with 7% "false negatives." Examination
of the non-impacted site data reveals that toxicity tests
were reliable predictors in 32% of the cases, with 68%
"false positives." This potential (see discussion on false
positives/negatives above) high rate of "false positives" is
disturbing and confirms that the results of a single toxicity
test should not be used to characterize wastewater or
ambient water toxicity.
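The percentages discussed above can be recomputed directly from the four counts given in Figure 2. The short sketch below is only a worked arithmetic check; the variable names are illustrative.

```python
# Counts from Figure 2 (Dickson et al., 1992 analysis).
both = 128         # toxicity test predicts impact, survey measures impact
tox_only = 15      # toxicity test predicts impact, survey measures no impact
impact_only = 10   # toxicity test predicts no impact, survey measures impact
neither = 7        # toxicity test predicts no impact, survey measures no impact

total = both + tox_only + impact_only + neither
impacted = both + impact_only        # sites where the survey measured impact
non_impacted = tox_only + neither    # sites where it did not

rates = {
    "overall agreement": (both + neither) / total,
    "'false negatives', all sites": impact_only / total,
    "impacted sites correctly predicted": both / impacted,
    "'false negatives', impacted sites": impact_only / impacted,
    "non-impacted sites correctly predicted": neither / non_impacted,
    "'false positives', non-impacted sites": tox_only / non_impacted,
}

print(f"sites: {total} (impacted {impacted}, non-impacted {non_impacted})")
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```

Grouping by survey outcome shows why the overall agreement can be high while the rate of "false positives" among non-impacted sites is also high: only a small share of the sites were non-impacted.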
80%   Toxicity test predicts instream impact; instream survey measures impact [128/160]
9.4%  Toxicity test predicts instream impact; instream survey measures no impact [15/160]
6.2%  Toxicity test predicts no instream impact; instream survey measures impact [10/160]
4.4%  Toxicity test predicts no instream impact; instream survey measures no impact [7/160]
Figure 2. Summary of the Dickson et al. (1992) analysis. The categories represent the four possible outcomes when
comparing laboratory toxicity test results on ambient water samples from stream sites with ecological survey data from
the same sites. Total number of stream sites is 160.
The procedures used in this study required that gradients
of ambient water toxicity and of biological community re-
sponses exist. The statistical analyses performed revealed
that the frequency of observing instream impairments
when toxicity test results predicted an impact was
significantly greater than the overall frequency of impair-
ments observed. The analysis by Dickson et al. (1992)
provides a compelling qualitative relationship between
ambient water toxicity and indigenous species responses.
Toxicity test endpoints identified as effective qualitative
predictors of aquatic ecosystem responses were
Ceriodaphnia neonate production and larval fathead
minnow growth.
5.3.2 Marcus and McDonald Analysis
Marcus and McDonald (1992) also analyzed the CETTP
and Elkhorn Creek (Birge et al., 1989) data sets using ca-
nonical correlation. In this analysis, the null hypothesis
was that no correlation existed between the matrix of
instream biological community measurements and the
matrix of toxicity test results (i.e., neonate production by
Ceriodaphnia and larval fathead minnow growth).
The results of their analysis showed a statistically signifi-
cant canonical correlation occurred in four of the eight
CETTP site studies (Scippo Creek: r = 0.93, Naugatuck
River: r = 0.78, Back River: r = 0.996, and Kanawha
River: r = 0.79) as well as in the Elkhorn Creek (Birge et
al., 1989) data set (r = 0.99). This translates to five of the
nine data sets (streams/rivers) analyzed. Relatively high
values were found for the canonical correlation coefficients.
For all but two of the nine data sets the coefficients
indicated a greater than 50% (r > 0.7) relationship
between the sets of laboratory toxicity test results and the
instream biological variables. Marcus and McDonald
(1992) emphasized these high correlation coefficients and
downplayed statistical significance. Except for the
Naugatuck River study (Mount et al., 1986a), canonical
variable weights (weights refer to the relative importance
of each variable to the overall correlation) were not shown
in this publication, hindering the ability of the reader to
interpret the statistical analysis.
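The link between "greater than 50%" and r > 0.7 can be illustrated with a short Python sketch. It assumes that the percentage refers to shared variance, that is, the squared correlation coefficient; that reading is an interpretation rather than something stated by Marcus and McDonald (1992). Only the five statistically significant coefficients quoted above are used, and the names are illustrative.

```python
from math import sqrt

# Canonical correlation coefficients quoted in the text above
# (Marcus and McDonald, 1992); the dictionary itself is illustrative.
reported_r = {
    "Scippo Creek": 0.93,
    "Naugatuck River": 0.78,
    "Back River": 0.996,
    "Kanawha River": 0.79,
    "Elkhorn Creek": 0.99,
}


def shared_variance(r):
    """Squared correlation: fraction of variance shared by the two sets."""
    return r * r


# Smallest r for which the shared variance exceeds 50%.
threshold_r = sqrt(0.5)
print(f"r must exceed {threshold_r:.3f} for more than 50% shared variance")

for site, r in reported_r.items():
    print(f"{site}: r = {r}, shared variance = {shared_variance(r):.1%}")
```

Under this reading, each of the five quoted coefficients corresponds to more than half of the variance being shared between the toxicity and instream variable sets.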
Marcus and McDonald (1992) concluded that, "Although
future improvements will be made in these test methods,
and better methods may be developed, we conclude that
at this time these two toxicity test methods (i.e., the
Ceriodaphnia and larval fathead minnow short term esti-
mates of chronic toxicity) can be potentially useful
assessment tools for screening and monitoring."
In the analysis of the CETTP data, Marcus and McDonald
(1992) found that the ambient toxicity measures often
showed greater relationships to instream biological mea-
surements than expected by chance. They observed that
potentially important relationships appeared often. "Our
analyses of the CETTP data indicate that results from the
tests of ambient water toxicity often contain potentially
important biological information about relationships con-
tained in variables of these field variables." (Marcus and
McDonald, 1992). In other words, qualitative relationships
appeared often in the CETTP data.
Marcus and McDonald (1992) reported that Ceriodaphnia
neonate production generally had the greatest incidence
of significant correlations to biological community mea-
sures (greatest potential for predicting impairments).
Based on simple correlation analysis of the CETTP data,
Parkhurst (1996) suggested that ambient toxicity did not
show a strong relationship with measures of instream
biological communities. Nonetheless, a statistically significant
relationship was noted between ambient water toxicity and
instream biological indices in five of nine CETTP and asso-
ciated studies. Furthermore, only sublethal endpoints from
the toxicity tests were used in the correlation analysis; that
is, Parkhurst (1996) omitted lethality data from his analysis.
5.4 Independent Evaluation of Statistical Analyses
Smith (1994) evaluated the appropriateness of the statistical
methods used by Dickson et al. (1992) and Marcus and
McDonald (1992), along with the major differences between
them. In this review, Smith (1994) was not convinced that
canonical analysis was the optimal statistical approach for
examining the CETTP and Trinity River data, because
canonical correlation assumes linear
relationships. Note that Marcus and McDonald (1992) did
address the linearity question within and between the sets
of variables. Smith suggests that there are many cases
where biological community parameters have been shown
to have nonlinear relationships with toxicity. Furthermore,
canonical correlation focuses on linear combinations of
toxicity and instream response variables that correlate
maximally. Smith concluded that, "It is possible that the
most ecologically meaningful relationships between the
toxicity tests and instream responses are not represented
by maximal correlations."
As indicated above, Marcus and McDonald (1992) did not
provide canonical variable weights in their publication, except
for one analysis, rendering interpretation of their appraisal
difficult. Dickson et al. (1992) presented canonical
variable weights for the Trinity River study, but not for the
CETTP studies. Smith observes that examination of the
canonical variable weights, especially for two (February
and June) of the six months of Trinity River sampling data,
calls into question the usefulness of the canonical
procedure for analyzing the CETTP and associated studies
data. More specifically, toxicity test variable weights and
bioassessment variable weights sometimes had opposite
signs (plus or minus). When both toxicity and
bioassessment weights have the same sign (plus, plus or
minus, minus) the data indicate that increased larval
fathead survival/growth and/or Ceriodaphnia
survival/neonate production were correlated with greater
diversity/density in the biological community measure-
ments. However, when the signs were opposite, the
indication is that an increase in one variable was
accompanied by a decrease in the other variable (e.g.,
higher growth/reproduction with lower diversity/density).
Biologically, the opposite signs appear inconsistent with
the expected relationship between toxicity and community
parameters.
Smith suggests analyzing the CETTP and associated
studies using separate analyses of the toxicity and
instream response data, possibly ordination techniques.
He commented that, "These analyses would provide
insight into the relationships and patterns shown by toxicity
data alone and the instream response data alone. From
these analyses, I would produce interpretable variables
summarizing the different patterns observed (for both
toxicity and response variable sets separately). I would
then correlate the summary variables for the toxicity tests
with the summary variables for the instream responses
using multiple regression. Each regression analysis would
involve an instream response summary variable as the
dependent variable, and the toxicity test summary
variables as the independent variables. Bivariate plots
would be useful to see the nature of the relationships and
determine if nonlinearity needs to be taken into account
when using the analytical tools."
Assuming that the variables used to classify impairments
were sufficient, Smith indicated that the conclusions made
by Dickson et al. (1992) from their classification system
were reasonable.
Data from only two CETTP studies were analyzed in com-
mon by the two different groups of investigators. Both sets
of authors reported statistically significant canonical corre-
lations (but the correlation coefficients differed) for the
Kanawha River data set. Dickson et al. reported a statisti-
cally significant canonical correlation coefficient for the
Five Mile Creek data set, whereas Marcus and McDonald
(1992) did not. The differences between the two analyses
probably related to the fact that Dickson et al. (1992) did
not "mesh" the bioassessment fish and macroinvertebrate
data whereas Marcus and McDonald did.
From Smith's review, it is clear that there are various ways
to statistically analyze studies which attempt to examine
the relationship between toxicity test results and instream
biological community responses. Data can be used or
grouped in various arrays such that the outcome of an
analysis can be very different.
5.5 Review of CETTP Studies in Which A
Significant Correlation Was Not Observed
In the Marcus and McDonald (1992) canonical analysis, four
of the CETTP studies, Ottawa River (Mount et al., 1984),
Five Mile Creek (Mount et al., 1985), Skeleton Creek
(Norberg-King and Mount, 1986), and Ohio River (Mount
et al., 1986c) did not produce a statistically significant cor-
relation between ambient water toxicity test results and
instream biological community parameters. The nature of
this type of analysis can obscure valuable pieces of data,
as well as informative observations. For this reason, in-
structive aspects of these four studies are summarized
below as the studies provide very useful information which
was not revealed by any canonical correlation analysis.
5.5.1 Ottawa River Study
The CETTP Ottawa River study included three discharges.
The most upstream was a sewage treatment plant (STP),
next was the refinery, and the last discharge was a chemi-
cal manufacturing plant. Outfalls from all of these facilities
were within a 1.3 km range on the river. Ecological
surveys were performed twice (1982 and 1983) at nine
different sites on the river. Two sites were upstream of the
three outfalls. Sites 2 and 3 were immediately above and
below the STP outfall, respectively. Sites 3 and 4 were
immediately above and below the refinery outfall,
respectively. Sites 4 and 5 were immediately above and
below the chemical plant outfall, respectively. Sites 6, 7,
8, and 9 were approximately 6.8, 13.1, 31.7, and 57.8 km
downstream of the chemical plant outfall. If the discharges
from these plants were impairing the Ottawa River
ecosystem, one would expect that sites 3, 4, 5, and
perhaps 6 would appear degraded compared to sites 1
(above discharges) and 9 (most distant downstream).
The STP discharge contributed approximately 72 to 82%
of the Ottawa River flow. The refinery and chemical plant
effluents contributed about 17 to 32% and 8 to 10% of the
river flow, respectively.
Seven-day early life stage toxicity tests with larval fathead
minnows and Ceriodaphnia were performed on effluents
from the three facilities and on surface water samples col-
lected at the nine river sites. Benthic macroinvertebrate
and fish population indices were used to assess instream
biological community condition. Table 1 summarizes the
toxicity testing data from the Ottawa River tests.
Examination of the benthic macroinvertebrate diversity,
community loss index, and dominant taxa data suggests
that the sites immediately downstream of the outfalls were
impaired compared to sites 1 and 9. Sites 3, 4, 5, and 6
appeared the most impacted; recovery was not evident
until site 8 (31.7 km downstream of the chemical plant
outfall).
The fish population data generally agreed with
macroinvertebrate results. In the 1982 sample, sites 4, 5,
6, and 7 were characterized by zero to 8 species and a
total of approximately 50 individuals at all four sites. At all
the other sites there were 11 to 18 species with the
number of individuals in the thousands. For the 1983
sampling, sites 3, 4, 5, and 6 appeared to be the most
impacted with 1 to 5 species and low total counts. The
remainder of the sites were characterized by 10 to 23
species and high total counts.
Results from this investigation revealed a qualitative
correspondence between effluent and ambient water toxicity, as
well as between ambient water toxicity and ecosystem
responses at sites downstream of effluent discharge
points. Although statistical analyses were not performed,
there was distinct correspondence between ambient water
toxicity and biological community responses.

Table 1. Toxicity testing summary for the Ottawa River study (Mount et al., 1984)

Effluent
  Larval Fathead Minnow Tests
    STP:            No significant toxicity 1982, 1983
    Refinery:       Significant toxicity in 50% effluent 1982; 100% effluent 1983
    Chem. facility: No significant toxicity in 1982; significant toxicity in 1% effluent in 1983
  C. dubia Tests
    STP:            Significant toxicity in 10% effluent 1983
    Refinery:       Significant toxicity in 10% effluent 1983
    Chem. facility: No significant toxicity
Ambient Water Toxicity
  Fathead Minnow Test
    Significant toxicity at sites 3 through 8 compared to sites 1 and 9 (1983)
  C. dubia Test
    Significant toxicity at sites 3 through 6 compared to sites 1 and 9 (1982);
    significant toxicity at sites 3 through 7 compared to sites 1 and 9 (1983)

5.5.2 Five Mile Creek Study
The Five Mile Creek study included three dischargers, two
coke plants and a WWTP. Nine study sites were located
along the creek: two sites above the first point of discharge
and the remainder downstream of at least one discharge
point. Coke plant #1 outfall was 2.3 km upstream of coke
plant #2 and, in turn, coke plant #2 was 10.7 km upstream
of the WWTP outfall.
The 7-d larval fathead minnow and Ceriodaphnia toxicity
tests were used to assess toxicity in ambient water
samples from each of the sites, as well as in effluent
samples. Ecological surveys were conducted at all sites
in February and October; effluent and ambient water
toxicity tests were also performed during these months.
The effluent LOEC for the larval fathead tests was 1% and
3% in October and February, respectively, for coke
plant #1. The coke plant #2 effluent LOEC for October and
February was 30% and 10%, respectively. In February,
significant toxicity was seen in the larval fathead test at the
two sites below the coke plants. Ceriodaphnia tests
conducted during February were not reliable because of
problems in the culture population. In October, the LOEC
in Ceriodaphnia tests was 10% effluent for coke plant #1
and 30% effluent for coke plant #2. In general, the WWTP
effluent appeared to contain little toxicity in either of the
toxicity tests.
There were fewer benthic macroinvertebrate taxa and
lower density at sites immediately below the coke plant
outfalls, but the data were not conclusive. Likewise, no
consistent pattern was seen in zooplankton data. At sites
above the coke plant outfalls 4 to 6 (total count >100) and
10 to 12 (total count >1,600) fish species were counted in
February and October, respectively. In February, 0 to 2
(total count = 9) fish species were noted at the sites
immediately below (500 m) the outfalls of the coke plants.
In October, 1 to 8 (total count = 83) fish species were
counted below the outfalls of the coke plants. This paucity
of fish species and numbers of individuals below the coke
plant discharge points suggests that those effluents
adversely affected fish life.
To understand this study it is important to note that the two
coke plant effluents contributed a relatively low percentage
of stream flow: less than 1% for plant #1 and usually not
more than 8% for plant #2. In other words, the instream
waste concentration (IWC) for these discharges was fairly
low. Based on the results of the effluent toxicity tests,
acceptable effluent concentrations (AECs) were calculated for
the effluents of both coke plants. Comparison of the IWCs
to AECs revealed that the IWC seldom exceeded the AEC,
with the exception of the sites immediately downstream of
the two outfalls. These were the sites where there was a
paucity of fish species and numbers. While the coke plant
effluent (as well as ambient water) toxicity qualitatively
predicted ecosystem impairments, the effluent tests tend-
ed to "underestimate" fish population impairment. That is,
some would suggest the effluent toxicity tests yielded
"false negatives"!
That the canonical correlation analysis of Marcus and Mc-
Donald (1992) did not "recognize" the specifics described
above is not surprising since their analysis consolidated
toxicity testing as well as bioassessment data. The only
significant impairments in Five Mile Creek appeared to be
on fish populations immediately downstream of coke plant
discharges. In the Dickson et al. (1992) study (where
fishery data were correlated with the consolidated toxicity
data) a statistically significant canonical correlation coeffi-
cient between toxicity and bioassessment data was noted
in the Five Mile Creek study.
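The comparison of IWCs to AECs described for the coke plant effluents can be sketched as follows. This is a minimal illustration, assuming that the IWC is the effluent's percentage share of total flow below the outfall and that an impact is anticipated wherever the IWC exceeds the AEC. The flows and AEC values are hypothetical placeholders, not data from the Five Mile Creek study, and the function names are invented for the example.

```python
def iwc_percent(effluent_flow, upstream_flow):
    """Instream waste concentration: effluent as a percentage of the
    total flow immediately below the outfall (same flow units for both)."""
    return 100.0 * effluent_flow / (effluent_flow + upstream_flow)


def predicts_impact(iwc, aec):
    """Assumed decision rule: an impact is anticipated when the
    instream concentration exceeds the acceptable concentration."""
    return iwc > aec


# Hypothetical example: effluent makes up 8 of 100 flow units.
iwc = iwc_percent(effluent_flow=8, upstream_flow=92)

# Two hypothetical AEC values (percent effluent) for comparison.
for aec in (5.5, 17.3):
    verdict = "impact predicted" if predicts_impact(iwc, aec) else "no impact predicted"
    print(f"IWC = {iwc:.1f}%, AEC = {aec}%: {verdict}")
```

Because the IWC falls as an effluent is progressively diluted downstream, a rule of this kind would flag only the sites nearest an outfall when the effluent is a small share of stream flow.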
5.5.3 Skeleton Creek
The Skeleton Creek study consisted of ten sites where
ecological surveys (Norberg-King and Mount, 1986) were
performed and water samples collected for toxicity analy-
sis. Sites were on Skeleton Creek or its tributary, Boggy
Creek. During the study there were two major discharges,
a refinery on Boggy Creek and a fertilizer manufacturing
facility on Skeleton Creek a short distance below the con-
fluence with Boggy Creek. On Boggy Creek there were
two sampling sites above the refinery discharge point and
one 200 meters below this point. On Skeleton Creek there
was one site above the confluence with Boggy Creek and,
thus, above the fertilizer plant discharge point. There were
also sites immediately below the confluence with Boggy
Creek and 300 meters below the outfall of the fertilizer
plant. Five sites were at various distances downstream of
the fertilizer plant on Skeleton Creek.
The 7-d larval fathead minnow and Ceriodaphnia toxicity
tests were used to assess effluents, as well as ambient
water samples collected at each of the sites. Larval
fathead minnows were more sensitive than cladocerans to
the effluents from both the refinery and the fertilizer plant.
Ten percent effluent from both facilities yielded statistically
significant larval fathead responses.
Statistically significant larval fathead minnow mortality was
seen only in the ambient water sample collected immedi-
ately below the fertilizer plant outfall. Statistically
significant larval fathead minnow growth inhibition was
seen only in ambient water samples collected immediately
below the refinery and fertilizer plant outfalls.
These toxicity test results were consistent with the fish
population data; the site immediately below the fertilizer
plant was the only station where there were no fish.
The fact that there were no fish collected at the site below
the fertilizer plant and that the ambient water sample from
this site was the only ambient sample to cause significant
larval fathead mortality (effluent from this facility was also
the most toxic to larval fish) suggests an effective corre-
spondence between toxicity test results and instream bio-
logical responses. That this relationship was lost in the
canonical matrices (Marcus and McDonald, 1992) of
toxicity and bioassessment metrics is not surprising.
In a study done at the same time as USEPA's, Burton and
Lanza (1987) reported that microbial assays revealed
toxicity in ambient waters below the two discharge points
and that these toxicity results were inversely correlated
with instream biological community data.
5.5.4 Ohio River
In this study, a 12 km segment of the Ohio River was in-
vestigated. Within the study area there was a steel mill
with multiple outfalls and a WWTP. This study included
eight sampling sites. One site was located above the steel
mill and WWTP outfalls. Other sites were situated immedi-
ately upstream and downstream of the outfalls. The last
river site was approximately 2.5 km downstream of the last
steel mill outfall.
Planktonic and benthic macroinvertebrate data were col-
lected at each site only once. Ambient water samples from
these sites were tested only once with the 7-d larval
fathead minnow and Ceriodaphnia toxicity tests; effluents
were not tested.
None of the surface water samples yielded significant
toxicity to Ceriodaphnia in a 7-d test. The larval fathead
minnow toxicity test results were variable and inconsistent.
Examination of the plankton data revealed little correspon-
dence to points of discharge. The benthic
macroinvertebrate data indicated possible impacts only at
sites immediately below steel mill outfalls. These potential
impacts were not predicted by the Ceriodaphnia or larval
minnow toxicity tests. If the instream biological responses
were ecologically meaningful, they were underestimated
by the USEPA toxicity tests (i.e., the tests yielded false negatives).
Given the above observations, it is not particularly surprising
that the canonical correlations (Marcus and McDonald,
1992) did not identify a statistically significant relationship
between toxicity test results and instream biological
measurements, because neither varied greatly. Therefore,
failure to find a significant canonical correlation in this
study should not be used to discredit the USEPA toxicity
tests, since there was little gradient in either the toxicity or
biological community variables.
5.5.5 General Comments Regarding the Four CETTP
Studies Summarized
After reviewing the four CETTP studies in which the
Marcus and McDonald (1992) canonical correlation did not
find a statistically significant correlation between a matrix
of toxicity test results and a matrix of bioassessment met-
rics, it is not surprising that a statistically significant rela-
tionship was not identified. Moreover, in three of the
studies, consolidation of the data obscured clear relationships
between effluent/ambient water toxicity and instream
measurements. Furthermore, Marcus and McDonald
(1992) argued against placing too much value on the use
of statistical significance and emphasized the high correla-
tion coefficient values identified in their analysis of the
CETTP and associated studies data. In the remaining study,
significant toxicity was infrequent and differences in instream
parameters were minimal, conditions that are not ideal for
demonstrating statistically significant correlations. It would be
incorrect to suggest that the four CETTP studies described above
constitute evidence that the USEPA toxicity tests are
unreliable qualitative predictors of instream biological com-
munity responses.
6.0 Criticisms of CETTP and Associated
Studies
A group of authors (Parkhurst et al., 1990; Marcus and
McDonald, 1992; Parkhurst, 1995, 1996) has criticized the
CETTP and associated studies. The criticisms generally
relate to design and analysis considerations, most of which
are stated below. These publications consist of criticisms
of the CETTP and associated studies and do not provide
additional data regarding the predictiveness of the USEPA
toxicity test results. Moreover, empirical evidence which
suggests that the USEPA toxicity tests are not reliable
qualitative predictors of instream impairments has not been
provided. Criticisms of the CETTP studies are stated and
discussed below.
6.1 CETTP Studies Compared Ambient Water Test
Results with Bioassessment Variables
A major criticism of the CETTP studies was that compari-
sons were made between ambient water rather than
effluent toxicity test results and biological community re-
sponses. The implication appears to be that abiotic and
biotic factors other than dilution can mitigate effluent
toxicity. Parkhurst (1995) suggests that a missing link in
these studies was to connect surface water with effluent
toxicity.
Discussion: Effluent toxicity was measured in seven of the
eight CETTP studies and, although statistical correlations
were not performed, effluent toxicity corresponded with
ambient water toxicity and ecological responses. This
criticism fails to recognize that the most probable cause
(critics point this out, see Section 6.6) of toxicity in the
streams/rivers investigated was discharged effluents.
In the seven CETTP studies where effluent toxicity was
measured, ambient water was not significantly toxic at
sites above discharge points (or it was less than below
discharge points). Where effluent toxicity was noted,
ambient water toxicity was generally seen at sites below
the discharge point when dilution was taken into consid-
eration. Furthermore, in most of the seven CETTP studies
when effluent toxicity was identified there tended to be
gradients (i.e., greatest toxicity immediately below
discharge points, with progressively lower levels of toxicity
at sites downstream) of ambient water toxicity below points
of discharge. Also, where there was effluent toxicity there
was generally evidence of instream impairments below the
discharge points when dilution was taken into
consideration. Although statistical correlations were not
performed between effluent and ambient water toxicity (or
instream biological measurements), it seems that effluent
was responsible for ambient water toxicity and ambient
water toxicity was the major cause of instream
impairments.
6.2 Nonrandom Selection of Study Areas and
Sites
Another major criticism of the CETTP studies is that study
areas and sampling sites were not selected randomly.
Because of this, the contention is that findings cannot be
extrapolated using statistically based induction to other
aquatic ecosystems and, secondly, there was not a strong
statistically based experimental design. A corollary to this
criticism is that USEPA intentionally selected rivers and
streams where there were likely to be water quality prob-
lems caused by discharged effluents.
Discussion: This criticism has merit and should be con-
sidered when evaluating the CETTP data. Design of the
CETTP studies was not perfect from a statistical analysis
standpoint. More upstream (control) sites would have
been desirable. Some argue that all sites below a dis-
charge point represent pseudoreplicates. However, as a
practical matter limited funds and other resources require
regulatory agencies to focus on areas where there are
likely to be environmental problems so there can be
remediation and restoration. Indeed, the idea was to study
streams potentially impacted by effluent toxicity. It has not
been the focus of regulatory agencies to study areas which
are pristine or which have a low probability of water quality
problems. Moreover, the intent of the CETTP studies was
to examine the relationship of probable effluent toxicity and
potential instream toxicity, as well as biological community
responses.
Random selection of study areas would have resulted in
investigations of rivers and streams where there were no
discharges and possibly waterbodies known to receive
effluents free of toxicity. A recommendation has not been
advanced as to the number and types of aquatic
ecosystems which should be studied before a consensus
can be achieved on the effectiveness, or lack thereof, of
single species toxicity test results in predicting qualitative
ecosystem responses. USEPA (1991) suggests that it is
reasonable to assume that, in the absence of data showing
otherwise, the relationship between ambient water toxicity
and aquatic ecosystem impacts is independent of
waterbody type.
Random selection of sites on a stream would result in con-
founding factors. For comparison among sites or to a
reference site, all sites should be equivalent, including
physical/chemical habitat and substrate; with this control
the major variable would be the potential of chemical
toxicity from point or nonpoint sources. Random selection
of sites could also introduce the confounding factor of non-
chemical, anthropogenic effects on biotic communities.
The criticisms that more sites (controls) upstream of discharge
points were necessary, that more non-impacted
sites were necessary for an acceptable statistical design,
and that all sites below discharge points were
pseudoreplicates have some merit, but also disregard
some facts and observations. In design of the CETTP and
associated studies, it was unknown whether or not sites
downstream of discharge points would show ambient water
toxicity; whether the ecological surveys would indicate
that these sites were impacted was also
unknown. While, from a purely statistical standpoint, the
sites downstream of discharge points could be considered
pseudoreplicates, this criticism fails to recognize that in a
majority of the CETTP and associated studies there were
progressive gradients of decreasing ambient water toxicity
below discharge points which corresponded with progres-
sive gradients of "improvements" in biological community
indices.
The criticism that there was limited statistical correlative
analysis in the original CETTP publications is valid. However,
as indicated above, this was a consequence of
relatively small sample sizes (i.e., number of sites in each
study). This statistical analysis criticism has been
addressed in part by the Dickson et al. (1992) and Marcus
and McDonald (1992) analyses. As indicated in
Section 5.4 above, there are other ways that the CETTP
data could be grouped and statistically analyzed.
6.3 Use of the Most Sensitive Toxicity Test
Results
Marcus and McDonald (1992) called attention to the use
in two CETTP studies (Norberg-King and Mount, 1986;
Mount et al., 1986) of data from the most sensitive of two
toxicity tests to relate with the most sensitive
bioassessment measurements.
Discussion: The USEPA procedure has biological and
statistical limitations; however, it also has some logic from
an ecological perspective. Because sensitivities of
different test organisms vary with the toxic chemical or
combination of chemicals, the occurrence and combina-
tions of toxic chemicals can vary along a stream, and
assemblages of organisms change along a stream, it
seems ideal to test with a suite of species and then relate
these data to instream biological community variations.
Likewise, different components of the instream
communities are likely to respond to different chemicals or
combinations of chemicals.
The limited responses (only two USEPA toxicity tests)
tested in the laboratory toxicity tests compared to the
multiple responses in aquatic ecosystems necessitate
that all possible relationships be explored. Therefore,
while recognizing the limitations of using maximum re-
sponses, they may provide insights into interactions of
toxicity and community responses. Because of the
extremely limited number of species and biological end-
points represented in the USEPA toxicity tests there has
been a tendency for regulatory conservatism (use of
results of the most sensitive species). Whether or not this
conservatism is completely justified remains to be deter-
mined; however, the results of this review show that labo-
ratory single species tests more frequently yield reliable
predictions, or underestimates, of biological community
responses than overestimates of impacts.
6.4 Relationship Between Toxicity Test Results
and Instream Biological Measurements Relied
Heavily on High Magnitude Toxicity
Another criticism of the CETTP conclusions is that the
correspondence between ambient water toxicity and
ecosystem community impairments relied extensively on
areas and sites where toxicity was relatively high.
Discussion: There is merit to this criticism, but the overall
significance is uncertain since toxicity theory is based on
a concentration-response relationship (i.e., a greater
response with higher toxicity). There should be no surprise
that higher levels of toxicity (enough to cause lethality) in
ambient or effluent water samples can yield measurable
responses in ecosystem parameters. Furthermore, biologi-
cal responses, as all measurements, are less reliable near
detection limits. "False positives" are of greater concern
in situations where surface water or effluent toxicity is relatively
low and near detection limits. Reliably detecting
biological community impairments when the concentrations
of toxic chemicals are near the effect thresholds
is difficult; detection of such impairments also will be
obscured by the complexity and natural variability in
aquatic ecosystems. It should be emphasized that, in the
CETTP studies, toxicity test "predictions" were based on
effects (including sublethal) in the 7-d early life stage tests.
6.5 Temporal Repeatability of the Ambient Water
Toxicity/Biological Response Was Not
Demonstrated
The CETTP studies did not confirm through time the corre-
spondence of surface water toxicity with instream
biological variables.
Discussion: There is some validity in this criticism, yet
there can be wide temporal variations in effluent and ambi-
ent toxicity. Temporal variations in the relationships be-
tween toxicity and biological community parameters were
considered in some of the CETTP and associated studies
(Dickson et al., 1989; Mount et al., 1984; Mount et al.,
1985). Defining the magnitude, duration, and frequency of
effluent/ambient water toxicity is important. Understanding
natural seasonal variations in aquatic biological communi-
ties is essential when attempting to relate these variables
to potential controlling factors. Significant variations in
stream flow and physico/chemical factors can also influence
the relationship between effluent toxicity and biological
community responses and must be considered in describing
a temporal relationship. Failure to demonstrate
a statistically significant correlation between
effluent/ambient water toxicity and biological community
responses throughout the year does
not discount the possibility of ecosystem impairments from
toxic chemicals (from point or nonpoint sources) during
portions of the year. The issue of temporal repeatability of
the relationship between effluent or ambient water toxicity
and biological community responses has been addressed
by Dickson et al. (1989, 1996).
6.6 Confounding Factors Were Not Considered
Parkhurst and associates (Parkhurst, 1995, 1996; Parkhurst
et al., 1990) suggested that several factors other than
ambient water toxicity could have affected biological
communities, but were not considered in the CETTP studies.
They contend that both natural (e.g., poor habitat, low oxy-
gen, nutrient enrichment, organic enrichment, natural sea-
sonal variations) and non-effluent, anthropogenic factors
could have been responsible for biological community
changes in the CETTP studies.
Discussion: While contending that confounding factors
were not considered, these authors also point out that dis-
charged effluents were the most probable cause of water
quality problems. If their confounding factors theory is cor-
rect one would expect a high percentage of "false
negatives" (toxicity test results predict no instream impact,
but impact measured) in the CETTP and associated
studies. However, "false negatives" were noted in only
6.3% of the 160 sites in the CETTP and associated
studies.
Irrespective of potential confounding factors, statistically
significant canonical correlations were seen between ambi-
ent water toxicity test results and biological community
responses (Dickson et al., 1992; Marcus and McDonald,
1992). The criticism of confounding factors appears to
disregard the CETTP and associated studies observations
which revealed impairments on a progressive gradient be-
low effluent discharge points (i.e., the greatest impairments
were at sites nearest the discharge point, decreasing with
distance from the discharge point). The argument regard-
ing confounding factors makes little biological sense given
that CETTP sites upstream of discharge sites generally
indicated "healthy" communities, whereas sites below discharge
points (which showed toxic effluents) tended to
suggest impairments.
The high frequency of accurate predictions in the Dickson
et al. (1992) classification system of instream biological
responses based on toxicity test results in the CETTP and
associated studies is rather surprising given that these
relationships were based on the results of single, or few,
toxicity tests with a single bioassessment index (which
tends to be temporally integrative, but which does not
incorporate natural variations).
6.7 Was the CETTP Classification System
Mathematically Biased?
Marcus and McDonald (1992) criticized the procedure
used in some of the CETTP studies for identifying correct
predictions of biological impairments based on toxicity
testing data.
Discussion: This criticism appears accurate. No consis-
tent method was used throughout the CETTP studies to
select correct and incorrect predictions. Based on these
CETTP comparisons, some studies concluded that the
degree of toxicity was related to the degree of instream
taxa reduction. The use of various
analyses appears to have been an attempt to convert a
qualitative relationship between toxicity test results and
instream biological responses to a quantitative one.
6.8 High Rate of False Positives
Parkhurst (1992) suggested that the rate of "false
positives" (toxicity test results predict instream impact, but
no impact observed) in the CETTP, South Elkhorn Creek
(Birge et al., 1989), and Trinity River (Dickson et al., 1989)
studies was 68%, and 23% in the North Carolina (Eagleson
et al., 1990) study.
Discussion: Using all available data, the actual rates of
"false positives" were 9.4% and 7%, respectively, in the
CETTP/Associated studies and the North Carolina study.
Parkhurst's values are based on only a portion of the data
collected in all of the studies, namely the sites identified as not
impacted. While there may be some value in the approach
presented by Parkhurst (1992), it certainly ignores a very
large portion of the data collected.
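The difference between the two sets of "false positive" rates comes down largely to the choice of denominator. The short Python sketch below illustrates the arithmetic only. The count of 15 false positives is back-calculated from the reported 9.4% of 160 sites. The number of non-impacted sites (22) is a hypothetical value chosen for illustration; it is not reported in this review.

```python
def rate_percent(count, denominator):
    """Return count/denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * count / denominator


TOTAL_SITES = 160       # sites in the CETTP and associated studies (from the text)
FALSE_POSITIVES = 15    # back-calculated: 9.4% of 160 sites is about 15
UNIMPACTED_SITES = 22   # HYPOTHETICAL count, used for illustration only

# Rate over every site studied (the approach taken in this review).
rate_all_sites = rate_percent(FALSE_POSITIVES, TOTAL_SITES)

# Rate over only the sites identified as not impacted (a restricted denominator).
rate_unimpacted_only = rate_percent(FALSE_POSITIVES, UNIMPACTED_SITES)

print(f"False positives over all sites: {rate_all_sites:.1f}%")
print(f"False positives over non-impacted sites only: {rate_unimpacted_only:.1f}%")
```

The same number of disagreements yields a much larger percentage when the denominator is restricted to a small subset of sites.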
6.9 Miscellaneous Criticisms
Some criticisms of the CETTP studies do not relate directly
to those investigations. These criticisms include:
* the size and assimilative capacity of the receiving
waterbody are not considered when evaluating WET test
results,
* the duration of exposure in aquatic ecosystems, relative
to test duration, is not considered in the evaluation of
WET test results, and
* actual effluent dilution and flow conditions are not
usually considered in the evaluation of USEPA toxicity
test results.
Discussion: Some of these criticisms have merit, yet they
are less concerned with the reliability with which
USEPA toxicity test results predict ecosystem responses
than with the possibility that the results of single (or few)
toxicity tests could be used as evidence of an effluent permit
violation (i.e., they represent potential implementation
problems). Certainly, such factors must be considered and
incorporated into risk assessments.
6.10 Conclusions
The CETTP studies suffered from some design and inter-
pretive problems. However, even critics of the CETTP and
associated studies tend to agree that there is a good quali-
tative relationship between USEPA toxicity test results and
aquatic ecosystem community responses. These critics
correctly assert that a quantitative relationship has not
been established. Although critical of the CETTP and
associated studies, Parkhurst et al. (1992) accept that
these studies demonstrate that, if adequate consideration
is given to effluent dilution, USEPA toxicity test results
should be reliable predictors of ecological impairments.
What appear to be lacking in the criticisms of the CETTP
studies are: 1) experimental data which indicate that single
species (USEPA toxicity test) results are more frequently
unreliable than reliable predictors of ecosystem
impacts, and 2) suggestions for effective alternatives to the
single species tests.
Recognizing that ecosystems are complex and
multivariate, with many interacting factors and that sample
sizes were rather small, it is not surprising that the CETTP
and associated studies did not establish a quantitative
relationship between USEPA toxicity test results and
biological community responses. However, the qualitative
association established was convincing enough to accept
the results as predictive of probable biological impacts. If
a series of ambient water or effluent water tests produces
statistically significant toxicity in the USEPA toxicity tests,
some degree of ecosystem impairment is likely. Since the
USEPA toxicity tests provide an early warning and are
predictive of probable aquatic ecosystem impairments, it
is not essential that they be highly quantitative predictors
of biological community impacts.
7.0 Single Species Tests with Effluent
Investigations in which effluents were tested with single
species toxicity tests and in which some ecological survey
data were collected from the receiving stream for compara-
tive purposes were reviewed. A summary of these reviews
is presented in Appendix A. Studies reviewed in this Appendix,
as well as in Appendices B and C, were located
through literature searches. All studies related to the topic
were reviewed; none were screened out. These studies
represent a special concern because of the criticism
related to the correspondence between single species
toxicity test results and ecosystem responses.
Appendix A summarizes 13 publications and the tabula-
tions presented below are by study (i.e., by the outcome
of the entire study, not by subcomponents within studies).
In nine (69%) of the 13 studies early life stage test
NOEC/LOECs from effluent tests provided reliable
qualitative predictions of instream impairments. In three
(23%) studies early life stage effluent test NOEC/LOECs
underestimated instream responses. Results from one
study were inconclusive, owing to study design and
interpretive inconsistencies. Based on effluent toxicity test
results no overestimations of instream impacts were noted
in these 13 studies.
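The by-study tabulation above can be reproduced with a few lines of Python. This is a minimal sketch that uses only the counts stated in the text; the function name and outcome labels are illustrative and do not come from the studies themselves.

```python
def tabulate_outcomes(outcome_counts):
    """Convert per-outcome study counts into whole-number percentages."""
    total = sum(outcome_counts.values())
    if total == 0:
        raise ValueError("no studies to tabulate")
    return {label: round(100.0 * n / total) for label, n in outcome_counts.items()}


# Counts of effluent studies by overall outcome, as stated in the text.
effluent_study_outcomes = {
    "reliable qualitative prediction": 9,
    "underestimated instream responses": 3,
    "inconclusive": 1,
    "overestimated instream impacts": 0,
}

effluent_percentages = tabulate_outcomes(effluent_study_outcomes)
for label, pct in effluent_percentages.items():
    print(f"{label}: {pct}%")
```

Because the tabulation is by study rather than by site or test, each study contributes one outcome regardless of how many sites it covered.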
The 13 studies summarized in Appendix A, as well as the
Eagleson et al. (1990) study discussed above, demon-
strate that single species toxicity test results on effluents
can provide reliable qualitative predictions of biological
community responses or tend to underestimate ecosystem
impairments.
8.0 Single Species Tests with Individual
Chemicals or Small Groups of Chemicals
Studies in which single species toxicity tests were used to
assess the toxicity of a single chemical or a small combina-
tion of chemicals and predict aquatic ecosystem biological
responses were evaluated and summarized in Appendix
B, which is subdivided into sections on pesticides, other
organic chemicals, metals and miscellaneous substances.
8.1 Organic Chemicals: Pesticides
Eighteen studies dealing with pesticides are summarized
in Appendix B. The most studied pesticide in this group of
investigations is the organophosphorus insecticide
chlorpyrifos (seven studies). In 14 (78%) of the 18 studies,
single species laboratory toxicity test results reliably pre-
dicted direct field adverse effect concentrations. In many
of the studies the single species laboratory tests failed to
predict the secondary (indirect) effects seen in the field
experiments, such that biological community effects were
underestimated by the laboratory single species toxicity
test results. In four of the studies reviewed in Appendix B
the laboratory single species toxicity test effect
concentrations overestimated the field effect concentration
(i.e., the laboratory single species data underestimated the
biological community responses). Although use of
daphnids in laboratory tests has been criticized by some
because they are indicator, rather than resident, species,
data in 12 of the 18 studies suggest that daphnids are
reliable predictors of a biological community response
(or tend to underestimate aquatic ecosystem impacts).
8.2 Organic Chemicals: Nonpesticides
Eleven investigations of organic chemicals were reviewed
and summarized in Appendix B. Laboratory single species
toxicity test results were reliable predictors of biological
community effect concentrations in seven (64%) of the
eleven studies. In most of these seven studies in which
laboratory effect concentrations were considered reliable
predictors, single species test results were somewhat
higher than the field effect concentrations (i.e., biological
communities were somewhat more sensitive to chemicals
than predicted by the laboratory tests). Laboratory toxicity
tests overestimated field effect concentrations in two (18%)
studies. Results of two studies were inconclusive or
mixed.
8.3 Metals
Ten studies dealing with metal toxicity are reviewed in Ap-
pendix B. Results of five (50%) of the ten studies suggest
that laboratory single species test effect concentrations are
reliable qualitative predictors of biological community effect
concentrations and responses. In four (40%) of the studies
laboratory single species effect concentrations were
notably higher than field effect concentrations (i.e., laboratory
single species tests underestimated aquatic ecosystem
impacts). One of the ten studies was inconclusive.
8.4 Other Data and Views of Predictiveness of
Single Species Test Results
Persoone and Janssen (1994) submitted that environmen-
tal factors may notably modulate toxicity (e.g., alter
bioavailability) as measured in laboratory tests. A majority
of the studies, with the exception of investigations on met-
als, summarized in Appendix B do not support that claim.
Speculations that laboratory toxicity test results estimate
effect concentrations (e.g., LOECs, NOECs) that are con-
siderably below instream effect concentrations have been
voiced, but most of the data reviewed herein fail to support
those conjectures.
La Point (1994) concludes that direct, but not secondary,
responses of fish in ecosystems can be predicted from
laboratory single species test results. Luoma (1995)
suggests that accurate predictions of metal impacts based
on single species test results are rare. Luoma (1995) also
wrote, "As toxicity tests are increasingly used in
contaminant management, reliance on insensitive
procedures dominated by type II error (false negative) will
lead to regulations that underprotect nature." Luoma
(1995) listed the uncertainties in single species tests which
result in underestimation of impacts due to metals on
biological communities. These sources of uncertainty
include:
* choice of species (sensitive and ecological
keystone species unrepresented),
* exposure time (underestimated),
* exposure route (rarely considered),
* multigenerational life cycle (unrepresented),
* higher-order secondary effects (rarely considered),
and
* interaction with natural disturbances (rarely
considered).
Table 2. Equations showing relationships between
laboratory (single species) and ecosystem
determined endpoints (data from Slooff et al.,
1986)
Using acute toxicity data the following equation was
derived:
log NOEC(ecosystem) = -0.55 + 0.81 log LC50(single species tests)
In this case, n = 54, r = 0.77, and the
uncertainty factor was 85.7.
Using chronic toxicity data the following equation was
derived:
log NOEC(ecosystem) = 0.63 + 0.85 log NOEC(single species tests)
In this case, n = 51, r = 0.85, and the
uncertainty factor was 33.5.
Margins of uncertainty in predicting toxicity from laboratory
single species tests to higher levels of biological organiza-
tion were determined by regression and correlation
analyses (Slooff et al., 1986). Analyses were performed
on log-transformed data. The 95% uncertainty factors
were determined as the minimum ratio of the estimated
toxicity value and its upper and lower 95% confidence
(prediction) limits.
The regression analysis consisted of regressing eco-
system-determined effect concentrations on laboratory
single species toxicity test effect concentrations. The
uncertainty factor was defined as the minimum ratio of the
estimated effect concentration and its 95% prediction limit.
So, the smaller the value of the uncertainty factor, the more
reliably would single species toxicity test results predict
biological community effect concentrations.
Using acute toxicity data for 34 chemicals, the
relationships shown in Table 2 were determined. Slooff et al.
(1986) concluded that data from laboratory single species
toxicity tests are reliable enough for ecological risk
assessments.
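As an illustration of how the Table 2 regressions might be applied, the following Python sketch computes an ecosystem NOEC estimate from a single species value. The coefficients are those printed in Table 2. Several points are assumptions made here and are not stated in the source: that the logarithms are base 10, that the input is in the same concentration units as the original regression, and that a rough range can be formed by dividing and multiplying the estimate by the uncertainty factor. The function name and the input values are hypothetical.

```python
import math

# Coefficients exactly as printed in Table 2 (data from Slooff et al., 1986).
ACUTE_MODEL = {"intercept": -0.55, "slope": 0.81, "uncertainty_factor": 85.7}
CHRONIC_MODEL = {"intercept": 0.63, "slope": 0.85, "uncertainty_factor": 33.5}


def predict_ecosystem_noec(single_species_value, model):
    """Estimate an ecosystem NOEC and a rough range from a single species value.

    single_species_value is an LC50 (acute model) or a NOEC (chronic model).
    Units are ASSUMED to match those of the original regression.
    Returns (estimate, lower, upper). The range divides and multiplies the
    estimate by the uncertainty factor, which is an ASSUMED reading of Table 2.
    """
    if single_species_value <= 0:
        raise ValueError("concentration must be positive to take a logarithm")
    # Base-10 logarithm is ASSUMED; the source says only "log".
    log_estimate = model["intercept"] + model["slope"] * math.log10(single_species_value)
    estimate = 10.0 ** log_estimate
    factor = model["uncertainty_factor"]
    return estimate, estimate / factor, estimate * factor


# HYPOTHETICAL inputs, for illustration only.
acute_estimate, acute_low, acute_high = predict_ecosystem_noec(100.0, ACUTE_MODEL)
chronic_estimate, chronic_low, chronic_high = predict_ecosystem_noec(10.0, CHRONIC_MODEL)

print(f"acute-based estimate: {acute_estimate:.2f} (range {acute_low:.3f} to {acute_high:.0f})")
print(f"chronic-based estimate: {chronic_estimate:.1f} (range {chronic_low:.2f} to {chronic_high:.0f})")
```

The chronic model carries the smaller uncertainty factor (33.5 versus 85.7), so under these assumptions it gives the narrower range around its estimate.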
The studies summarized in Appendix B suggest that labo-
ratory single species test results afford a reliable qualitative
prediction (are reliable for extrapolations) of aquatic
biological community responses or of environmental effect
concentrations. Tabulation of the 47 studies (tabulation is
by outcome of the entire study) reviewed in Appendix B
yields the results presented in Table 3.
Single species toxicity test results usually provide enough
information to take action. These tests can be used to
determine whether concentrations of chemicals in a water sample
are sufficient to affect biological functions. Subsequent
action can be taken to determine the chemicals causing
toxicity and/or the persistence and magnitude of the
toxicity in the effluent or the water body. Clearly, the
results of a single toxicity test should not be equated with
ecosystem impairment; a test result is not, de facto,
definitive proof of biological impairment.
Table 3. Summary of studies examining the
relationship between laboratory single
species test results and aquatic ecosystem
responses (Appendix B).
Outcome                                        Percentage of studies
Laboratory single species effect concentration          68%
provides reliable prediction of biological
community effect concentration and/or responses
Laboratory single species effect concentration          23%
> field effect concentration (single species
test underestimates biological community
responses)
Mixed or inconclusive results                            9%
9.0 Comparison of Single Species and
Multiple Species (Microcosm, Mesocosm)
Toxicity Test Results
Intuitively one might suspect that single species toxicity
test results would not predict biological community
responses as reliably as multiple species (this term is used
to include both micro- and mesocosm studies) test results.
Direct comparisons have not been frequent, but five
groups of authors (Slooff, 1985; Emans et al., 1993;
Okkerman et al., 1993; Persoone and Janssen, 1994;
Dorn, 1996) have published literature reviews which
address this issue.
9.1 Okkerman et al. (1993)
NOECs from single species and multiple
species tests were compared by Okkerman et al. (1993)
in an endeavor to gain insight into whether aquatic ecosystems
can be protected by setting a "safe" concentration
derived from single species toxicity test results.
To achieve this, Okkerman et al. (1993) performed an
extensive literature search to locate all available multiple
species studies. These studies were then put through
rigorous criteria to identify the multiple species studies
considered to be reliable. Some important criteria were
that a study had to include several taxonomic groups in
fairly realistic ecosystems, the concentration of the chemi-
cal had to be analytically verified, and a concentration-
response relationship had to occur. NOECs from the multi-
ple species studies were for direct effects only.
For those compounds where a multiple species NOEC was
considered reliable, the authors searched for single
species tests with an NOEC they considered reliable. Data
were sufficient and reliable enough to make the multiple
species and single species comparison for only ten
organic compounds, most of which were pesticides. When
more than one single species NOEC was available, the
comparison was made using the value for the most
sensitive species. The comparison was the ratio of the
multiple species and single species NOECs. The closer
the ratio was to one, the less divergent the single species
and multiple species NOECs.
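The comparison described above reduces to a simple ratio. The Python sketch below shows the calculation under the stated rule that the most sensitive (lowest) single species NOEC is used. The function name and all NOEC values are hypothetical and are not taken from Okkerman et al. (1993).

```python
def noec_ratio(multiple_species_noec, single_species_noecs):
    """Ratio of the multiple species NOEC to the most sensitive single species NOEC.

    A ratio near one means the two NOECs diverge little. A ratio above one
    means the single species NOEC is the lower (more protective) value.
    """
    if multiple_species_noec <= 0 or not single_species_noecs:
        raise ValueError("need a positive multiple species NOEC and at least one single species NOEC")
    if min(single_species_noecs) <= 0:
        raise ValueError("single species NOECs must be positive")
    # Most sensitive species = lowest NOEC among the single species tests.
    most_sensitive_noec = min(single_species_noecs)
    return multiple_species_noec / most_sensitive_noec


# HYPOTHETICAL NOECs (same concentration units), for illustration only.
example_ratio = noec_ratio(10.0, [4.0, 25.0, 60.0])
print(f"multiple species / single species NOEC ratio: {example_ratio}")
```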
For all ten chemicals, the ratio was five or less; for six of
the compounds, the ratio was approximately one or less
than one; for the remaining four chemicals the ratio ranged
from 2.5 to 5. These investigators concluded that despite
the general concept that effects assessments should be
conducted in actual aquatic ecosystems or multiple
species tests, in only a few cases did NOECs differ greatly
between single species and multiple species tests. They
also surmised that with some caution, due primarily to a
paucity of data, single species toxicity test data are a good
starting point for establishing "safe" concentrations for
aquatic ecosystems.
9.2 Emans et al. (1993)
The accuracy of extrapolating from single species toxicity
test results to aquatic ecosystem communities also was
examined by Emans et al. (1993). Their approach was to
compare NOECs derived from multiple species field
studies with those from single species toxicity tests. If field
multiple species toxicity test results were more reliable
predictors of how biological communities would respond
to a chemical(s) than single species test results, then one
would suspect that "safe" concentrations generated from
these two different procedures would differ appreciably.
After an extensive literature search, acceptable data for the
comparison of single species and multiple species tests
were identified for 29 chemicals. Based on statistical anal-
yses, the authors concluded that "there seems to be no
reason to believe that organisms differ in sensitivity under
field and laboratory conditions." Moreover, when species
tested in the multiple species experiments were compared
with similar or related species in single species studies
(given corresponding response parameters and equivalent
exposure concentrations) their response/sensitivity to a
given chemical appeared essentially equivalent. Results
of this inquiry suggest that single species toxicity test
results are reliable predictors of biological community
responses. With the caution that there are limited data,
these authors conclude that it is acceptable to derive "safe"
concentrations from single species toxicity test data.
9.3 Slooff (1985)
Slooff (1985) reached the same conclusion as Okkerman
et al. (1993) and Emans et al. (1993) regarding the equiv-
alency of effect concentrations from single species and
multiple species toxicity tests after reviewing the literature
studies.
9.4 Persoone and Janssen (1994)
The potential of laboratory single species test results to
reliably predict biological community responses was examined
in an extensive four-year interlaboratory study with
four chemicals (copper, atrazine, lindane, and
dichloroaniline) by Persoone and Janssen (1994). NOECs
from outdoor stream and pond microcosms were
compared with those from single species laboratory tests.
The NOECs from the field studies were within one order
of magnitude of the NOECs of the most sensitive
laboratory test, suggesting that the single species tests are
effective qualitative predictors of ecosystem effect
concentrations.
In their review of the literature on field "validation" of
predictions based on single species toxicity test data
Persoone and Janssen wrote, "One of the most striking
conclusions of this literature study is that, in general,
NOECs derived from (a selected battery of) single species
laboratory tests relate relatively well to single species and
multiple species NOECs obtained in field studies." Such
studies are not truly "validation", but they do argue that
abiotic and biotic factors in aquatic ecosystems do not
greatly modify effect concentrations or bioavailability of
chemicals compared to laboratory tests.
9.5 Phluger (1994)
NOECs from multiple species field and single species labo-
ratory tests for ten pesticides were compared by Phluger
(cited in Persoone and Janssen, 1994). For all ten pesti-
cides there was less than one order of magnitude differ-
ence between the single species laboratory and multiple
species field test NOECs, suggesting that the single spe-
cies test results were reliable qualitative predictors of bio-
logical community responses.
9.6 Dorn (1996)
In three separate stream mesocosm experiments, testing
a homologous series of nonionic alcohol ethoxylate surfactants,
Dorn (1996) found that laboratory single species
toxicity test effect concentrations were within a factor of
three of mesocosm effect concentrations. In summarizing
a review of the literature Dorn (1995) concluded that
effects observed in numerous mesocosm studies are
consistent with laboratory single species toxicity test
results when exposures are reconciled correctly.
9.7 Crane (1995)
In a Society of Environmental Toxicology and Chemistry (SETAC) News
article, Crane (1995) states, "Such tests (mesocosms)
cost several million dollars to perform, but the results
obtained from them have shown no greater sensitivity or
predictive power and certainly no greater interpretability,
than considerably cheaper laboratory tests with single
species". Crane refers to the reviews of Okkerman et al.
(1993) and Emans et al. (1993) as support for his position.
10.0 Alternatives to Single Indicator Species
Tests
If the desire is to continue with testing which can provide
an early warning of probable aquatic biological community
impairments while having good qualitative reliability in pre-
dicting ecosystem responses, possible options to the exist-
ing USEPA toxicity tests and other single species proce-
dures include: 1) single indigenous species tests and 2)
multiple indigenous species tests. Desirable test and end-
point characteristics for reliably predicting instream
biological community responses have been listed (16
items) by Cairns and Niederlehner (1995). No current
single species test can meet these criteria.
10.1 Tests with Single Indigenous Species
A common criticism of the indicator species tests is that the
species does not occur in a particular waterbody. The
argument is that the indicator species test should be re-
placed with an indigenous species test. From a biological
perspective, the use of an indigenous species is sound, but
care must be taken in the selection of a replacement
species from the same phyletic group. Selecting an indige-
nous species from an impaired or partially impaired
waterbody could be a mistake since that species would
represent a species likely to have developed tolerance to
chemical stressors. In the case of impaired systems,
selecting a species that could live in such a habitat, or one that
previously did (from historical data), may be necessary.
While single indigenous species tests may decrease the
uncertainty associated with extrapolating from single
species test results to biological community responses, we
need evidence that such tests will significantly increase the
accuracy of predicting instream impairments. Such
indigenous species tests may not significantly improve
predictive accuracy enough to justify the time, effort, and
cost of developing standard (with the essential QA/QC)
protocols with indigenous species for multiple watersheds.
Is it desirable, feasible, and cost effective to develop proto-
cols for indigenous species in each watershed or subre-
gions of watersheds? Furthermore, there is little evidence
that indigenous species test results are more reliable pre-
dictors of biological community responses in complex and
multivariate ecosystems.
Currently available single species tests can effectively re-
veal when there are significant levels of toxic chemicals in
an effluent or ambient water sample. The statistical proba-
bility that any one test species represents the most or the
least sensitive species, life stage, or endpoint in a given
ecosystem is very low. Persoone and Gillett (1990) con-
clude that single indicator species toxicity tests do not
represent the most sensitive species or endpoints, and
especially key components, in aquatic ecosystems. In fact,
one could argue that the single species tests (especially
the USEPA toxicity tests) have been effective predictors
of ecosystem responses because they manifest relatively
average sensitivities compared to most aquatic ecosystem
species and endpoints. Probability theory also advises us
that there can never be enough predictive potential in the
results of a single species toxicity test to encompass all
possible effects on ecosystem structure and function.
Luoma and Carter (1993) conclude that single species
toxicity test results, when combined with chemical
measurements and benthic community surveys, have shown
reliable qualitative relationships between toxicity test
responses, chemical concentrations, and changes in biological
community structure. Slooff and Canton (1983)
asserted that the sensitivities in tests with three indicator species
(alga, daphnid, fish) effectively represented aquatic
organism sensitivity ranges for approximately 75% of the
chemicals they tested. Many of the pitfalls of developing
new single species tests are discussed by Luoma (1995).
Luoma is not convinced that increasing the number of
standardized single species toxicity tests will improve the
accuracy of predicting ecosystem impacts. In reviewing
the validity of using indicator, rather than resident species
toxicity tests, Dorn (1996) suggested that use of "new"
tests with resident species may not give us better
resolution of aquatic ecosystem responses than do the
well-developed indicator species tests. Rather than
develop a host of new single species tests, Dorn (1996)
advises that a better use of resources would be to assure
that laboratory and field exposure regimes are comparable
(i.e., improve exposure assessments).
Chapman (1995a) concludes that the current standardized
single species test protocols do not represent the more
sensitive ecosystem endpoints; he also notes that
daphnids and fathead minnows are usually not the most
sensitive components in aquatic ecosystems. Several
other authors (Persoone et al., 1990; Baird, 1992; Forbes
and Depledge, 1992; Clements and Kiffney, 1996; LaPoint
et al., 1996) also proposed that the USEPA toxicity tests
and other single species tests most frequently
underestimate effects in aquatic ecosystems. Examination
of USEPA's chemical-specific water quality criteria docu-
ments illustrates that the USEPA toxicity test species
(USEPA, 1994a,b) are not consistently among the most
sensitive species tested.
In combination with Toxicity Identification Evaluations
(TIEs) and chemical analyses, the current set of single
species toxicity tests appears to be effective in the identification
of toxicity, as well as its sources and causes.
Therefore, available funds and efforts could be focused on
improving these procedures rather than on developing a
host of new indigenous species testing procedures.
Persoone et al. (1990) assert that despite limitations of
indicator single species tests, they have been extremely
useful and reliable predictors of ecosystem responses.
Cairns and Mount (1990) conclude that developing toxicity
test methodologies with "new" aquatic organisms is
probably not productive unless the response of this new
species has a high correspondence with responses of
many other aquatic species. Cairns and Mount state, "For
regulatory purposes, it is unquestionably sound to use test
organisms that have been widely used for toxicity testing
and whose strengths and weaknesses for this purpose are
well known." There is little or no evidence that use of
indigenous species in single species laboratory toxicity
tests will improve our ability to predict responses in the
field.
10.2 Tests With Multiple Indigenous Species
Several researchers have argued for the use of multiple
species (micro/mesocosm) tests rather than single
species tests in regulatory settings. The literature on
multiple species toxicity test strengths and limitations will
not be summarized here. Suffice it to say that the limitations
of these tests seem to be equivalent to or greater than
those of single species toxicity tests (Dickson et al., 1985;
Mount, 1985; Slooff, 1985; Cairns et al., 1993; Cairns and
Smith, 1994; Dickson, 1995; LaPoint, 1995; Smith, 1995).
Generally, the designs of multiple species tests are highly
variable with no standardized protocols or endpoints.
Multiple species tests are predisposed to be ecosystem
specific, which is a strength as well as a weakness.
Results of multiple species tests tend to be more variable
than those from single species toxicity tests. Factors that
increase complexity of toxicity tests may boost ecological
relevance, but result in greater variability, as well as less
reliability and repeatability.
Although there is controversy on this issue, multiple
species tests have not been found to be more sensitive
than single species tests. Information is increasing, but
"validation" of these multiple species test results with
ecosystem bioassessments has not been frequent.
Neuhold (1986) postulates that microcosm tests present
interpretation problems and are not likely to offer reliable
predictions of natural ecosystem impacts. Responses in
control systems are difficult to replicate and responses in
treatment groups tend to diverge greatly (Gearing, 1989).
Several groups of investigators (Cairns, 1983; Giesy and
Allred, 1985; Slooff et al., 1986; Luoma and Ho, 1993)
conclude that mesocosms are better suited to testing
process questions than to replicating nature. In regard to
multiple species tests Bailey (1995) concluded that "I am
not convinced that complex (i.e., multiple species) tests
accompanied by simple models offer a reduction in
(predictive) uncertainty over simple tests accompanied by
complex models." Slooff (1985) argued that there is no
evidence that multiple species tests are more reliable in
predicting instream impacts than are results of single
species toxicity tests.
Although multiple species tests may have greater predic-
tive capacity than single species test results, they have
limitations which include:
+ There are no standardized protocols and developing
multiple species testing procedures will be costly. End-
points have not been agreed upon. Designs, endpoints
measured, and statistical analyses of multiple species
tests vary widely, resulting in considerable debate re-
garding the interpretation of every study.
+ Multiple species protocols may have to be altered or
developed for each aquatic ecosystem.
+ Most multiple species tests are not designed to be early
warning signals.
+ Multiple species tests tend to have high within and be-
tween test variability, and especially high between
laboratory variability (more than for the single species
tests).
+ Predictions of ecosystem responses based on multiple
species test results will likely be qualitative rather than
quantitative.
According to Giesy and Allred (1985), variability increases
with the size and complexity of multiple species study de-
sign. These authors contend that replicability (ability to
establish more than one experimental unit within a particu-
lar experimental treatment) of multiple species tests is gen-
erally sufficient, but that the realism and accuracy in these
tests is largely unresolved. Further, Giesy and Allred
(1985) claim that repeatability (duplicating results of a test
at a later time) of multiple species tests has seldom been
examined.
The intent is not to discredit the importance of multiple
species tests. The point is that multiple species tests may
not be more reliable screening or predictive tools than are
single species tests. Multiple species tests are important
for providing fundamental information on structure and
function of aquatic ecosystems and for potentially following
up on single species toxicity test data.
11.0 Studies in Ocean or Estuarine Settings
Although the relationship between ocean water toxicity and
water column biological community health has not been
examined to any great extent, the link between laboratory
marine sediment toxicity and biological community
response has been studied. This review does not include
an exhaustive examination of the literature dealing with
marine sediment toxicity. However, several studies (see
Appendix C) suggest that the results of sediment toxicity
tests are fairly reliable qualitative predictors of benthic
community responses. In a review of the literature on
laboratory sediment toxicity testing with single species,
Lamberson et al. (1992) concluded that, despite realized
and potential problems, the test results have proven
"enormously successful as both research and
management tools." As Luoma and Ho (1993) conclude
in a review of the literature on sediment toxicity tests, it is
inappropriate to use data from single species tests alone
to quantitatively predict specific aquatic ecosystem
impacts.
Appendix C includes summaries of ten studies in which
laboratory single species toxicity test results on marine
sediment samples were evaluated in terms of predicting
benthic biological community responses. In all ten of these
studies, the laboratory sediment tests were reliable qualita-
tive predictors of benthic community effects; the laboratory
tests tended to underestimate the extent of benthic
community impacts.
Richardson and Martin (1994) critiqued the strengths and
shortfalls of using ocean and estuarine toxicity testing as
a procedure for evaluating potential water quality impacts.
While constraints, including laboratory-to-field verification,
of single species toxicity testing are thoroughly discussed,
these authors strongly advocate toxicity water
quality standards and toxicity monitoring, similar to that
outlined in the California Ocean Plan (State Water
Resources Control Board, 1990), on a world-wide basis.
Section 2
1.0 Conclusions
Criticisms of single species tests, including the USEPA
toxicity tests, have included excessive between test and
between laboratory variability, as well as questions
regarding the ecological relevance of test results. While
some accept that contaminants are responsible for bio-
logical impacts based on an inverse correlation between
chemical concentrations and biological indices, there has
been some reluctance to make similar, parallel interpreta-
tions when ambient water or wastewater toxicity and
biological indices are inversely correlated. Determination
of contaminant concentrations per se provides no infor-
mation on the bioavailability of these compounds to
resident biota.
The information presented in this review offers compelling
evidence that the USEPA toxicity tests and other single
species toxicity test results are, in a majority of cases,
reliable qualitative predictors of aquatic ecosystem
community responses. However, this qualitative
relationship must be based on a series of test results
(persistent toxicity) not on a single test result. Participants
at a 1996 Pellston conference on toxicity testing (Grothe
et al., 1996; Waller et al., 1996) concluded that the USEPA
toxicity tests provide "an effective tool for predicting
receiving system impacts when appropriate considerations
of exposure are considered. Further, laboratory to field
validation is not essential for the continued use of these
toxicity tests." According to that reference, the participants
felt "It is unmistakable and clear that WET procedures,
when used properly and for the intended purpose, are
reliable predictors of environmental impacts."
Ideally, laboratory toxicity tests should provide ecologically
relevant, reliable, and repeatable data. In practice, how-
ever, incorporating these desirable characteristics into
laboratory toxicity tests has been difficult. The single spe-
cies toxicity tests are successful (compared to multiple
species tests) in providing reliable and repeatable data, but
at some expense to ecological relevance (e.g., Calow,
1992). To assess effects of contaminants on aquatic
biological communities there is a need for integrated,
weight-of-evidence approaches. Especially at moderately
polluted sites, a multiplicity of testing methods is helpful in
estimating and evaluating biological community responses.
The optimal approach may be to integrate ecological
surveys, toxicity tests, and chemical analyses to better
understand contaminant effects on the health of aquatic
ecosystems. The principal hindrance to this approach has been
the complexity and costs of such combined, extensive
efforts.
While there is some merit to the criticisms of the CETTP
studies, they are not persuasive enough to doubt the effec-
tiveness of the USEPA toxicity tests as qualitative forecast-
ers of biological community responses. There are no
empirical data which demonstrate that these tests fail to
render reliable extrapolations to instream biological
responses. Thus, when used appropriately as early warn-
ing screening procedures, these tests provide a powerful
monitoring tool. When test results fail as reliable qualita-
tive predictors of instream impairments, they have more
frequently underestimated aquatic ecosystem impacts. The
idea that biotic and abiotic factors in the environment
significantly decrease bioavailability and toxicity was not
supported by a majority of the studies reviewed.
Data from the 7-d Ceriodaphnia test, in particular, have
been very reliable predictors of instream biological
responses. This may be due, in part, to the large database
for this species. On the other hand, Slooff and Canton
(1983) contend that single species tests with daphnids
have been extremely efficient in the identification of chemi-
cal concentrations harmful to aquatic ecosystems. Sum-
marized in this document are some 49 studies in which a
cladoceran (Ceriodaphnia or Daphnia) was utilized as the
laboratory test species. These tests were performed at
many locations across the country with a wide variety of
ambient water types (in most of which these cladocerans
were not resident), effluents, and chemicals (including pes-
ticides, other organic chemicals, metals, and inorganic
chemicals). Results from these laboratory cladoceran
tests were reliable predictors of aquatic ecosystem
biological community responses or adverse effect
concentrations in 33 (67%) of the 49 studies (cf., Figure 3).
The laboratory cladoceran tests underestimated biological
community responses (or overestimated ecosystem
adverse effect concentrations of a chemical) in 16 (33%)
of the investigations. There were no studies in which the
cladoceran tests overestimated impairments to biological
communities.
Defending single species toxicity tests beyond their
capabilities is to no one's advantage. Single species test
results alone are not reliable quantitative forecasters of
toxic chemical impacts on complex ecosystems. The sim-
plicity of the single species tests comes at a cost of inter-
pretation and predictive depth. The test protocols can
always be improved so that we are more confident of their
meaning and so that their results are more reliable
predictors of instream impacts. A better understanding of
the limitations of extrapolations is needed so that mod-
ifications can be made which increase reliability and
decrease uncertainty, as well as establish a stronger
theoretical basis for extrapolations. A list of limitations and
strengths of single species tests is provided in Appendix D.
Establishing a quantitative correlation between a biological
response from a single grab or composite water sample
and the biological community responses is not only
impossible, but unnecessary. However, with a good temporal
representation of ambient or effluent toxicity and with
carefully designed/performed seasonal bioassessment data
from streams, statistically significant correlations between
data sets have been possible. However, thorough ecologi-
cal surveys in large rivers and bays will be difficult and
expensive; there is likely to be considerable controversy
regarding what sites, if any, should serve as reference
("clean") sites and, if there are not reference sites, what
represents a "healthy" aquatic ecosystem. Many rivers
and bays in the U.S. are significantly altered by human
activities; attempting to attribute degradation in these
systems to toxic chemicals with the use of bioassessments
will be very difficult. Even though there cannot be direct
proof that toxic chemicals are a cause of declining aquatic
organism populations, it is not advisable to forsake the
qualitatively predictive early warning tools.
The reliability with which single species toxicity test results
predict biological community responses relates to several
factors. One major factor was addressed by Dickson et al.
(1992); they observed that when effluent or ambient water
toxicity is relatively low or when impacts on aquatic
ecosystems are moderate it will be difficult to establish a
relationship between toxicity and instream ecological
responses. The strength of the predictive capacity of
single species test results is substantially enhanced when
the test is performed with ambient water (e.g., as
compared to effluent) and with higher magnitude toxicity
in the sample. Chapman et al. (1987) came to a similar
conclusion regarding magnitude of toxicity in relation to
sediment tests. We appear to be approaching consensus
that when significant lethality (and in the case of effluents,
assuming accurate dilution has been considered) is seen
in toxicity tests there is a very high potential of aquatic
ecosystem impairment. As this connection is accepted, we
continue to struggle with the idea that sublethal effects on
indicator species can result in detectable adverse
ecosystem responses.
[Figure 3: pie chart]
67%: Laboratory toxicity tests provide reliable prediction of biological community responses and/or
aquatic ecosystem adverse effect concentration (33/49)
33%: Laboratory toxicity tests underestimate biological community responses and/or overestimate
aquatic ecosystem adverse effect concentrations (16/49)
Figure 3. Summary of studies in which a cladoceran was used as a laboratory test organism when comparing toxicity
test results to ecological survey data and/or field effect concentrations. Percentages represent the outcomes of the
studies. Total number of studies is 49.
Possibilities for decreasing the extrapolation uncertainties
and improving the predictiveness of single species test
results include: 1) more thorough characterization
(persistence, frequency, magnitude) of ambient water or effluent
toxicity, 2) more effective matching of (or accounting for)
exposure patterns in natural ecosystems as compared to
laboratory tests, 3) a more thorough comprehension
of what constitutes critical aquatic ecosystem
endpoints, 4) improved simulation, or consideration, of
ecosystem characteristics and processes in laboratory
tests (a corollary to this point would be to avoid defaulting
to worst case scenarios in all cases), 5) more thorough
knowledge of environmental fate and bioavailability of
chemicals, 6) development of models which map the quantitative
and qualitative relation between single species test
endpoints and important ecosystem endpoints (this would
include focusing on the relative sensitivities of surrogate
species compared to key ecosystem endpoints), 7) a focus
on, or development of, tests with endpoints which have a clear
connection to important ecosystem structures/functions, 8)
enhanced intertest repeatability of single species tests,
9) improved understanding of how toxicity is manifested in
complex ecosystems, and 10) development of field and laboratory
approaches which are complementary.
A convincing relationship has been established between
ambient water toxicity (as manifested by single species
tests) and biological community responses, but has such
connection been authenticated between effluent toxicity
and instream impairments? The effluent-biological com-
munity link has not been as thoroughly investigated.
Nonetheless, in several recent studies (see Section 1.0,
subsection 7.0), as well as the CETTP and associated
studies, where effluent toxicity was assessed, a reliable
qualitative estimate of instream biological effects was
obtained. This relationship was most evident when flow
and dilution of the receiving water were effectively
estimated and when environmental exposure duration was
matched (or accounted for) by laboratory toxicity test
duration.
Recently, Sprague (1995) observed that single species
toxicity test results give us answers that support action,
and the important response is to take action rather than
wait for a 98% certainty. Because these single species
toxicity test results are, in a large majority of cases, reliable
qualitative predictors of biological community responses,
the controversy surrounding these tests can diminish if the
data from these tests are used appropriately. Moreover,
if the results of a single test are not characterized as a
violation of an effluent limit or a water quality standard, but
rather as a gauge of relative toxicity and, therefore, a
signal to initiate repeated or more frequent
sampling/testing (or TIEs) to better characterize potential
effluent or ambient water toxicity, regulated entities may be
less critical of the single species tests. Prior to making
predictions regarding biological community impacts,
ambient water or effluent toxicity must be characterized so
there can be more certainty regarding the nature of the
toxicity. Furthermore, it is difficult to control toxicity until its
nature, cause, and source are known. If the results of
single species tests are used to signal the potential for
instream impairments, then the toxicity test (USEPA,
1994a,b) results do not have to be quantitative predictors,
but rather effective qualitative predictors. These tests do
provide a reliable and repeatable qualitative predictive
capability. Mount (1995) stated, "It is the application of the
toxicity data, not its inherent validity, that is questionable."
In harmony with Mount, Luoma and Carter (1993) conclude
that it is the "interpretation and application of results" from
single species tests that are controversial.
Critics of the single species tests fail to recognize that
these tests, even with nonindigenous species, reveal that
water samples contain biologically significant concentra-
tions of toxic chemicals. Results of laboratory single spe-
cies tests are based on the toxicological principle of
concentration-response. This principle is fundamental and
well established. The effectiveness of the single species
tests in predicting biological community responses is cen-
tered in this principle of concentration-response.
If the single species tests continue to be used as early
warning, screening tools (identification of a potential prob-
lem which is further investigated), there is less necessity
for developing new standardized indigenous species
testing protocols. The probability that any particular test
species, whether or not it is indigenous to the particular
aquatic ecosystem, represents the most or the least
sensitive group of species or endpoints in a specific
biological assemblage is very low. More likely, the sensi-
tivity of the test species would fall somewhere within the
sensitivity distribution of organisms from an aquatic eco-
system. This is perhaps one of the reasons why the short-
term toxicity tests (USEPA, 1994a,b) and other single
species test results have been effective qualitative
predictors of ecosystem biological responses. Developing
testing protocols with indigenous species in many different
aquatic ecosystems may improve accuracy of predictions;
however, such efforts will be expensive and difficult
undertakings (e.g., Persoone and Gillett, 1990; Luoma,
1995). Furthermore, there is little evidence, and no
guarantee, that the reliability of environmental impact
predictions will be significantly enhanced with indigenous
species tests.
Slooff (1985), as well as Persoone and Janssen (1994)
discuss the wide range of sensitivities of aquatic
organisms, life stages, and endpoints to toxic chemicals.
These authors also observe that enlarging the suite of test
species, life stages, and endpoints almost always results
in lowering environmental effect concentrations (i.e.,
LOECs/NOECs decrease with increasing number of test
species, life stages, and endpoints; that is, with increasing
amounts of toxicity data). Several authors (e.g., Slooff
and Canton, 1983; Persoone and Gillett, 1990; Persoone
et al., 1990; Chapman, 1995b; Luoma, 1995; Underwood,
1995; Dorn, 1996) maintain that indicator species are not
likely to represent the most sensitive aquatic ecosystem
response, but rather have been selected for robustness,
ease of culture and availability. Bartell et al. (1992)
propose that rare and sensitive species, by definition, are
seldom selected for routine toxicity testing. Several other
authors (Persoone et al., 1990; Baird, 1992; Forbes and
Depledge, 1992; Clements and Kiffney, 1996; LaPoint et
al., 1996) also proposed that the USEPA's toxicity tests for
effluents and receiving waters (USEPA, 1994a,b) and
other single species toxicity tests most frequently
underestimate effects in aquatic ecosystems.
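The observation that effect concentrations fall as the test suite grows follows from simple arithmetic: the lowest NOEC across a suite is a running minimum, so adding a species can only hold it steady or lower it. The sketch below illustrates this with invented values; the NOEC list and the function name are hypothetical and are not data from any study cited here.

```python
from itertools import accumulate

# Hypothetical NOECs (ug/L) for one chemical, in the order species were
# added to a test suite. Invented for illustration only, not measured data.
HYPOTHETICAL_NOECS = [120.0, 85.0, 200.0, 40.0, 95.0, 12.0, 60.0]


def lowest_noec_by_suite_size(noecs):
    """Return the lowest NOEC seen after testing the first 1, 2, ..., n species."""
    return list(accumulate(noecs, min))


running_min = lowest_noec_by_suite_size(HYPOTHETICAL_NOECS)
for size, value in enumerate(running_min, start=1):
    print(f"{size} species tested: lowest NOEC = {value} ug/L")
```

The running minimum never rises, which is why a small suite of robust indicator species is more likely to overestimate than to underestimate the protective concentration.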
Because single species test results are reliable qualitative
predictors of biological community responses, the burden
of proof in demonstrating that persistent toxicity is not im-
pacting biological communities perhaps should rest with
the entity(ies) responsible for the contaminants. Moreover,
at some stage it should become incumbent on the entity
responsible for the probable environmental impacts to
demonstrate the absence of ecosystem impairments. As
stated by Luoma (1995), "The toxicity tests tool may never
achieve the high probability prediction capabilities
'required' by ardent critics; however, this does not prevent
the approach from being a useful tool in the developing
arsenal available to study the effects of contaminants".
2.0 Summary
Regulatory agencies have tended to rely on single species
toxicity tests, particularly USEPA's toxicity tests, on surface
or effluent water samples to identify potential chemical
toxicity threats to aquatic biological communities.
Questions regarding the reliability of these laboratory test
results in predicting impairments to biological communities
have been advanced. Of particular concern are
uncertainties of extrapolating from the outcomes of these
highly controlled laboratory tests to complex and
multivariate ecosystems. This document is an interpretive
review of the literature on this question of ecological
relevance of single species toxicity test results; it includes,
but is not restricted to USEPA's Complex Effluent Toxicity
Testing Program (CETTP; conducted for the purpose of
examining the predictive correspondence between the
short-term toxicity tests (USEPA, 1994a,b) and instream
impacts).
Aquatic ecosystem surveys typically have been used to
assess the reliability of single species toxicity test results
extrapolations. Potential limitations to the use of these
bioassessments for "validating" predictions extrapolated
from single species tests are discussed; caution is urged
in the interpretation of these surveys. Strengths and
limitations of the CETTP and associated studies, as well as of
two recent statistical analyses of those studies, are
evaluated.
Approximately 80 studies in which single species tests
were used to assess ambient water or effluent toxicity and
in which some ecological survey data were gathered, for
the purpose of exploring the correspondence between tox-
icity data and biological community responses, are critically
evaluated. A preponderance of evidence reveals that
USEPA's toxicity tests (USEPA, 1994a,b) and other single
species test results are, in a majority of cases, reliable
qualitative (some level of response seen) predictors of
aquatic ecosystem community effects. In this document 77
independent studies in which the results of laboratory indi-
cator single species toxicity tests are assessed with regard
to reliability in predicting aquatic ecosystem biological com-
munity responses (and/or adverse effect concentrations)
are summarized. In 57 (74%) of the studies the indicator
single species tests provided reliable qualitative
predictions of biological community impacts or adverse
effect concentrations (cf., Figure 4). The laboratory single
species tests underestimated aquatic ecosystem effects
(and/or overestimated the biological community adverse
effect concentration of a chemical) in 16 (21%) of the 77
studies. Results of four (5%) of the studies were
inconclusive or mixed. There are no experimental data
which demonstrate that the single species tests generally
fail to render reliable qualitative extrapolations to biological
community responses.
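The study tallies reported in this review can be checked with a few lines of arithmetic. The sketch below takes the counts from the text (33 and 16 of 49 cladoceran studies; 57, 16, and 4 of 77 studies overall) and recomputes the rounded percentages; the dictionary labels and the function name are illustrative choices, not terms from the report.

```python
# Study counts as reported in the text; the labels are illustrative.
TALLIES = {
    "cladoceran tests (Figure 3)": {
        "reliable": 33,
        "underestimated": 16,
    },
    "all single species tests (Figure 4)": {
        "reliable": 57,
        "underestimated": 16,
        "mixed or inconclusive": 4,
    },
}


def percentages(counts):
    """Return each outcome's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {outcome: round(100 * n / total) for outcome, n in counts.items()}


for label, counts in TALLIES.items():
    total = sum(counts.values())
    shares = percentages(counts)
    print(f"{label}: n = {total}")
    for outcome, pct in shares.items():
        print(f"  {outcome}: {counts[outcome]}/{total} = {pct}%")
```

In both tallies the counts sum to the stated number of studies and the rounded shares sum to 100%.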
While criticisms of the USEPA toxicity tests (USEPA,
1994a,b) and other single species tests have some merit,
they are not persuasive enough to cast doubt on the effec-
tiveness of these tests in predicting ecosystem impacts.
When used appropriately as early warning signals and with
dependable temporal representation of ambient water or
effluent toxicity, these tests provide a powerful monitoring
tool. When single species tests fail as reliable qualitative
predictors, they most frequently underestimate impacts to
the ecosystem community. Single species test results
alone are not reliable quantitative forecasters of toxic
chemical impacts on complex ecosystems.
The predictive power of single species tests is substan-
tially enhanced when ambient water, as compared to
discharge, is tested and when higher magnitude toxicity
exists; reliability is also improved when exposure patterns
in natural ecosystems are matched or accounted for and,
in the case of effluents, when realistic estimates of dilution
are taken into account.
Alternatives to indicator species tests are explored. There
is a paucity of evidence that the current standardized toxic-
ity testing protocol (including the USEPA toxicity tests;
USEPA, 1994a,b) test species are more sensitive to toxic
chemicals than resident species. If the single species tests
continue to be used as early warning signals there is less
necessity for developing new standardized indigenous
species testing protocols. The wisdom of developing a
host of new standardized tests with indigenous species,
unless they will substantially improve accuracy of
predicting ecosystem impacts, is questionable.
[Figure 4: pie chart]
74%: Laboratory single species toxicity tests provide reliable prediction of biological community responses
and/or aquatic ecosystem adverse effect concentration [57/77]
21%: Laboratory single species toxicity tests underestimate biological community responses and/or
overestimate aquatic ecosystem adverse effect concentration (i.e., adverse effects occur at lower
concentration than predicted in laboratory test) [16/77]
5%: Laboratory single species toxicity test yielded mixed or inconclusive results [4/77]
Figure 4. Summary of studies reviewed in this report in which the results of laboratory single species toxicity tests were
compared to biological community surveys and/or field effect concentrations. Tabulation is by overall outcome of the
study. Total number of studies summarized is 77.
Section 3
1.0 References
Review articles or publications relating to the ecological
relevance of single species toxicity tests are noted by ***.
References are inclusive for appendices.
Adams, W.J., R.A. Kimerle, B.B. Heidolph, and P.R.
Michael. 1983. Field Comparison of Laboratory-derived
Acute and Chronic Toxicity Data. pp. 367-385. In: W.E.
Bishop, R.D. Cardwell, and B.B. Heidolph, eds., Aquatic
Toxicology and Hazard Assessment, ASTM STP 802.
American Society for Testing and Materials,
Philadelphia, PA.
Bailey, H.C. 1995. Letter to The Editor. Human Ecol. Risk
Assess. 1:459-463.
Bailey, H.C., C. Alexander, C. DiGiorgio, M. Miller, S.I.
Doroshov, D.E. Hinton. 1994. The Effects of Agricultural
Discharges on Striped Bass (Morone saxatilis) in
California's Sacramento-San Joaquin Drainage.
Ecotoxicology 3:123-142.
Baird, D.J. 1992. Predicting Population Response to
Pollutants: in Praise of Clones. A Comment on Forbes
& Depledge. Funct. Ecol. 6:616-617.
Barbour, M.T., J.M. Diamond, and C.O. Yoder. 1996.
Biological Assessment Strategies: Applications and
Limitations, pp. 245-270. In: D.R. Grothe, K.L. Dickson,
and K. Reed-Judkins (eds), Whole Effluent Toxicity
Testing: An Evaluation of Methods and Prediction of
Receiving System Impacts. SETAC Press, Pensacola,
FL.
Bartell, S.M., R.H. Gardner, and R.V. O'Neill. 1992.
Ecological Risk Estimations. Lewis Publishers, Boca
Raton, FL.
Baughman, D.S., D.W. Moore, and G.I. Scott. 1989. A
Comparison and Evaluation of Field and Laboratory
Toxicity Tests with Fenvalerate on an Estuarine
Crustacean. Environ. Toxicol. Chem. 8:417-429.
Becker, D.S., G.R. Bilyard, and T.C. Ginn. 1990.
Comparisons Between Sediment Toxicity Tests and
Alterations of Benthic Macroinvertebrate Assemblages
at a Marine Superfund Site: Commencement Bay,
Washington. Environ. Toxicol. Chem. 9:669-685.
Birge, W.J. and J.A. Black. 1990. In Situ Toxicological
Monitoring: Use in Quantifying Ecological Effects of
Toxic Wastes, pp. 215-231. In: S.S. Sandhu, W.R.
Lower, F.J. de Serres, W.A. Suk, and R.R. Tice, eds., In
Situ Evaluation of Biological Hazards of Environmental
Pollutants. Plenum Press, New York, NY.
Birge, W.J., J.A. Black, T.M. Short, and A.G. Westerman.
1989. A Comparative Ecological and Toxicological
Investigation of Secondary Wastewater Treatment Plant
Effluent and Its Receiving Stream. Environ. Toxicol.
Chem. 8:437-450.
Birge, W.J., D.J. Price, D.P. Keogh, J.A. Zuiderveen, and
M.D. Kercher. 1992. Biological Monitoring Program for
the Paducah Gaseous Diffusion Plant. Annual Report for
study period Oct. 1990 through March 1992. Submitted
to Oak Ridge National Laboratory, Oak Ridge, TN.
Boelter, A.M., F.N. Lamming, A.M. Farag, and H.L.
Bergman. 1992. Environmental Effects of Saline Oil-field
Discharges on Surface Waters. Environ. Toxicol. Chem.
11:1187-1195.
Boyle, T.P., S.E. Finger, R.L. Paulson, and C.F. Rabeni.
1985. Comparison of Laboratory and Field Assessment
of Fluorene. Part II: Effects on the Ecological Structure
and Function of Experimental Pond Ecosystems. pp.
134-151. In: T.P. Boyle, ed., Validation and Predictability
of Laboratory Methods for Assessing the Fate and Ef-
fects of Contaminants in Aquatic Ecosystems. ASTM
STP 865. American Society for Testing and Materials,
Philadelphia, PA.
Brock, T.C.M., S.J.H. Crum, R. van Wijngaarden, B.J.
Budde, J. Tijink, A. Zuppelli, and P. Leeuwangh. 1992.
Fate and Effects of the Insecticide Dursban 4E in Indoor
Elodea-dominated and Macrophyte-free Freshwater
Model Ecosystems: I. Fate and Primary Effects of the
Active Ingredient Chlorpyrifos. Arch. Environ. Contam.
Toxicol. 23:69-84.
Burton, G.A. Jr. and G.R. Lanza. 1987. Aquatic Microbial
Activity and Macrofaunal Profiles of an Oklahoma
Stream. Wat. Res. 21:1173-1182.
Burton, G.A. Jr., A. Drotar, J. M. Lazorchak, and L.L.
Baals. 1987. Relationship of Microbial Activity and
Ceriodaphnia Responses to Mining Impacts on the Clark
Fork River, Montana. Arch. Environ. Contam. Toxicol.
16:523-530.
***Cairns, J. Jr. 1983. Are Single Species Toxicity Tests
Alone Adequate for Estimating Environmental Hazard?
Hydrobiologia 100:47-57.
***Cairns, J. Jr. 1986. What Is Meant by Validation of Pre-
dictions Based on Laboratory Toxicity Tests?
Hydrobiologia 137:271-278.
***Cairns, J. Jr. 1988a. What Constitutes Field Validation
of Predictions Based on Laboratory Evidence? pp. 361-368.
In: W.J. Adams, G.A. Chapman, and W.G. Landis, eds.,
Aquatic Toxicology and Hazard Assessment Tenth Vol-
ume, ASTM STP 971, American Society for Testing and
Materials, Philadelphia, PA.
***Cairns, J. Jr. 1988b. Should Regulatory Criteria And
Standards Be Based on Multispecies Evidence? Environ.
Profess. 10:157-165.
***Cairns, J. Jr. 1988c. Putting the Eco in Ecotoxicology.
Regulatory Toxicol. Pharm. 8:226-238.
Cairns, J. Jr., and D.S. Cherry. 1983. A Site-specific Field
and Laboratory Evaluation of Fish and Asiatic Clam Pop-
ulation Responses to Coal Fired Power Plant
Discharges. Wat. Sci. Tech. 15:31-58.
***Cairns, J., Jr. and D.I. Mount. 1990. Aquatic Toxicology,
Part 2. Environ. Sci. Technol. 24:154-161.
Cairns, J. Jr., and B.R. Niederlehner. 1995. Predictive
Ecotoxicology: Methods for Making Estimates and Pre-
dictability in Ecotoxicology. pp. 667-680. In: D.J.
Hoffman, B.A. Rattner, G.A. Burton, and J. Cairns Jr.,
eds., Handbook of Ecotoxicology, CRC Press, Inc., Boca
Raton, FL.
Cairns, J. Jr., and E.P. Smith. 1994. The Statistical Validity
of Biomonitoring Data. pp. 49-68. In: S.L. Loeb, and A.
Spacie, eds., Biological Monitoring of Aquatic Systems.
Lewis Publishers, Boca Raton, FL.
Cairns, J. Jr., D.S. Cherry, and J.D. Giattina. 1982. Cor-
respondence Between Behavioral Responses of Fish in
Laboratory and Field Heated Chlorinated Effluents, pp.
207-215. In: W.J. Mitsch, R.W. Bosserman, and J.M.
Klopatek, eds., Energy and Ecological Modelling.
Elsevier Scientific Publishers Co., Amsterdam.
Cairns, J. Jr., P.V. McCormick, and S.E. Belanger. 1993.
Prospects for the Continued Development of
Environmentally-realistic Toxicity Tests Using Microor-
ganisms. J. Environ. Sci. 5:253-268.
Calow, P. 1992. The Three R's of Ecotoxicology. Funct.
Ecol. 6:617-619.
Canfield, T.G., N.E. Kemble, W.G. Brumbaugh, F.G.
Dwyer, C.G. Ingersoll and J.F. Fairchild. 1994. Use of
Benthic Invertebrate Structure and the Sediment Quality
Triad to Evaluate Metal-contaminated Sediment in the
Upper Clark Fork River, Montana. Environ. Toxicol.
Chem. 13:1999-2012.
Carlson, A.R., H. Nelson, and D. Hammermeister. 1986.
Development and Validation of Site-specific Water Qual-
ity Criteria for Copper. Environ. Toxicol. Chem. 5:997-
1012.
***Chapman, P.M. 1995a. Extrapolating Laboratory
Toxicity Results to the Field. Environ. Toxicol. Chem.
14:927-930.
Chapman, P.M. 1995b. Do Sediment Toxicity Tests
Require Validation? Environ. Toxicol. Chem. 14:1451-
1453.
***Chapman, P.M. 1995c. Ecotoxicology and Pollution-Key
Issues. Marine Poll. Bull. 16:405-415.
Chapman, P.M., R.N. Dexter, and E.R. Long. 1987. Syn-
optic Measures of Sediment Contamination, Toxicity and
Infaunal Community Composition (The Sediment Quality
Triad) in San Francisco Bay. Mar. Ecol. Prog. Ser.
37:75-96.
Clark, J.R., P.W. Borthwick, L.R. Goodman, J.M. Patrick,
Jr., E.M. Lores, and J.C. Moore. 1987. Comparison of
Laboratory Toxicity Test Results with Responses of
Estuarine Animals Exposed to Fenthion in the Field.
Environ. Toxicol. Chem. 6:151-160.
Clements, W.H. and P.M. Kiffney. 1994. Integrated Labo-
ratory and Field Approach for Assessing Impacts of
Heavy Metals at the Arkansas River, Colorado. Environ.
Toxicol. Chem. 13:397-404.
Clements, W.H. and P.M. Kiffney. 1996. Validation of
Whole Effluent Toxicity Tests: Integrated Studies Using
Field Assessments, Microcosms, and Mesocosms. pp.
229-244. In: D.R. Grothe, K.L. Dickson, and D.K. Reed-
Judkins, eds., Whole Effluent Toxicity Testing: An Evalu-
ation of Methods and Prediction of Receiving System
Impacts. SETAC Press, Pensacola, FL.
Cooper, W.E., and R.J. Stout. 1985. The Monticello Ex-
periment: A Case Study, pp. 96-116. In: Multispecies
Toxicity Testing, J. Cairns, Jr., ed., Pergamon Press,
New York, NY.
Crane, M. 1995. Society of Environmental Toxicology and
Chemistry (SETAC) News Article. Vol. 15, No. 2.,
March.
Crossland, N.O. 1984. Fate and Biological Effects of
Methyl Parathion in Outdoor Ponds and Laboratory
Aquaria. Ecotox. Environ. Safe. 8:482-495.
Crossland, N.O. and J.M. Hillaby. 1985. Fate and Effects
of 3,4-dichloroaniline in the Laboratory and in Outdoor
Ponds: II. Chronic Toxicity to Daphnia sp. and Other
Invertebrates. Environ. Toxicol. Chem. 4:489-499.
Crossland, N.O. and C.J.M. Wolff. 1985. Fate and Biolog-
ical Effects of Pentachlorophenol in Outdoor Ponds.
Environ. Toxicol. Chem. 4:73-86.
Crossland, N.O., G.C. Mitchell, and P.B. Dorn. 1992. Use
of Outdoor Artificial Streams to Determine Threshold
Toxicity Concentrations for a Petrochemical Effluent.
Environ. Toxicol. Chem. 11:49-59.
Davis, W.S. and T.P. Simon. 1995. Biological Assessment
and Criteria. Lewis Publishers, Boca Raton, FL.
deNoyelles, F. Jr., and W.D. Kettle. 1985. Experimental
Ponds for Evaluating Toxicity Tests Predictions. pp. 91-
103. In: T.P. Boyle, ed., Validation and Predictability of
Laboratory Methods for Assessing the Fate and Effects
of Contaminants in Aquatic Ecosystems, ASTM STP
865. American Society for Testing and Materials, Phila-
delphia, PA.
Diamond, J.M., J.C. Hall, D.M. Pattie, and D. Gruber.
1994. Use of an Integrated Approach to Determine Site-
specific Effluent Metal Limits. Water Environ. Res.
66:733-743.
Dickson, K.L. 1995. Progress in Toxicity Testing-An
Academic's Viewpoint. pp. 209-216. In: J. Cairns, Jr.
and B.R. Niederlehner, eds., Ecological Toxicity Testing,
Lewis Publishers, Boca Raton, FL.
Dickson, K.L., T. Duke, and G. Loewengart. 1985. A Syn-
opsis: Workshop on Multispecies Toxicity Tests, pp. 76-
88. In: J. Cairns, Jr., ed., Multispecies Toxicity Testing.
Pergamon Press, New York, NY.
Dickson, K.L., W.T. Waller, J.H. Kennedy, W.R. Arnold,
W.P. Desmond, S.D. Dyer, J.F. Hall, J.T. Knight, Jr., D.
Malas, M.L. Martinez and S.L. Mutzner. 1989. A Water
Quality and Ecological Survey of the Trinity River, Vols.
I and II. Final Report. City of Dallas Water Utilities,
Dallas, TX.
***Dickson, K.L., W.T. Waller, J.H. Kennedy, and L.P.
Ammann. 1992. Assessing the Relationship Between
Ambient Toxicity and Instream Biological Response.
Environ. Toxicol. Chem. 11:1307-1322.
Dickson, K.L., W.T. Waller, J.H. Kennedy, L.P. Ammann,
R. Guinn, and T.J. Norberg-King. 1996. Relationships
Between Effluent Toxicity, Ambient Toxicity, and Receiv-
ing System Impacts: Trinity River Dechlorination Case
Study. pp. 287-308. In: D.R. Grothe, K.L. Dickson, and
D.K. Reed-Judkins, eds., Whole Effluent Toxicity
Testing: An Evaluation of Methods and Prediction of
Receiving System Impacts. SETAC Press, Pensacola,
FL.
Dorn, P. 1996. An Industrial Perspective on Whole Effluent
Toxicity Testing. pp. 16-37. In: D.R. Grothe, K.L. Dick-
son, and D.K. Reed-Judkins, eds., Whole Effluent Toxic-
ity Testing: An Evaluation of Methods and Prediction of
Receiving System Impacts. SETAC Press, Pensacola,
FL.
Dorn, P.B., R. van Compernolle, C.L. Meyer, and N.O.
Crossland. 1991. Aquatic Hazard Assessment of the
Toxic Fraction from the Effluent of a Petrochemical Plant.
Environ. Toxicol. Chem. 10:691-703.
Eagleson, K.W., D.L. Lenat, L.W. Ausley, and F.B.
Winborne. 1990. Comparison of Measured Instream
Biological Responses with Responses Predicted Using
the Ceriodaphnia dubia Chronic Toxicity Test. Environ.
Toxicol. Chem. 9:1019-1028.
Eaton, J., J. Arthur, R. Hermanutz, R. Kiefer, L. Mueller, R.
Anderson, R. Erickson, B. Nordling, J. Rogers, and H.
Pritchard. 1985. Biological Effects of Continuous and
Intermittent Dosing of Outdoor Experimental Streams
with Chlorpyrifos. pp. 85-118. In: R.C. Bahner and D.J.
Hansen, eds., Aquatic Toxicology and Hazard Assess-
ment: Eighth Symposium, ASTM STP 891, American
Society for Testing and Materials, Philadelphia, PA.
Eisle, P.J. and R. Hartung. 1976. The Effects of
Methoxychlor on Riffle Invertebrate Populations and
Communities. Trans. Am. Fish. Soc. 105:628-633.
***Emans, H.J.B., E.J. v.d. Plassche, J.H. Canton, P.C.
Okkerman, and P.M. Sparenburg. 1993. Validation of
Some Extrapolation Methods Used for Effect Assess-
ment. Environ. Toxicol. Chem. 4:155-166.
Fairchild, J.F., F.J. Dwyer, T.W. La Point, S.A. Burch, and
C.G. Ingersoll. 1993. Evaluation of a Laboratory-gener-
ated NOEC for Linear Alkylbenzene Sulfonate in
Outdoor Experimental Streams. Environ. Toxicol. Chem.
12:1763-1775.
Fairchild, J.F., T.W. La Point, J.L. Zajicek, M.K. Nelson, F.
J. Dwyer, and P.A. Lovely. 1992. Population-,
Community- and Ecosystem-level Responses of Aquatic
Mesocosms to Pulsed Doses of a Pyrethroid Insecticide.
Environ. Toxicol. Chem. 11:115-129.
Ferraro, S.P., R.C. Swartz, F.A. Cole, and D.W. Schults.
1991. Temporal Changes in the Benthos Along a Pollu-
tion Gradient: Discriminating the Effects of Natural Phe-
nomena from Sewage-industrial Wastewater Effects.
Estuarine, Coastal and Shelf Science 33:383-407.
***Forbes, V.E. and M.H. Depledge. 1992. Predicting
Population Response to Pollutants: Significance of Sex.
Funct. Ecol. 6:376-381.
Franco, P.J., J.M. Giddings, S.E. Herbes, L.A. Hook, J.D.
Newbold, W.K. Roy, G.R. Southworth, and A.J. Stewart.
1984. Effects of Chronic Exposure to Coal-derived Oil on
Freshwater Ecosystems: I. Microcosms. Environ.
Toxicol. Chem. 3:447-463.
Frithsen, J.B., D. Nacci, C. Oviatt, C.J. Strobel, and R.
Walsh. 1989. Using Single-species and Whole Ecosys-
tem Tests to Characterize the Toxicity of a Sewage
Treatment Plant Effluent. pp. 231-250. In: G.W. Suter
II and M.A. Lewis, eds., Aquatic Toxicology and Environ-
mental Fate: Eleventh Volume, ASTM STP 1007, Amer-
ican Society for Testing and Materials, Philadelphia, PA.
Gearing, J.N. 1989. The Role of Aquatic Microcosms in
Ecological Research as Illustrated by Large Marine Sys-
tems, pp. 411-448. In: Ecotoxicology: Problems and
Approaches (S.A. Levin, M.A. Harwell, J.R. Kelly and
K.D. Kimball, eds.). Springer Verlag, New York, NY.
Geckler, J.R., W.B. Horning, T.M. Nieheisel, Q.H.
Pickering, E.L. Robinson, and C.E. Stephan. 1976.
Validity of Laboratory Tests for Predicting Copper
Toxicity in Streams. EPA 600/3-76-116. Cincinnati, OH.
Giddings, J.M., and P.J. Franco. 1985. Calibration of Lab-
oratory Toxicity Tests with Results from Microcosms and
Ponds. pp. 104-119. In: T.P. Boyle, ed., Validation and
Predictability of Laboratory Methods for Assessing the
Fate and Effects of Contaminants in Aquatic
Ecosystems, ASTM STP 865. American Society for Testing
and Materials, Philadelphia, PA.
Giddings, J.M., P.J. Franco, S.M. Bartell, R.M. Cushman,
S.E. Herbes, L.A. Hook, J.D. Newbold, G.R. Southworth,
and A.J. Stewart. 1984. Effects of Contaminants on
Aquatic Ecosystems: Experiments with Microcosms and
Outdoor Ponds. Oak Ridge National Laboratory, Oak
Ridge, TN.
Giesy, J.P. and P.M. Allred. 1985. Replicability of Aquatic
Multispecies Test Systems, pp. 187-247. In: J. Cairns
Jr., ed. Multispecies Toxicity Testing. Pergamon Press,
New York, NY.
Giesy, J.P., Jr., H.J. Kania, J.W. Bowling, R.L. Knight, S.
Mashburn, and S. Clarkin. 1979. Fate and Biological
Effects of Cadmium Introduced into Channel
Microcosms. EPA 600/3-79-039. Duluth, MN.
Gonzalez, M.J. and T.M. Frost. 1994. Comparisons of
Laboratory Toxicity Tests and a Whole-lake Experiment:
Rotifer Responses to Experimental Acidification. Ecologi-
cal Applications 4(1):69-80.
***Grothe, D.R., K.L. Dickson, and D.K. Reed-Judkins
(eds.). 1996. Whole Effluent Toxicity Testing: An Evalu-
ation of Methods and Prediction of Receiving System
Impacts. SETAC Press, Pensacola, FL.
Hansen, S.R. and R.R. Garton. 1982. Ability of Standard
Toxicity Tests to Predict the Effects of the Insecticide
Diflubenzuron on Laboratory Stream Communities. Can.
J. Fish. Aquat. Sci. 39:1273-1288.
Havas, M. and T.C. Hutchinson. 1982. Aquatic
Invertebrates from the Smoking Hills, N. W. T.: Effect of
pH and Metals on Mortality. Can. J. Fish. Aquat. Sci.
39:890-903.
Herbold, B., A.D. Jassby, and P.B. Moyle. 1992. Status
and Trends Report on Aquatic Resources in the San
Francisco Estuary. San Francisco Estuary Report, CA.
Hitchock, S.W. 1965. Field and Laboratory Studies of DDT
on Aquatic Insects. Conn. Ag. Exp. Bull. (New Haven)
668:1-32.
Kersting, K. and R. van Wijngaarden. 1982. Effects of
Chlorpyrifos on a Microecosystem. Environ. Toxicol.
Chem. 11:365-372.
Lamberson, J.O., T.H. DeWitt, and R.C. Swartz. 1992.
Assessment of Sediment Toxicity to Marine Benthos.
pp. 183-211. In: G.A. Burton, Jr., ed. Sediment Toxicity
Assessment, Lewis Publishers, Boca Raton, FL.
LaPoint, T.W. 1994. Interpreting the Results of Agricultural
Microcosm Tests: Linking Laboratory and Experimental
Field Results to Predictions of Effect in Natural Ecosys-
tems. pp. 83-94. In: I.R. Hill, F. Heimbach, P.
Leeuwangh, and P. Matthiessen, eds., Freshwater Field
Tests for Hazard Assessment of Chemicals. Lewis Pub-
lishers, Boca Raton, FL.
LaPoint, T.W. 1995. Signs and Measurements of
Ecotoxicology in the Aquatic Environment, pp. 13-24.
In: D.J. Hoffman, B.A. Rattner, G.A. Burton, Jr., and J.
Cairns, Jr., eds., Handbook of Ecotoxicology. Lewis Pub-
lishers, Boca Raton, FL.
LaPoint, T.W., M.T. Barbour, D.L. Burton, D.S. Cherry,
W.H. Clements, J.M. Diamond, D.R. Grothe, M.A. Lewis,
D.K. Reed-Judkins, and G.W. Saalfeld. 1996. Field as-
sessments. pp. 191-228. In: D.R. Grothe, K.L. Dickson,
and D.K. Reed-Judkins, eds., Whole Effluent Toxicity
Testing: An Evaluation of Methods and Prediction of
Receiving System Impacts. SETAC Press, Pensacola,
FL.
LaPoint, T.W., J.F. Fairchild, E.E. Little, and S.E. Finger.
1989. Laboratory and Field Techniques in
Ecotoxicological Research: Strengths and Limitations.
pp. 239-255. In: A. Boudou and F. Ribeyre, eds., Aquatic
Ecotoxicology: Fundamental Concepts and Methodolo-
gies, II. CRC Press, Inc. Boca Raton, FL.
Larsen, D.P., F. deNoyelles Jr., F. Stay, and T.
Shiroyama. 1986. Comparisons of Single-species, Micro-
cosm and Experimental Pond Responses to Atrazine
Exposure. Environ. Toxicol. Chem. 5:179-190.
Leeuwangh, P., T.C.M. Brock, and K. Kersting. 1994. An
Evaluation of Four Types of Freshwater Model Ecosys-
tem for Assessing the Hazard of Pesticides. Human and
Experimental Toxicology 13:888-899.
Little, E.E., F. J. Dwyer, J.F. Fairchild, A.J. DeLonay, and
J.L. Zajicek. 1993. Survival of Bluegill and Their Behav-
ioral Responses During Continuous and Pulsed Expo-
sures to Esfenvalerate, a Pyrethroid Insecticide. Environ.
Toxicol. Chem. 12:871-878.
***Livingston, R.J. and D.A. Meeter. 1985. Correspon-
dence of Laboratory and Field Results: What Are the
Criteria for Verification? pp. 76-88. In: J. Cairns, Jr.,
ed., Multispecies Toxicity Testing. Pergamon Press, New
York, NY.
Long, E.R. and P.M. Chapman. 1985. A Sediment Quality
Triad: Measures of Sediment Contamination, Toxicity
and Infaunal Community Composition in Puget Sound.
Marine Pollution Bulletin 10:405-415.
***Luoma, S.N. 1995. Prediction of Metal Toxicity in Nature
from Toxicity Tests: Limitations and Research Needs.
pp. 610-659. In: A. Tessier and D. Turner, eds., Metal
Speciation and Bioavailability in Aquatic Systems. John
Wiley & Sons, Ltd., New York, NY.
Luoma, S.N. and J.L. Carter. 1993. Understanding the
Toxicity of Contaminants in Sediments: Beyond the Tox-
icity Tests-based Paradigm. Environ. Toxicol. Chem.
12:793-796.
Luoma, S.N. and K.T. Ho. 1993. Appropriate Uses of Ma-
rine and Estuarine Sediment Toxicity Tests, pp. 193-
225. In: P. Calow, ed., Handbook of Ecotoxicology.
Blackwell Scientific, Oxford, U.K.
***Marcus, M.D. and L.L. McDonald. 1992. Evaluating the
Statistical Basis for Relating Receiving Impacts to
Effluent and Ambient Toxicities. Environ. Toxicol. Chem.
11:1389-1402.
Marshall, J.S. 1978. Field Verification of Cadmium Toxicity
to Laboratory Daphnia Populations. Bull. Environ.
Contam. Toxicol. 20:387-393.
Mayer, F.L. and M.R. Ellersieck. 1986. Manual of Acute
Toxicity: Interpretation and Database for 410 Chemicals
and 66 Species of Freshwater Animals. Reference
Source Publication 160. U.S. Fish and Wildlife Service,
Dept. of Interior, Washington D.C.
McBride, G.B., J.C. Loftis, and N.C. Adkins. 1993. What Do
Significance Tests Really Tell Us about the Environ-
ment? Environ. Manage. 17:423-432.
Moore, M.V., and R.W. Winner. 1989. Relative Sensitivity
of Ceriodaphnia dubia Laboratory Tests and Pond Com-
munities of Zooplankton and Benthos to Chronic Copper
Stress. Aquat. Toxicol. 15:311-330.
***Mount, D.I. 1985. Scientific Problems in Using
Multispecies Toxicity Tests for Regulatory Purposes.
pp. 13-18. In: J. Cairns, Jr., ed., Multispecies Toxicity
Testing. Pergamon Press, New York, NY.
Mount, D.I. 1995. Development and Current Use of Single
Species Aquatic Toxicity Tests, pp. 97-104. In: J.
Cairns, Jr. and B.R. Niederlehner, eds., Ecological
Toxicity Testing, Lewis Publishers, Boca Raton, FL.
Mount, D.I. and T.J. Norberg-King, eds. 1985. Validity of
Effluent and Ambient Toxicity Tests for Predicting Biolog-
ical Impact, Scippo Creek, Circleville, Ohio. EPA 600/3-
85-044. Duluth, MN.
Mount, D.I. and T.J. Norberg-King, eds. 1986. Validity of
Effluent and Ambient Toxicity Tests for Predicting Bio-
logical Impact, Kanawha River, Charleston, West Vir-
ginia. EPA 600/3-86-006. Duluth, MN.
Mount, D.I., T.J. Norberg-King and A.E. Steen. 1986a.
Validity of Effluent and Ambient Toxicity Tests for Pre-
dicting Biological Impact, Naugatuck River,
Waterbury, Connecticut. EPA 600/8-86-005. Duluth, MN.
Mount, D.I., N.A. Thomas, T.J. Norberg, M.T. Barbour, T.H.
Roush and W.F. Brandes. 1984. Effluent and Ambient
Toxicity Testing and Instream Community Response on
the Ottawa River, Lima, Ohio. EPA 600/3-84-080.
Duluth, MN.
Mount, D.I., A.E. Steen and T.J. Norberg-King, eds. 1985.
Validity of Effluent and Ambient Toxicity Testing for Pre-
dicting Biological Impact on Five Mile Creek, Birming-
ham, Alabama. EPA 600/8-85-015. Duluth, MN.
Mount, D.I., A.E. Steen and T.J. Norberg-King, eds. 1986b.
The Validity of Effluent and Ambient Toxicity Tests for
Predicting Biological Impact, Back River, Baltimore Har-
bor, Maryland. EPA 600/8-86-001. Duluth, MN.
Mount, D.I., A.E. Steen and T.J. Norberg-King, eds. 1986c.
Validity of Ambient Toxicity Tests for Predicting Biologi-
cal Impact, Ohio River, near Wheeling, West Virginia.
EPA 600/3-85-071. Duluth, MN.
Mount, D.R., K.R. Drottar, D.D. Gulley, J.P. Fillo, and P.E.
O'Neil. 1992. Use of Laboratory Toxicity Data for Evalu-
ating the Environmental Acceptability of Produced Water
Discharge to Surface Waters, pp. 175-185. In: J.P. Ray
and F.R. Engelhardt, eds., Produced Water. Plenum
Press, New York, NY.
Neuhold, J.M. 1986. Toward a Meaningful Interaction Be-
tween Ecology and Aquatic Toxicology, pp. 11-21. In:
T.M. Poston and R. Purdy, eds., Aquatic Toxicology and
Environmental Fate, ASTM STP 921. American Society
for Testing and Materials.
Niederlehner, B.R., K.W. Pontash, J.R. Pratt, and J.
Cairns, Jr. 1990. Field Evaluation of Predictions of Envi-
ronmental Effects from a Multispecies-Microcosm Tox-
icity Test. Arch. Environ. Contam. Toxicol. 19:62-71.
Niederlehner, B.R., J.R. Pratt, A.L. Buikema, Jr., and J.
Cairns, Jr. 1985. Laboratory Tests Evaluating the Effects
of Cadmium on Freshwater Protozoan Communities.
Environ. Toxicol. Chem. 4:155-165.
Nimmo, D.R., D. Link, L.P. Parrish, G.J. Rodriguez, and W.
Wuerthele. 1989. Comparison of On-site and Laboratory
Toxicity Tests: Derivation of Site-specific Criteria for
Unionized Ammonia in a Colorado Transitional Stream.
Environ. Toxicol. Chem. 8:1177-1189.
Nimmo, D.R., M.H. Dodson, P.M. Davies, J.C. Greene, and
M.A. Kerr. 1990. Three Studies Using Ceriodaphnia to
Detect Nonpoint Sources of Metals from Mine Drainage.
J. Water. Poll. Contr. Fed. 62:7-14.
Norberg-King, T.J. and D.I. Mount, eds. 1986. Validity of
Effluent and Ambient Toxicity Tests for Predicting Bio-
logical Impact, Skeleton Creek, Enid, Oklahoma.
EPA/600/8-86-002. Duluth, MN.
Obrebski, S., J.J. Orsi, and W. Kimmerer. 1992. Long-term
Trends in Zooplankton Distributions and Abundance in
the Sacramento-San Joaquin Estuary. Interagency Eco-
logical Studies Program for the Sacramento-San Joaquin
Delta Estuary. Technical Report No. 32.
***Okkerman, P.C., E.J. v.d. Plassche, H.J.B. Emans, and
J.H. Canton. 1993. Validation of Some Extrapolation
Methods with Toxicity Data Derived from Multiple Spe-
cies Experiments. Ecotox. Environ. Safe. 25:341-359.
***Parkhurst, B.R. 1995. Are Single Species Toxicity Test
Results Valid Indicators of Effects to Aquatic Communi-
ties? pp. 105-121. In: J. Cairns, Jr. and B.R. Nieder-
lehner, eds., Ecological Toxicity Testing, Lewis Pub-
lishers, Boca Raton, FL.
***Parkhurst, B.R. 1996. Predicting Receiving System Im-
pacts from Effluent Toxicity. pp. 309-321. In: D.R.
Grothe, K.L. Dickson, and D.K. Reed-Judkins, eds.,
Whole Effluent Toxicity Testing: An Evaluation of Meth-
ods and Prediction of Receiving System Impacts.
SETAC Press, Pensacola, FL.
***Parkhurst, B.R., M.D. Marcus, and C.E. Noel. 1990.
Review of the Results of EPA's Complex Effluent
Toxicity Testing Program. Utility Water Act Group,
Washington, D.C.
***Persoone, G. and J. Gillett. 1990. Toxicological Versus
Ecotoxicological Testing, pp. 287-289. In: P. Bourdeau,
E. Somers, G.M. Richardson, and J.R. Hickman, eds.,
Short-term Toxicity Tests for Non-Genotoxic Effects,
John Wiley and Sons Ltd., New York, NY.
***Persoone, G. and C.R. Janssen. 1994. Field Validation
of Predictions Based on Laboratory Toxicity Tests, pp.
379-397. In: I.R. Hill, F. Heimbach, P.I. Leeuwangh, and
P. Matthiessen, eds., Freshwater Field Tests for Hazard
Assessment of Chemicals, Lewis Publishers, Boca
Raton, FL.
***Persoone, G., D. Calamari, and D. Wells. 1990.
Possibilities and Limitations of Predictions from Short-
term Tests in the Aquatic Environment. pp. 301-312. In:
P. Bourdeau, E. Somers, G.M. Richardson, and J.R.
Hickman, eds., Short-term Toxicity Tests for Non-
Genotoxic Effects, John Wiley and Sons Ltd, New York,
NY.
Pontash, K.W. and J. Cairns Jr. 1991. Multispecies Toxicity
Tests Using Indigenous Organisms: Predicting the Ef-
fects of Complex Effluents in Streams. Arch. Environ.
Contam. Toxicol. 20:103-112.
Pontash, K.W., B.R. Niederlehner, and J. Cairns, Jr. 1989.
Comparisons of Single-species, Microcosm and Field
Responses to a Complex Effluent. Environ. Toxicol.
Chem. 8:521-532.
Pratt, J.R., J. Mitchell, R. Ayers, and J. Cairns, Jr. 1989.
Comparison of Estimates of Effects of a Complex
Effluent at Differing Levels of Biological Organization.
pp. 174-188. In: G.W. Suter and M.A. Lewis, eds.,
Aquatic Toxicology and Environmental Fate, ASTM STP
1007. American Society for Testing and Materials, Phila-
delphia, PA.
Richardson, B.J. and M. Martin. 1994. Marine and
Estuarine Toxicity Testing: a Way to Go? Additional
sitings from Northern and Southern hemisphere
perspectives. Marine Poll. Bull. 28:138-142.
Roberts, J.R., D.W. Rodgers, J.R. Bailey, and M.A. Rorke.
1978. Polychlorinated Biphenyls: Biological Criteria for
an Assessment of Their Effects on Environmental Qual-
ity. National Research Council of Canada, Ottawa.
Robinson, R.D., J.H. Carey, K.R. Solomon, I. R. Smith,
M.R. Servos, and K.R. Munkittrick. 1994. Survey of
Receiving-water Environmental Impacts Associated with
Discharges from Pulp Mills. 1. Mill Characteristics,
Receiving-water Profiles and Laboratory Toxicity Tests.
Environ. Toxicol. Chem. 13:1075-1088.
Sasson-Brickson, G. and G.A. Burton, Jr. 1991. In Situ and
Laboratory Sediment Toxicity Testing with Ceriodaphnia
dubia. Environ. Toxicol. Chem. 10:201-207.
Schimmel, S.C., G.E. Morrison, and M.A. Heber. 1989a.
Marine Complex Effluent Toxicity Program: Test Sensi-
tivity, Repeatability and Relevance to Receiving Water
Toxicity. Environ. Toxicol. Chem. 8:739-746.
Schimmel, S.C., G.B. Thursby, M.A. Heber, and M.J.
Chammas. 1989b. Case Study of a Marine Discharge:
Comparison of Effluent and Receiving Water Toxicity.
pp. 159-173. In: G.W. Suter, II and M.A. Lewis, eds.,
Aquatic Toxicology and Environmental Fate: Eleventh
Volume, ASTM STP 1007, American Society for Testing
and Materials, Philadelphia, PA.
Sherman, R.E., S.P. Gloss, and L.W. Lion. 1987. A Com-
parison of Toxicity Tests Conducted in the Laboratory
and in Experimental Ponds Using Cadmium and the
Fathead Minnow (Pimephales promelas). Water Res.
1:317-323.
Siefert, R.E., S.J. Lozano, J.C. Brazner, and M.L. Knuth.
1989. Littoral Enclosures for Aquatic Field Testing of
Pesticides: Effects of Chlorpyrifos on a Natural System.
Entomological Soc. Amer., Misc. Publ. 75:57-73.
Slooff, W. 1985. The Role of Multispecies Testing in
Aquatic Toxicology. pp. 45-60. In: J. Cairns, Jr., ed.,
Multispecies Toxicity Testing, Pergamon Press, New
York, NY.
Slooff, W. and J.H. Canton. 1983. Comparison of the Sus-
ceptibility of 11 Freshwater Species to 8 Chemical Com-
pounds. II. (Semi) Chronic Toxicity Tests. Aquat.
Toxicol. 4:271-282.
Slooff, W., J.A.M. van Oers and D. de Zwart. 1986. Mar-
gins of Uncertainty in Ecotoxicological Hazard Assess-
ment. Environ. Toxicol. Chem. 5:841-852.
Smith, E.P. 1995. Design and Analysis of Multispecies
Experiments, pp. 73-95. In: J. Cairns, Jr., and B.R.
Niederlehner, eds., Ecological Toxicity Testing, Lewis
Publishers, Boca Raton, FL.
Smith, R. 1994. Contract Report by EcoAnalysis Inc.,
Ojai, CA. Submitted to the State Water Resources
Control Board, Sacramento, CA.
Sprague, J. 1995. A Brief Critique of Today's Use of
Aquatic Toxicity Tests. Human Ecol. Risk Assess.
1:167-170.
State Water Resources Control Board. 1990. Water Quality
Control Plan for Ocean Waters of California (California
Ocean Plan). SWRCB Resolution No. 90-27.
Stay, F.S., D.P. Larsen, A. Katko, and C.M. Rohm. 1985.
Effects of Atrazine on Community Level Responses in
Taub Microcosms, pp. 75-90. In: T.P. Boyle, ed.,
Validation and Predictability of Laboratory Methods for
Assessing the Fate and Effects of Contaminants in
Aquatic Ecosystems, ASTM STP 865, American Society
for Testing and Materials, Philadelphia, PA.
Stephenson, R.R., and D.F. Kane. 1984. Persistence and
Effects of Chemicals in Small Enclosures in Ponds. Arch.
Environ. Contam. Toxicol. 13:313-326.
Swartz, R.C., F.A. Cole, J.O. Lamberson, S.P. Ferraro,
D.W. Schults, W.A. DeBen, H. Lee II, and R. J. Ozretich.
1994. Sediment Toxicity, Contamination and Amphipod
Abundance at a DDT- and Dieldrin-contaminated Site in
San Francisco. Environ. Toxicol. Chem. 13:949-962.
Swartz, R.C., W.A. Deben, K.A. Sercu, and J.O.
Lamberson. 1982. Sediment Toxicity and the Distribution
of Amphipods in Commencement Bay, Washington,
USA. Marine Pollution Bulletin 13:359-364.
Swartz, R.C., D. W. Schults, G. R. Ditsworth, W.A. DeBen,
and F.A. Cole. 1985. Sediment Toxicity, Contamination,
and Macrobenthic Communities near a Large Sewage
Outfall, pp. 152-175. In: T.P. Boyle, ed., Validation and
Predictability of Laboratory Methods for Assessing the
Fate and Effects of Contaminants in Aquatic
Ecosystems, ASTM STP 865, American Society for
Testing and Materials, Philadelphia, PA.
Swartz, R.C., D.W. Schults, J.O. Lamberson, R.J. Ozretich,
and J.K. Stull. 1991. Vertical Profiles of Toxicity, Organic
Carbon, and Chemical Contaminants in Sediment Cores
from the Palos Verdes Shelf and Santa Monica,
California. Marine Environ. Res. 31:215-225.
Underwood, A.J. 1995. Toxicological Testing in
Laboratories Is Not Ecological Testing of Toxicology.
Human Ecol. Risk Assess. 1:178-182.
USEPA. 1984. Ambient Water Quality Criteria for Cad-
mium 1984. EPA 440/5-84-032. Washington, D.C.
USEPA. 1991. Technical Support Document for Water
Quality-based Toxics Control. EPA/505/2-90-001.
Washington, D.C.
USEPA. 1994a. Short-term Methods for Estimating the
Chronic Toxicity of Effluents and Receiving Waters to
Freshwater Organisms. 3rd ed. EPA 600/4-91/002.
Cincinnati, OH.
USEPA. 1994b. Short-term Methods for Estimating the
Chronic Toxicity of Effluents and Receiving Waters to
Marine and Estuarine Organisms. 2nd ed. EPA 600/4-
91/003. Cincinnati, OH.
Van den Brink, P.J., R.P.A. Van Wijngaarden, W.G.H.
Lucassen, T.C.M. Brock, and P. Leeuwangh. 1996. Ef-
fects of the Insecticide Dursban 4E (Active Ingredient
Chlorpyrifos) in Outdoor Experimental Ditches: II.
Invertebrate Community Responses and Recovery.
Environ. Toxicol. Chem. 15:1143-1153.
Van Wijngaarden, R.P.A., P.J. van den Brink, S.J.H.
Crum, J.H. Oude Voshaar, T.C.M. Brock, and P.
Leeuwangh. 1996. Effects of the insecticide Dursban 4E
(active ingredient chlorpyrifos) in outdoor experimental
ditches: I. Comparison of short-term toxicity between the
laboratory and the field. Environ. Toxicol. Chem.
15:1133-1142.
***Waller, W.T., L.P. Ammann, W.J. Birge, K.L. Dickson,
P.B. Dorn, N.E. LeBlanc, D.I. Mount, B.R. Parkhurst,
H.R. Preston, S.C. Schimmel, A. Spacie, and G.B.
Thursby. 1996. Predicting Instream Effects from WET
Tests, pp. 271-286. In: D.R. Grothe, K.L. Dickson, and
D.K. Reed-Judkins, eds., Whole Effluent Toxicity
Testing: An Evaluation of Methods and Prediction of
Receiving System Impacts. SETAC Press, Pensacola,
FL.
Weiss, C.M. 1976. Field Evaluation of the Algal Assay
Procedure on Surface Waters of North Carolina, pp. 29-
76. In: E.J. Middlebrooks, D.H. Falkenburg and T.E.
Maloney, eds., Biostimulation and Nutrient Assessment,
Ann Arbor Science, MI.
Yoder, C.O. 1991. Answering Some Concerns about Bio-
logical Criteria Based on Experiences in Ohio. pp. 95-
104. In: EPA Water Quality Standards for the 21st Cen-
tury, Proceedings of a Conference. USEPA, Washington,
D.C.
2.0' Bibliography
***Boyle, T.P. 1985. Research Needs in Validating and
Determining the Predictability of Laboratory Data to the
Field. pp. 61-66. In: R.C. Bahner and D.J. Hansen, eds.,
Aquatic Toxicology and Hazard Assessment, ASTM STP
891. American Society for Testing and Materials,
Philadelphia, PA.
***Cairns, J. Jr. 1993. Environmental Science and
Resource Management in the 21st Century: Scientific
Perspective. Environ. Toxicol. Chem. 12:1321-1329.
***Cairns, J. Jr., and J.R. Pratt. 1989. The Scientific Basis
for Toxicity Tests. Hydrobiologia 188/189:5-20.
***Kimball, K.D. and S.A. Levin. 1985. Limitations of Lab-
oratory Toxicity Tests: The Need for Ecosystem-level
Testing. BioScience 35:165-171.
***Maltby, L. and P. Calow. 1989. The Application of
Toxicity Tests in the Resolution of Environmental Prob-
lems; Past, Present and Future. Hydrobiologia
188/189:65-76.
***Mount, D.I. 1994. A Comparison of Strengths and Limi-
tations of Chemical Specific Criteria,
Whole Effluent Toxicity Testing, and Biosurveys.
Contract report submitted to USEPA Office of
Wastewater Enforcement and Compliance, Washington,
DC.
***Parkhurst, B.R. and D.I. Mount. 1991. The Water
Quality-based Approach to Toxics Control: Narrowing
the Gap Between Science and Regulation. Water
Environ. Tech. 3:45-47.
Appendix A
Single Species Tests with Effluent
The following is an interpretive summary of stud-
ies in which effluents were tested with single species toxic-
ity tests and in which some ecological survey data were
collected for comparative purposes.
A.1 Dickson et al. (1996)
A study (thesis project of R. Guinn, as summarized by
Dickson et al., 1996) was conducted by the Institute of
Applied Sciences at the University of North Texas to exam-
ine the effects of dechlorinating the effluent from a
wastewater treatment facility (WWTP) on aquatic biological
communities in the West Fork of the Trinity River, Texas.
The WWTP effluent, at its discharge point, constitutes up
to 96% of the river's flow during low flow periods. An ob-
jective of the study was to evaluate the relationships
among effluent toxicity, river water toxicity, and biological
community responses.
Field assessments were performed to determine resident
biota and abiotic factors in the river both upstream and
downstream of the WWTP. Effluent and ambient water
toxicity were assessed with USEPA's 7-d Ceriodaphnia
survival/reproduction and larval fathead minnow
survival/growth tests. In addition, ambient water toxicity in
the river was assessed in situ with caged organisms--
fathead minnows and Asiatic clams (Corbicula fluminea).
Two sampling sites (controls) were located upstream and
five sites were downstream of the WWTP outfall. The first
two sites below the outfall were within 1.25 miles of the
discharge point and the remaining three sites were at
various locations 17 miles or less downstream. Ecological
surveys included fish and benthic macroinvertebrate
collections. Ambient water toxicity testing was conducted
with samples collected at all seven sites.
When this study was initiated, the WWTP was chlorinating
its effluent. Effluent and ambient water toxicity testing, as
well as biological sampling, was conducted during this pe-
riod to establish a baseline for comparison with data col-
lected after the implementation of dechlorination. During
this pre-dechlorination period, data were collected in
two months (August and October).
With both the larval fathead minnow survival and growth
endpoints, statistically significant toxicity (compared to the
two upstream sites) was observed in the effluent and in
ambient water from the first two sites downstream of the
WWTP outfall; results were the same in August and Octo-
ber. Dechlorination of the water samples from the two
sites below the outfall removed the toxicity. Statistically
significant toxicity in the larval fathead tests was not
observed in ambient water samples from sites 5, 6, and 7
downstream of the outfall.
Statistically significant toxicity (compared to the upstream
sites) was recorded in the effluent and water samples from
all sites downstream of the WWTP with the Ceriodaphnia
tests (both survival and reproduction). Dechlorination of
the toxic ambient water samples failed to remove the
toxicity, suggesting that other contaminants were causing
the water flea responses. In October, statistically signifi-
cant toxicity (compared to upstream sites) was noted in the
Ceriodaphnia tests with effluent and in water samples from
the two downstream sites nearest the outfall, but not at
sites 5, 6, and 7.
In the biological surveys, no fish were collected at the two
sites below the WWTP outfall. Between 200 and 4,500 fish
were collected at other sites on the river. Fish species
richness, evenness, and diversity were fairly equivalent at
all sites except the two below the outfall. Densities of ben-
thic macroinvertebrates were lower at the two sites below
the outfall than at the two upstream reference sites as well
as at sites 5, 6, and 7, below the outfall.
Based on the data collected during the pre-dechlorination
period, the authors predicted that effluent dechlorination
would remove toxicity to larval fathead minnows and
possibly restore the environment below the WWTP outfall
so that those areas could be colonized by fish. Because
the toxicity to the water flea could not be totally attributed
to chlorine, the authors suggested that dechlorination
might not alter Ceriodaphnia responses. Impacts to
instream biota remained possible due to non-chlorine
contaminants.
Following activation of the dechlorination system, WWTP
effluent and river water samples at all seven sites were
collected and tested on a monthly basis for a total of 17
test periods. Dechlorination appeared to remove effluent
and ambient water toxicity when larval fathead minnows
were used to screen samples. Dechlorination did not
remove all of the effluent or ambient water toxicity detected
with Ceriodaphnia. The TIE identified diazinon as a major
cause of the effluent and ambient water daphnid toxicity.
During the pre-dechlorination period, caged fathead min-
nows did not survive at river stations 3 and 4, immediately
below the WWTP outfall and approximately one mile down-
stream, respectively. With the exception of one of four
testing periods after implementation of dechlorination, sur-
vival of caged fathead minnows at stations 3 and 4 was
equivalent to all other stations. Juvenile Corbicula were
exposed in situ for one-month periods on five different test
dates: one pre-dechlorination and four after initiation of
dechlorination. Prior to initiation of dechlorination, clam
mortality was 100% at stations 3 and 4, while there was
100% survival at all five of the other stations. Post-
implementation of dechlorination, clam survival at stations
3 and 4 was 100%. However, shell growth was
significantly lower at stations 3 and 4 (compared to all
other stations), suggesting the presence of an effluent
contaminant other than chlorine. The in situ tests support
the results observed in the laboratory toxicity tests with
effluent and ambient water samples.
Following dechlorination, fish were present at all river
stations, supporting the authors' prediction of the possibility
of recolonization at sites 3 and 4 with the implementation
of dechlorination. However, in three of four surveys after
dechlorination was initiated, the river station nearest the
outfall was found to have fish assemblages dissimilar to
those of the other stations. Macroinvertebrate surveys
revealed significant improvement in diversity and evenness
at stations 3 and 4 following initiation of dechlorination,
although the total number of organisms was lower com-
pared to the other stations.
In concluding, Dickson et al. (1996) state: 1) "The results
of this case study add to the growing weight-of-evidence
to document a relationship between effluent toxicity (even
chronic toxicity) and receiving system impacts for effluent-
dominated systems" and 2) "We believe that establishing
a quantitative relationship between WET test results, ambi-
ent toxicity, and receiving systems effects, as a means for
validating WET test results, is not possible given the meth-
ods, approaches, and resources currently available. How-
ever, we believe the weight-of-evidence strongly supports
that such a qualitative relationship exists."
A.2 Pontasch et al. (1989)
These researchers compared microcosm (multiple species
tests consisting of indigenous benthic macroinvertebrates
and protozoans) responses to a complex effluent with
responses observed in short-term estimates of chronic
toxicity (Ceriodaphnia survival and reproduction). The
predictive utility of these tests was evaluated in relation to
observed effects in the stream receiving the complex
effluent.
The results of this study demonstrated that the
Ceriodaphnia reproduction response successfully esti-
mated no effect concentrations for the assessment of
aquatic community biological responses. Information from
the multiple species tests provided more specific predic-
tions than did the single species test.
The cladoceran reproduction results slightly underesti-
mated the effects of the complex effluent on the receiving
stream. Ceriodaphnia survival results in the laboratory
toxicity tests underestimated instream impairments of the
effluent. Similar findings had been made by Pontasch and
Cairns (1991) in which laboratory toxicity tests with D.
magna underestimated biological community impairments
in the stream receiving the discharge. Underestimation
refers to the situation in which the laboratory toxicity test
indicates a higher effect concentration than that which
actually causes instream impairments. On the other hand,
Cairns and Cherry (1983) demonstrated, in tests with a
power plant effluent, that single species test results can
effectively predict ecosystem biological responses.
A.3 Niederlehner et al. (1990)
The predictive validity of a microcosm (multiple species)
toxicity test was evaluated by Niederlehner et al. (1990).
The study was conducted on a stream which receives a
complex industrial discharge. A control site was located
immediately upstream of the outfall, site 1 was approxi-
mately five meters downstream of the outfall; sites 2, 3,
and 4 were 0.25, 1.4, and 6.4 km downstream of the
outfall, respectively. Care was taken to select sites with
similar characteristics, especially substrate type. The
concept was to assure that the effluent was the major
variable among the sites. Effluent dilution at each of
these sites was estimated using electrical conductivity. In
addition to the microbial microcosm test, dilutions of the
effluent also were tested with the 7-d Ceriodaphnia test.
The instream measurements taken at each of the sites, as
indicators of biological community health, included species
richness of protozoans and a semi-quantitative survey of
benthic macroinvertebrates.
In both the microcosm and the water flea reproductive
response tests the LOEC and NOEC were 3% and 1%
effluent, respectively. In the field survey, significant effects
on protozoan and macroinvertebrate species richness
were seen at site 1, just below the outfall; estimated
effluent concentration at this site was 14.1%. High per-
centages of chironomid species and low percentages of
mayfly species were seen at sites 1, 2, and 3, but at site
4 the composition of the two groups was similar to the
control site. Generally, chironomids are considered
tolerant and mayfly species intolerant of water pollution.
If the species composition of these two groups is used as
a sensitive indicator of ecosystem responses, then effluent
effects were seen all the way down to site 3. Estimated
effluent concentration at this site was 3.5%. The
Ceriodaphnia reproduction and microcosm tests estimated
the LOEC to be 3% effluent. Therefore, both tests reliably
predicted instream biological community responses.
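The comparison made in this summary can be sketched as a simple screening rule: a site is flagged when its estimated effluent concentration is at or above the laboratory LOEC. The sketch below uses only the two site concentrations reported above (sites 1 and 3). The function name and the rule itself are illustrative assumptions, not the method used by Niederlehner et al. (1990).

```python
# Illustrative sketch only: flag a site as "effect predicted" when the
# estimated instream effluent concentration reaches the laboratory LOEC.
LOEC_PERCENT = 3.0  # LOEC reported for both laboratory tests (% effluent)

# Estimated effluent concentrations reported in the summary (% effluent).
site_concentration = {"site 1": 14.1, "site 3": 3.5}


def effect_predicted(concentration_percent, loec_percent=LOEC_PERCENT):
    """Return True when the site concentration is at or above the LOEC."""
    return concentration_percent >= loec_percent


predictions = {
    site: effect_predicted(conc) for site, conc in site_concentration.items()
}
print(predictions)
```

Both sites are flagged, which agrees with the field observations summarized above for sites 1 through 3.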
A.4 Diamond et al. (1994)
A rather detailed analysis of this publication is provided
because the ecological survey, as well as other compo-
nents, of this study represent the type of design and
analysis that is to be avoided when attempting to assess
the reliability of extrapolations from laboratory toxicity test
data to instream responses. Our evaluation of the analysis
of the data presented resulted in a conclusion that the
effluent under study was adversely impacting the stream
and river into which it was discharged. Diamond et al.
(1994) concluded that the effluent was not impacting the
stream.
Diamond and associates conducted a study on a
wastewater treatment facility (WWTP) effluent, the stream
(X-trib) into which the effluent was discharged, and the
South Anna River (in Virginia) into which X-trib discharged.
Chemical specific analyses and USEPA toxicity tests
(USEPA, 1994a) were performed on effluent samples;
stream bioassessments were implemented on X-trib, the
South Anna River, and on two reference streams (other
tributaries to the South Anna River).
X-trib, the receiving water, was described as heavily
channelized with concrete structures. WWTP effluent com-
prised approximately 98% of X-trib during low flow
periods. The South Anna River was described as being
forested over much of its watershed and apparently unim-
paired by anthropogenic influences above its confluence
with X-trib. Two sites on X-trib were selected, one in an
open, sunny area above the WWTP point of discharge and
the other in a shaded area below the point of discharge.
The selection of these two sites appears unfortunate in that
the two sites fail to match in habitat type; therefore, the
primary variable between sites is more than effluent
constituents.
Two reference sites were chosen. One was on an open stream
which discharged into the South Anna River and was to serve
as a matched or control site for the upstream site on X-trib.
The second reference site was on a shaded stream which
also discharged into the South Anna River; this site was
intended as a control for the lower site on X-trib. The
authors indicated that the reference sites provided
information on fauna capable of inhabiting X-trib. However,
the authors concluded that the two reference sites on the
South Anna River tributaries had better habitats for fish
and macroinvertebrates than the X-trib sites. Therefore,
these sites should be disqualified as reference sites
because habitat differences rather than water quality could
account for biological community differences. Selection of
such sites reveals questionable study design and
represents a serious flaw in this study. Interpretation of
results is clearly confounded. Four sites were selected
on the South Anna River. One site was above the
confluence with X-trib and the three other sites were
downstream of the confluence.
Bioassessments focused on benthic macroinvertebrates
and fish populations. Two types of bioassessments were
performed. The first type involved introduced substrate at
the sites. This substrate consisted of rocks collected at the
upstream South Anna River site. The authors' rationale for
this procedure related to previous impact to X-trib and the
South Anna River from toxic substances discharged from
the WWTP. The authors fail to address the question of
why fauna would not have naturally recolonized sites on
X-trib and the South Anna River if water quality had improved.
From a biological perspective, the existing
macroinvertebrate communities at a given site better
represent water quality over time than introduced fauna. If the
introduced substrate procedure is to be used, information
on the response time of the introduced macroinvertebrates
(to toxic substances and particularly metals, for which
response tends to be slow as bioaccumulation occurs) should
have been provided, but was not. Furthermore, the introduced substrates
were placed at each site for only four weeks;
macroinvertebrate communities tend to respond slowly to
metals and other toxicants which exert effects after
bioaccumulation.
The authors placed much less emphasis on the second
type of bioassessment procedure which was grab samples
at each site. Clearly, however, these resident communities
would be much more representative of bioaccumulative
substances. Sampling was conducted during fall (October)
and spring (April).
Toxicity tests were performed using the 7-d larval fathead
minnow and Ceriodaphnia protocols. Ceriodaphnia tests
were completed on two effluent samples taken in May and
two collected in October. No sample revealed toxicity.
Larval minnow tests were conducted with two effluent sam-
ples collected in October (neither indicated toxicity) and
one sample taken in May. This May sample indicated sig-
nificant toxicity. Unfortunately, two other effluent samples
collected in May (after the first May sample indicated toxicity)
were not tested with larval fathead minnows. The failure
to follow up on the first indication of toxicity was an experi-
mental error. Furthermore, the very few effluent samples
which were tested do not allow characterization of the
WWTP effluent (WWTP effluents tend to show consider-
able temporal variability). Therefore, the authors' conclu-
sion that the toxicity data indicated the effluent should not
impact the receiving water biota is not supported by data
presented. The seven day tests are not good measures
of bioaccumulative impacts of metals.
Although replicates were included in the ecological survey,
variability among the replicates was not reported and fur-
ther complicates data interpretation.
Grab sample bioassessment data collected in the fall
explicitly revealed that the X-trib sites did not correspond with
the reference sites. According to several of the
macroinvertebrate indices, the lower X-trib site was
impaired compared to the upstream site and to the reference
site; fish data also suggested that the lower site was
impacted. As indicated above, the introduced substrate (IS)
data should be interpreted with caution; nonetheless, even
these data imply that the lower X-trib site was impaired
compared to the site above the discharge point. Perhaps
more importantly, the fall grab samples at South Anna River
sites downstream of the confluence with X-trib indicate that
they were impacted. No fish data were reported for the
South Anna River.
IS data were presented for only two X-trib sites and two
South Anna River sites (those above and below the X-trib
confluence). Failure to include other downstream sites, as
well as the short exposure time (see above), limits the
value of these data. Nevertheless, examination of the
dominant taxa on the IS suggests that water quality at the
upstream South Anna River site was better than at the site
below the confluence. Although differences between
means of several of the bioassessment metrics when
comparing the upstream and downstream river sites are
large, they were reported as not being statistically different.
This is likely due to the fact that an analysis of variance
was applied to data from both X-trib and the South Anna River.
This application does not seem justified given that the
tributary and the river are such different habitats.
Moreover, the means of the macroinvertebrate indices at the
two X-trib sites were frequently so large that variation in the data
sets masked differences between South Anna River sites.
In the spring collections, the macroinvertebrate grab
samples analyzed from the lower X-trib site indicated that it
was impacted compared to both the upstream and
reference sites. IS data from X-trib for the spring sampling
period were not presented. During the spring, grab
samples for macroinvertebrate analysis were taken at only two
South Anna River sites, the upstream site and the downstream
site nearest the confluence. The absence of data from the
other two South Anna River sites further limits this data set.
Although there were few apparent statistical differences
between macroinvertebrate indices from the two sites,
biological community composition indicated that the site below
the X-trib confluence was impacted; the same trend was
noted in the IS data.
Data presented in this publication do not support the
authors' contention that neither X-trib nor the South Anna
River is impacted by WWTP effluent constituents. They
attribute the impacts indicated in X-trib by the grab sample
macroinvertebrate data to habitat limitations. If this is actu-
ally the case, one must conclude that their study design
was flawed from the outset. However, their conclusion is
not supported by the differences between the upstream
and downstream sites (as shown in all types of
bioassessment data).
A.5 Birge et al. (1992)
Birge et al. were involved in a relatively long-term study of
the effluents produced by the Paducah Gaseous Diffusion
Plant (PGDP) and the streams into which these effluents
are discharged, Big Bayou and Little Bayou Creeks. Toxic-
ity, chemical, and bioassessment monitoring were per-
formed. Specifically investigating the relationship between
effluent/ambient toxicity test results and instream biological
responses was not a stated goal of this study, but some
interesting information can be gleaned from their results.
The PGDP has 16 potential discharge points into the two
creeks. The focus was on eight of these effluents because
they constitute continuous discharge to the streams.
Seven-day Ceriodaphnia and larval fathead minnow tests
were conducted with 51 undiluted effluent samples and
with 37 stream samples collected on four different
occasions. Instream biological assessments (primarily
number of taxa and density) of benthic macroinvertebrates
were performed at three separate times (1987-91) at eight
sites. One of these sites was above discharge points,
three sites were at increasing distances from the last
discharge point, and the other four sites were a gradient
within the spatial range of the several discharge points.
Bioassessment data were collected in 1987 through mid-
1988. Four separate sampling events indicated instream
biological impairment at sites within the range of discharge
points (as compared to the upstream reference site). At
sites below the last discharge point, there appeared to be
progressive recovery as measured by number of taxa and
density of macroinvertebrates. Ecological survey data
collected in 1990 and 1991, but not 1989, were similar to
those collected in earlier years. The toxicity testing data
are summarized below:
Larval Fathead Minnows
Effluent:
Significant mortality in 31/51 samples (61%)
Ambient water downstream:
Significant mortality in 18/37 samples (49%)
Ceriodaphnia
Effluent:
Significant toxicity in 11/51 samples (22%)
Ambient water downstream:
Significant toxicity in 4/37 samples (11%)
The difference in undiluted effluent and ambient water
toxicity appeared to be primarily a dilution phenomenon.
Generally, effluent toxicity predicted instream toxicity when
dilution was taken into consideration. On a qualitative
basis, instream toxicity reliably predicted instream
biological responses.
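The proportions in the toxicity summary above can be re-derived from the reported counts. The short sketch below does only that arithmetic; the dictionary layout and function name are illustrative assumptions rather than anything taken from Birge et al. (1992).

```python
# Counts reported in the summary above:
# (samples with a significant effect, samples tested).
counts = {
    ("larval fathead minnow", "effluent"): (31, 51),
    ("larval fathead minnow", "ambient water downstream"): (18, 37),
    ("Ceriodaphnia", "effluent"): (11, 51),
    ("Ceriodaphnia", "ambient water downstream"): (4, 37),
}


def percent_toxic(toxic, tested):
    """Share of tested samples with a significant effect, as a whole percent."""
    return round(100 * toxic / tested)


percentages = {key: percent_toxic(*value) for key, value in counts.items()}
for (species, medium), pct in percentages.items():
    print(f"{species}, {medium}: {pct}%")
```

The computed values match the percentages listed in the summary (61%, 49%, 22%, and 11%).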
A.6 Pratt et al. (1989)
The potential impact of a municipal sewage effluent on
Smith River (Virginia) was evaluated using acute and
chronic single species toxicity tests and a microcosm test
consisting of indigenous microbiota. Effect levels obtained
in the single species and microcosm studies on effluent
were compared with the estimated instream waste con-
centration (IWC) and with results of an ecological survey.
The study consisted of two sites upstream of the
wastewater treatment facility (WWTP) and three sites
below the outfall of the facility. A survey of benthic
macroinvertebrates and protozoan communities was
conducted at each of these sites. Effluent from the WWTP
was tested in the 7-d larval fathead minnow and
Ceriodaphnia tests, as well as in the indigenous species
microcosm test. The microcosm test consisted of micro-
organisms. River water samples collected at one site
above the WWTP outfall and at all sites below the outfall
were tested in the 7-d Ceriodaphnia test, but not the larval
fathead minnow test.
The macroinvertebrate data suggested impairments (com-
pared to the upstream control) at the first two sites below
the WWTP outfall, with recovery at the third site. The
Ceriodaphnia tests did not show significant toxicity in water
samples collected at the two impacted sites. LOECs were
30% effluent in the microcosm and Ceriodaphnia tests and
15% in the larval minnow test. Maximum IWC was esti-
mated to be 9.5% effluent; NOECs in all laboratory tests
were 10% effluent. Therefore, both the effluent and
ambient water single species tests underestimated
instream impacts.
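The reasoning behind "underestimated" can be illustrated with a small sketch that compares the estimated instream waste concentration (IWC) with the laboratory NOEC and then checks the outcome against the field survey. The decision rule (an effect is predicted only when the IWC exceeds the NOEC) and the function name are assumptions made for illustration; they are not taken from Pratt et al. (1989).

```python
def compare_with_field(iwc_percent, noec_percent, field_impaired):
    """Classify a laboratory prediction against an observed field outcome.

    Assumed rule: the laboratory test predicts an instream effect only
    when the IWC exceeds the NOEC.
    """
    lab_predicts_effect = iwc_percent > noec_percent
    if field_impaired and not lab_predicts_effect:
        return "underestimated"
    if lab_predicts_effect and not field_impaired:
        return "overestimated"
    return "consistent"


# Values reported above: maximum IWC 9.5% effluent, NOEC 10% effluent,
# and impairment observed at the first two sites below the outfall.
result = compare_with_field(iwc_percent=9.5, noec_percent=10.0, field_impaired=True)
print(result)
```

With the reported values the sketch returns "underestimated", in line with the conclusion stated above.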
A.7 Crossland et al. (1992)
The toxic fraction (chlorinated ethers) of a petrochemical
manufacturing plant effluent was studied in simulated
outdoor streams. Four different concentrations of the effluent
extract were tested in the streams; exposure was for 21 to
28 days. Two untreated streams served as controls.
The LOEC and NOEC (Gammarus pulex) in the simulated
streams were 0.86 µg/L and 0.44 µg/L, respectively.
These values were compared to the NOEC from a 7-d
Daphnia magna laboratory test; the reproduction NOEC in
this test was 1.0 µg/L. Although a 21-day Daphnia test
would have been more appropriate, the result from the
single species test was an effective qualitative predictor of
effect concentration in the mesocosm; the Daphnia data
slightly overestimated the artificial stream effect concen-
tration.
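As a rough illustration of how far apart the laboratory and simulated stream values are, the sketch below computes the ratio of the two NOECs reported above. The variable names are illustrative, and the ratio is simple arithmetic on the reported values, not a statistic presented by Crossland et al. (1992).

```python
# Values reported in the summary above (micrograms per liter).
stream_noec_ug_per_l = 0.44  # NOEC in the simulated streams
stream_loec_ug_per_l = 0.86  # LOEC in the simulated streams
lab_noec_ug_per_l = 1.0      # reproduction NOEC, 7-d Daphnia magna test

# Ratio of laboratory NOEC to simulated stream NOEC.
noec_ratio = lab_noec_ug_per_l / stream_noec_ug_per_l

# True when the laboratory no-effect level is above the concentration
# at which effects were already observed in the simulated streams.
lab_noec_exceeds_stream_loec = lab_noec_ug_per_l > stream_loec_ug_per_l

print(f"Laboratory NOEC / stream NOEC = {noec_ratio:.2f}")
print(f"Laboratory NOEC above stream LOEC: {lab_noec_exceeds_stream_loec}")
```

The laboratory NOEC is a little more than twice the simulated stream NOEC.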
A.8 Robinson et al. (1994)
These investigators conducted an examination of the rela-
tionship between environmental responses at 11 pulp mills,
their pulping processes, degree of effluent treatment, and
bleaching technologies. Water samples from upstream
and downstream of the pulp mill discharge points were
screened in the 7-d larval fathead minnow and
Ceriodaphnia tests. These data were compared to
physiological data collected from fish and benthic
macroinvertebrate data from above and below the dis-
charge points.
At four of 11 pulp mills the benthic macroinvertebrate com-
munities were characterized as highly impacted below the
discharge point compared to upstream sites. Statistically
significant toxicity was detected in water samples down-
stream of all four of these mills in the larval fathead test,
but at only one site in the Ceriodaphnia test. These four
mills only had primary effluent treatment. Although the
larval minnow test reliably indicated instream impacts at
the four sites, the Ceriodaphnia test was less effective.
Neither of the single species tests predicted the physiologi-
cal impairments seen in fish collected below the pulp mill
outfalls. Physiological responses associated with
reproductive dysfunction (decreased sex steroid levels and
gonad size) and other disturbances (increased liver size
and enzyme abnormalities) were observed in fish collected
below pulp mill discharge points regardless of mill process,
bleaching technology, or effluent treatment. This study
represents a case in which the single species tests failed
to predict (i.e., underestimated) instream impacts of
effluents.
A.9 Sasson-Brickson and Burton (1991)
In situ exposures of C. dubia were conducted in a stream
known to be impacted (based on benthic macroinvertebrate
and fish community data) by several effluents. The
C. dubia were in sediment exposure chambers placed in
the stream for 48 h at an impacted site and at a reference
site. Sediments from the impacted and reference sites
also were tested in the laboratory with C. dubia using
sediment solid phase, interstitial water, and elutriate tests.
Both the in situ and laboratory tests indicated statistically
significant sediment toxicity; the responses in the labora-
tory tests were greater than in the in situ exposures. The
authors concluded that the in situ exposures proved to be
sensitive indicators of both degraded and nondegraded
stream conditions. They also implied that the in situ re-
sponses were more reliable than the laboratory responses.
This may not be a valid conclusion since neither the
laboratory nor the in situ responses were quantitatively
correlated with instream biological community impacts.
A.10 Barbour et al. (1996)
Barbour et al. (1996) summarized studies conducted by
Ohio EPA (see also Yoder, 1991) in which agreement be-
tween data from 48-h C. dubia and fathead minnow
toxicity tests and from biosurveys were analyzed. Toxicity
tests were performed on effluent and in some cases on
mixing zone water samples. These authors surmised that
the Ohio EPA analysis indicates that "the observance of
acute toxicity, or lack thereof, in an effluent and to a lesser
degree in mixing zones is not necessarily reflected by the
instream communities." According to these authors other
impacts often pre-empted or masked effects of toxicity.
The authors concluded, "These results should not be mis-
construed to claim toxicity testing is an invalid assessment
and regulatory tool."
Indeed, caution should be used in making conclusions
from the Ohio EPA data for several reasons. Toxicity tests
were not performed on water samples collected at
biosurvey sites. No information was provided on the
degree of effluent dilution at each of the biosurvey sites.
It appears that toxicity tests were performed on whole
effluent, without dilution series to assess effect concen-
trations. Predicting biological community impacts based
on the results of one toxicity test on effluent (or mixing
zone sample), as the authors indicate, is unsound.
Barbour et al. (1996) also summarized similar studies
performed by the Florida Department of Environmental
Protection (DEP). In this project, 48-h toxicity tests with
Ceriodaphnia and Notropis leedsi (a marine minnow) were
performed on effluents from 107 facilities classified into
several industrial categories. Macroinvertebrate surveys
were conducted on streams into which the facilities dis-
charged. Comparisons were made between effluent
toxicity and biosurvey data.
Combining data from all facilities, the following relationships
between effluent toxicity and instream survey data were
obtained:
Effluent toxic, stream site impaired = 24.0%;
Effluent toxic, stream site not impaired = 10.7%;
Effluent not toxic, stream site impaired = 41.3%;
Effluent not toxic, stream site not impaired = 24.0%.
Toxicity tests reliably "predicted" instream conditions
in 48% of the 107 situations. "False positives" were
relatively rare (10.7%). "False negatives" were much
more common (41.3%). Florida DEP attributed biological
impairment at a large portion of the "false negative" sites
to non-effluent related factors.
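The agreement figures quoted above follow directly from the four reported outcome percentages. The sketch below shows that arithmetic; the dictionary keys and variable names are illustrative assumptions, and only the four percentages come from the summary.

```python
# Outcome percentages reported above for the Florida DEP comparison.
# Keys are (effluent toxicity result, stream biosurvey result).
outcomes = {
    ("toxic", "impaired"): 24.0,
    ("toxic", "not impaired"): 10.7,
    ("not toxic", "impaired"): 41.3,
    ("not toxic", "not impaired"): 24.0,
}

# Agreement: the toxicity test and the biosurvey point the same way.
agreement = outcomes[("toxic", "impaired")] + outcomes[("not toxic", "not impaired")]
false_positive = outcomes[("toxic", "not impaired")]
false_negative = outcomes[("not toxic", "impaired")]
total = sum(outcomes.values())

print(f"Agreement: {agreement:.1f}%")
print(f"False positives: {false_positive:.1f}%")
print(f"False negatives: {false_negative:.1f}%")
```

The agreement value reproduces the 48% stated in the text, and the four categories together account for all of the comparisons.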
Barbour et al. (1996) concluded that lack of agreement
in this study was not necessarily due to contradiction
between the toxicity testing and biosurveys. This conclu-
sion seems valid since in many cases biological impair-
ment was due to causes other than effluent toxicity. Also,
toxicity tests were performed on only one sample from
each facility (as indicated above, one sample is not likely
to characterize the effluent of a facility). Furthermore, the
same cautions as mentioned above in regard to the Ohio
EPA data apply here. That is, toxicity tests were not
performed on water samples collected at biosurvey sites.
No information was provided on the degree of effluent
dilution at each of the biosurvey sites. It appears that
toxicity tests were performed on whole effluent, without
dilution series to assess effect concentrations.
A.11 Mount et al. (1992)
During fossil fuel production, water pumped from the
formation is separated and discarded, frequently into
marine or freshwater environments. This fraction,
commonly termed "produced water," can contain a diverse
array of contaminants including brine, hydrocarbons, heavy
metals, surfactants, and corrosion inhibitors.
Mount and colleagues reported on a series of laboratory
and field studies which were conducted on produced water
from a coal bed methane operation in the Cedar Cove De-
gasification Field of Alabama. The produced waters were
discharged into Little Hurricane Creek. The primary goal
of the studies was to determine the environmental accept-
ability of discharging produced water into this creek.
Toxicity tests were performed on the produced water using
USEPA's fathead minnow and Ceriodaphnia tests; the
cladocerans proved to be the more useful monitoring tool
in these studies. Concurrent with the laboratory toxicity
tests, a series of instream surveys were performed on Little
Hurricane Creek. Based on these data the authors con-
cluded, "Research conducted at Cedar Cove suggests that
laboratory toxicity tests can be used to predict instream
effects of produced water discharge."
-------
Appendix B
Single Species Tests with Individual Chemicals
The following consists of an interpretive summary of
studies in which single species tests were used to assess
the toxicity of a single chemical or combination of a small
number of chemicals and predict effect concentrations on
aquatic ecosystem biological responses.
B.1 Organic Chemicals: Pesticides
B.1.1 Hansen and Garton (1982)
These investigators assessed the ability of single species
toxicity test results to reliably predict the effects of the
insecticide diflubenzuron on complex laboratory stream
communities. The single species tests included five "chronic
tests" with five different species, including a 21-d Daphnia
magna test. The laboratory stream communities were
stocked from a natural source and then exposed to the
pesticide for five months. Effects on the stream
communities were appraised at the functional group level using
biomass and diversity.
For Daphnia, the 21-day LC50 for this pesticide was
0.1 µg/L. Statistically significant effects on invertebrate
shredder, scraper, and collector/gatherer/filterer functional
groups were evident in the mesocosm after 5 to 7 months
exposure at a nominal concentration of 0.1 µg
diflubenzuron/L. The Daphnia toxicity test results
appeared to reliably predict the responses of aquatic
invertebrate communities. However, there is uncertainty in
these data. For example, durations of exposure in the
laboratory and field settings were very different and
mesocosm exposures were not analytically confirmed
(dissipation and degradation usually result in lower than
predicted exposure concentrations). LC50s are not
necessarily the optimal predictor tool; however, if
environmental concentrations of a chemical approach the LC50
level, biological community impairments are probable.
Another confounding factor in this study was that the
control populations declined during the five month course
of the study.
Although there were uncertainties and confounding factors
in this study, the correspondence between laboratory and
field effect concentration supports the hypothesis that labo-
ratory test results are predictive indicators of direct effects
in the environment. Concentrations below the laboratory-
determined LC50 were not tested in the mesocosms, so
it is not possible to know whether field effect levels were
overestimated.
B.1.2 Baughman et al. (1989)
To evaluate the usefulness of laboratory toxicity tests in
predicting fenvalerate (a pyrethroid insecticide) impacts,
Baughman et al. conducted laboratory and field tests with
the grass shrimp (Palaemonetes pugio). Two types of
laboratory tests were conducted: 96-h static-renewal tests
and 6-h pulse dose exposures. The response (endpoint)
compared between laboratory and field was the LC50.
Response of grass shrimp in the field was similar to the
laboratory toxicity tests (i.e., concentrations which were
shown to produce lethality in the laboratory also caused
mortality in field settings). These results indicated that
physical and chemical factors in natural ecosystems did
not appreciably modify the toxicity of fenvalerate.
In this study, laboratory test data were not extrapolated
across species, but rather to the same species in natural
stream conditions. Although not strong support for the
reliability of laboratory single species test results as predictors
of instream biological impacts, this study does show
a correspondence between laboratory and natural ecosystem
effect concentrations.
B.1.3 Clark et al. (1987)
Clark and colleagues scrutinized laboratory toxicity test
results as predictors of effects of fenthion, an
organophosphorus insecticide, on caged animals in field
settings. The laboratory tests were 96-h mortality determinations
on a mysid (M. bahia), the pink shrimp (Penaeus
duorarum), the grass shrimp (Palaemonetes pugio), and the
sheepshead minnow (C. variegatus). The responses used
for comparisons were 24-h and 48-h LC50s. Caged
animal tests and environmental chemical studies
(measurements of fenthion) were executed in a bay and
a pond connected to Santa Rosa Sound, as well as in an
estuarine bay.
Results of this study reveal that the laboratory-derived
LC50s were reasonable predictors of mortality to the same
species in the field, but only when laboratory and field exposure
regimes were similar. The laboratory LC50s were
not effective predictors of sublethal effects. As in many of
the studies summarized above, the findings of this study
indicate that physical and chemical factors in aquatic ecosystems
did not appreciably alter the toxicity of this pesticide.
Caging of animals in the field did not allow for
possible avoidance behavior. The advantages and
limitations of using acute exposure LC50s as predictors
of instream biological responses were mentioned above.
B.1.4 Fairchild et al. (1992)
Population, community and ecosystem level responses to
pulse doses of esfenvalerate, a pyrethroid insecticide,
were studied in experimental aquatic mesocosms. Different
mesocosms were dosed at nominal concentrations of
0, 0.25, 0.67, and 1.71 ug/L esfenvalerate (each concentration
had triplicate mesocosms). The pulse dosings
were 15-minute applications to achieve the nominal concentrations
every two weeks for a total of three months.
Static acute (48-h) toxicity tests with the insecticide were
conducted with D. magna and provided an LC50 of 0.27
ug/L esfenvalerate; neither a LOEC nor a NOEC was
reported. This laboratory effect level was compared to
effect levels seen in the mesocosm portion of the study.
In the mesocosm component of the study, zooplankton and
benthic macroinvertebrate populations were significantly
decreased at the pulse dose treatment of 0.25 ug/L. There
were also shifts in community composition and dominance
at this treatment level. This was the lowest pulse dose
tested, so an NOEC was not established. The effect concentration
in the mesocosm was comparable to the laboratory-generated
48-h LC50 for this pesticide. With the difference
in exposure patterns and durations in the laboratory
(single species) test and multispecies mesocosm studies,
it is remarkable that a mesocosm effect concentration cor-
responded so well with the laboratory test results.
In the mesocosm, 0.67 ug/L esfenvalerate reduced sur-
vival, biomass, and reproductive success of bluegill sun-
fish. The laboratory LC50 for juvenile bluegills exposed to
esfenvalerate for 96 h ranged from 0.42 to 1.35 ug/L
(Mayer and Ellersieck, 1986). This finding indicates that
a laboratory effect concentration translates reliably into a
field effect level. Also inherent in this observation is that
the complex, multivariate conditions in the mesocosm did
not appreciably modify the toxicity (i.e., bioavailability) of
chemicals seen in highly controlled laboratory studies.
Overall, the results of this study imply that single species
toxicity test results can qualitatively predict effect concen-
trations in more complex, multivariate systems. Another
study (Little et al., 1993) with fenvalerate also indicated
that laboratory determined effect concentrations were
reliable predictors of effect concentrations in natural systems.
For fenvalerate, and possibly other pesticides which
have relatively short half-lives and may not exist in aquatic
ecosystems for extended periods, acute toxicity test
endpoints may be reliable predictors of biological
community responses.
B.1.5 Slooff (1985)
A multiple species microcosm toxicity test was conducted
in the Netherlands to determine the NOEC for the herbicide
dichlobenil. An NOEC was also determined for
Daphnia magna in the 21-d short-term estimate of chronic
toxicity.
The microcosm NOEC from a 400 day exposure was
0.3 ug/L. The NOEC, 0.1 ug/L, determined in the Daphnia
test was a qualitatively accurate predictor of the microcosm
no-effect level. In Slooff's review of the literature on
dichlobenil, the NOEC determined from 167 field exposures
of various species was also 0.1 ug/L.
After reviewing other data in the literature and in relation
to these data, Slooff (1985) submits that multiple species
(micro- or mesocosm) toxicity test results are not better
predictors of aquatic ecosystem responses than are single
species toxicity test results. He concludes that the multiple
species test results have many uses, but that, at their cur-
rent stage of development, they do not improve predictions
of ecosystem impairments.
B.1.6 Larsen et al. (1986)
Microcosm test data have been proposed as having more
ecological relevance than laboratory indicator species test
results. Larsen and co-workers compared the predictive
reliability of "surrogate" species and mesocosm toxicity test
results with responses in experimental ponds to the herbicide
atrazine. This study compared the responses of algal
tests, an algal microcosm, and experimental ponds exposed
to similar concentrations of the herbicide. Eight different
algal species were included in the indicator species tests.
The endpoints used for comparisons in all three systems
were EC50s (the chemical concentration at which 50% of
the test population exhibits a response).
According to these investigators, the basic similarity
among the EC50 values across test systems suggests that
results from a combination of single species tests or from
the mesocosm provided a reasonable estimate of the
concentration of atrazine that produced similar effects on
the experimental pond. Both the lowest and highest EC50
came from single species tests. These authors conclude
that, "because broad ranges in species sensitivities occur,
use of only a few test species might not offer sufficient
environmental protection." Improvement in predictive
ability occurs when several species are used as test
organisms. Although this study provided valuable
information, the EC50 endpoint may not be the most
realistic response measure to compare tests. One would
predict that a concentration of chemical(s) which is high
enough to affect 50% of a test population, has a high
potential of evoking significant biological community re-
sponses.
Single species toxicity tests, microcosm, and outdoor ex-
perimental pond exposures have been employed by other
investigators (Stay et al., 1985; de Noyelles and Kettle,
1985) to ascertain the effects of atrazine on algal primary
production. Both the single species and microcosm tests
were predictive of atrazine concentrations which signifi-
cantly reduced production in the outdoor ponds. However,
recovery from atrazine stress was not predicted by the
laboratory tests. In the single species and microcosm tests
there was only limited recovery whereas the pond commu-
nities recovered more quickly because sensitive algal spe-
cies were replaced by algal species more resistant to
atrazine. The ecological significance (over time) of this
shift to more chemically resistant assemblages was not
discussed and is unknown. Composition could change at
all trophic levels due to the shift in algal species. All algal
species were affected by atrazine in all test regimes, but
only the pond study revealed assemblage shifts.
B.1.7 Crossland (1984)
Studies with the insecticide methyl parathion revealed that
laboratory single species toxicity test results underesti-
mated the secondary effects (indirect effects that are not
represented by direct action of a chemical on an individual
species, but rather result from interrelationships among
components of a biological community) of this pesticide in
outdoor ponds. The concentrations of methyl parathion
eliciting toxicity, and thus decreasing populations, in zoo-
plankton and benthic macroinvertebrates in the outdoor
ponds were reliably predicted by the laboratory single spe-
cies toxicity test results.
The decreased populations of mayfly larvae and daphnids,
caused by methyl parathion, secondarily resulted in blooms
of filamentous algae. Death and decay of the algae in turn
decreased dissolved oxygen resulting in death of fish.
Loss of invertebrate food items also caused reduced fish
populations and smaller sized fish.
B.1.8 Stephensen and Kane (1984)
The fate and biological effects of the pesticides methyl
parathion and linuron in outdoor ponds were studied. The
relative sensitivities (response per concentration) were
similar in both the laboratory and ponds for both pesticides.
Furthermore, the response concentrations determined for
Daphnia magna in the laboratory correlated closely with
effect concentrations in the outdoor pond. The authors
concluded that biotic and abiotic factors existing in ponds
did not alter the toxicity (i.e., bioavailability compared to the
laboratory tests) of these two pesticides.
B.1.9 Van Wijngaarden et al. (1996)
Using the insecticide Dursban 4E (active ingredient
chlorpyrifos, an organophosphorus pesticide) these inves-
tigators compared the results of laboratory indicator spe-
cies toxicity tests with laboratory tests on indigenous spe-
cies, as well as with data from outdoor mesocosm tests.
Mesocosms were sprayed once with the intent of achieving
nominal chlorpyrifos concentrations of 0.1, 0.9, 6, and
44 ug/L. Analytical measurements of chlorpyrifos were
used to determine exposure and effect concentrations.
Effects in the mesocosms were assessed by sampling
zooplankton and macroinvertebrates; in addition, in situ
cage experiments were performed with several species.
The indicator species, D. magna, was almost as sensitive
to chlorpyrifos as the indigenous species. The difference
between the laboratory EC50s for the daphnid (1.0 ug/L)
and that for the most sensitive indigenous species,
Gammarus pulex (0.8 ug/L), was small, suggesting that the
indicator species was not more sensitive to the insecticide.
Effect concentrations (for nine invertebrate species) deter-
mined in single species laboratory tests were compared to
effect concentrations derived in the mesocosm exposures.
The authors concluded that laboratory single species EC
values were reliable estimators of mesocosm ECs, differ-
ing by less than a factor of three for the seven species
studied. Essentially the same conclusions were reached
when comparing ECs from the laboratory toxicity tests with
those from the cage experiments. These data indicate that
chlorpyrifos bioavailability was not significantly reduced
under the mesocosm conditions.
Although there was considerable spatial and temporal variation
of chlorpyrifos concentrations in the mesocosm exposures,
ECs determined under those conditions were similar
to the laboratory ECs obtained under constant exposure
regimes (i.e., variable and constant exposure regimes led
to comparable effects). According to these authors,
laboratory single species toxicity test results can be used
to estimate direct effects in field populations.
In a subsequent publication, van den Brink et al. (1996)
reported that recovery of invertebrate populations after the
single application of chlorpyrifos required three to six
months. The investigators also suggested that "safe" con-
centrations determined in short-term single species labora-
tory toxicity tests are sufficient to protect invertebrate com-
munities.
Several other investigators (Eaton et al., 1985; Brock et al.,
1992; Leeuwangh et al., 1994) also concluded that the
direct effects of chlorpyrifos on aquatic invertebrate com-
munities can be reliably predicted on the basis of
laboratory single species toxicity data; that is, population
responses observed in microcosms and mesocosms were
consistent with laboratory single species toxicity test
results.
B.1.10 Kersting and van Wijngaarden (1982)
The effects of chlorpyrifos were studied in a laboratory
microcosm system. Daphnia magna was the herbivore
component in the microcosm; this species was also the
subject of a laboratory 48-h lethality test. The microcosms
received a single application of chlorpyrifos, and the re-
sponses were followed for 130 days.
Chlorpyrifos concentration in the Daphnia component of
the microcosm was 0.5 ug/L on day 1, decreasing to
0.2 ug/L by day 7. The laboratory 48-hour LC25 for Daphnia
was 0.4 ug/L. Although pesticide concentration
decreased rapidly to below the LC25, Daphnia populations
in the exposed microcosms decreased 36% and 42% in
the two replicates. Populations recovered within two
weeks. The ecological significance of the magnitude and
duration of population declines is unknown.
Arguably, the laboratory LC25 was an effective predictor
of the population decline in the microcosm. However,
chlorpyrifos treatment resulted in other biotic and abiotic
changes in the microcosms. The laboratory Daphnia tests
underestimated these other, "secondary" microcosm effects.
B.1.11 Siefert et al. (1989)
The effects of chlorpyrifos in natural pond enclosures were
investigated. The pesticide was applied once to sets of
replicate ponds to achieve three different test concentra-
tions; pond concentrations of the pesticide were
analytically monitored. Phytoplankton, periphyton, zoo-
plankton, benthic macroinvertebrates, and fish were
sampled periodically up to 30 d post-application. Limitations
of this study include the absence of normal exchange
between the enclosures and the remainder of the pond;
also chlorpyrifos adsorbed to the wall material of the
enclosures (this problem relates mostly to environmental
fate, rate of loss, but also to a decrease in potential
exposure); this difficulty was partially offset by the
monitoring of pesticide concentrations in the water column.
The targeted concentrations were 20, 5, and 0.5 ug/L.
Chlorpyrifos concentrations decreased rapidly after application
(see above) to 10, 1, and 0.2 ug/L by day two post-application.
Cladocerans were the most sensitive of the
zooplankton species, with all five identified species
showing dramatic and statistically significant population
declines at the lowest chlorpyrifos concentrations.
Chironomids were the most sensitive benthic
macroinvertebrates, with 9 of 10 of the identified species
responding to the lowest chlorpyrifos treatment with sta-
tistically significant population declines.
Laboratory determined acute toxicity LC50s (54 species)
from the literature were compared to the pond effect con-
centrations. In general, the single species LC50 values
were higher than the LOEC determined in the pond study
(i.e., LC50s underestimated biological community
responses); LC50s from Daphnia and Gammarus were the
most accurate forecasters of the pond LOEC. Significant
reductions in growth rates in larval fish, not predicted by
direct effects of chlorpyrifos, were also noted in this study.
The authors attributed these secondary effects to
chlorpyrifos-caused declines of invertebrate forage
organisms.
B.2 Additional Organic Chemicals
B.2.1 Cooper and Stout (1985)
The effects of p-cresol on the biota in outdoor experimental
stream channels (analogs of natural streams) were com-
pared to the results of single species tests with this chemi-
cal. Three hypotheses were tested:
1) The transfer of laboratory acute toxicity test results
to field situations is possible without serious
distortion.
2) Data from single species tests with p-cresol will
yield results similar to those of multiple species, community
level, tests.
3) Pulsed exposures with short time intervals between
events will produce the same ecological responses
as continuous exposure with the same integral
of exposure (integral of concentration x time).
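The "integral of exposure" in the third hypothesis can be illustrated with a minimal sketch. The regimes below are hypothetical, piecewise-constant examples (they are not the dosing schedules used by Cooper and Stout), and the function name exposure_integral is an assumption introduced only for this illustration. The sketch shows that a pulsed and a continuous regime can share the same concentration x time integral while differing in peak concentration.

```python
def exposure_integral(segments):
    """Sum concentration x duration over piecewise-constant segments.

    Each segment is a (duration_hours, concentration) pair; the units of
    concentration are arbitrary as long as they are consistent.
    """
    return sum(duration_h * conc for duration_h, conc in segments)


# Hypothetical continuous regime: 24 h at a constant 1.0 concentration unit.
continuous = [(24.0, 1.0)]

# Hypothetical pulsed regime: three cycles of a 2-h pulse at 4.0 units
# followed by 6 h with no toxicant (24 h in total).
pulsed = [(2.0, 4.0), (6.0, 0.0)] * 3

continuous_integral = exposure_integral(continuous)
pulsed_integral = exposure_integral(pulsed)

# Peak concentrations differ even though the integrals match.
continuous_peak = max(conc for _, conc in continuous)
pulsed_peak = max(conc for _, conc in pulsed)

print(continuous_integral, pulsed_integral, continuous_peak, pulsed_peak)
```

Under these assumed regimes the two integrals are equal, so any difference in biological response would have to come from the exposure pattern rather than from the total dose.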
In regard to the first hypothesis, data from this study
showed that the acute toxicity tests with fathead minnows,
largemouth bass, smallmouth bass, damselfly larvae,
and amphipods estimated survivorship rates consistent
with results of the experimental stream experiments.
These investigators also concluded that results of the sin-
gle species tests were effective predictors of community
level responses. The third hypothesis was found to be
untrue in that the pulse exposure (with same integral) pro-
duced greater impacts than continuous exposure. These
pulse-response results are useful in interpreting data col-
lected in agricultural settings where aquatic communities
may be exposed to pulses of pesticides.
B.2.2 Dorn et al. (1991)
These researchers undertook a project to estimate the
environmental effects of a chloroether fraction from a chemical
plant effluent. The chemical plant effluent had been
shown to be toxic to sheepshead minnows (Cyprinodon
variegatus) and a mysid (Mysidopsis bahia). Toxicity
identification evaluation (TIE) procedures demonstrated
that the primary cause of toxicity was a mixture of
pentachloroethers.
To gauge effect concentrations of the chloroether fraction,
laboratory toxicity tests were executed with Daphnia
magna, fathead minnow larvae, and Mysidopsis bahia.
Effect concentrations for the chloroether fraction also were
assessed in outdoor artificial streams.
The most sensitive indicator species was the water flea;
the NOEC for this species was 1.0 mg/L. NOECs in the
outdoor streams were 0.44 and 0.26 mg chloroethers/L for
Gammarus and rainbow trout, respectively. The laboratory
effect concentrations for the chloroethers in single species
tests were reliable qualitative predictors of effect
concentrations in the outdoor stream experiments. The
outdoor stream communities were somewhat more sensi-
tive to the toxicants than indicated by the laboratory single
species tests.
In a follow-up study (Crossland et al., 1992), a range of
chloroether fraction concentrations was tested in outdoor
artificial streams. Exposure was for 28 days. Four differ-
ent concentrations were tested; there were no replicate
streams except for the control treatment. Three mesh
bags of macroinvertebrates were introduced into each
stream. However, the number of benthic
macroinvertebrates of a given species was not equivalent
in the different treatment groups at the time of pretreatment
sampling; thus statistical comparisons among the
treatments were not possible. Furthermore, there was
considerable variation among "replicates" within each
stream. Feeding rates of the amphipod, Gammarus pulex,
also were assessed in the artificial streams. These and
other factors render interpretation of macroinvertebrate
data difficult.
Gammarus numbers were significantly reduced at a
chloroether concentration of 0.86 mg/L, but not at
0.44 mg/L (the NOEC). Invertebrate drift (possibly indicating
an unhealthy condition) also appeared to be increased
at chloroether concentrations of 0.44 mg/L and above. In
laboratory 21-d Daphnia magna tests, the chloroether
NOEC was 1.0 mg/L and the LOEC was between 1 and
2.5 mg/L. Although comparable effect concentrations were
seen in the laboratory single species test and the artificial
stream data, the outdoor populations were somewhat more
sensitive to the chloroethers.
B.2.3 Fairchild et al. (1993)
Laboratory and field studies were conducted with linear
alkylbenzene sulfonate (LAS, an anionic surfactant), by
Fairchild et al., to evaluate the use of laboratory-generated
NOECs for protecting aquatic ecosystems. Laboratory
toxicity tests included the 7-d fathead minnow test and a
7-d test with the freshwater amphipod, Hyalella azteca. A
series of these tests with exposures to a range of LAS
concentrations resulted in a laboratory estimate of a
NOEC. This laboratory-predicted NOEC was then
tested in the field with a 45-d exposure in outdoor experi-
mental streams (three replicates).
In these experimental streams, at exposures to LAS concentrations
equivalent to the laboratory NOECs, no biological
community impairments were seen as gauged by surveys
of benthic macroinvertebrates, periphyton growth, detrital
processing, and fathead minnow populations. The authors
concluded that their results indicated that the laboratory-generated
NOEC for LAS predicted environmentally protective
concentrations. Results of this study do not demonstrate
that concentrations above the laboratory NOEC
would have engendered impacts in the outdoor
experimental streams, but do suggest that the single spe-
cies toxicity test results can be useful tools in predicting
environmentally "safe" concentrations.
B.2.4 Boyle et al. (1985)
These investigators compared the responses (survival and
growth) of bluegill sunfish and largemouth bass exposed
to fluorene (a polynuclear aromatic hydrocarbon) in labora-
tory 30-d partial life cycle tests to the responses of the
same species exposed in outdoor ponds.
The laboratory toxicity test results underestimated the responses
of these fish species in the outdoor ponds; that is,
fish in the experimental ponds were more
sensitive to fluorene (responses occurred at lower concentrations
than in the laboratory tests). In contrast, laboratory
toxicity tests overestimated responses of zooplankton,
phytoplankton, and some insect populations to fluorene.
B.2.5 Giddings and Franco (1985)
The effects of a synthetic coal-derived crude oil were
assessed in outdoor ponds and indoor microcosms. The
results of these tests were compared with data from labo-
ratory single species toxicity tests. Response concentra-
tions were similar in the microcosms and pond studies. A
"safe" exposure concentration for this organic compound
was derived from the pond study. Without an application
factor, the USEPA final acute and chronic values were
higher than this "safe" concentration, whereas the LOEC
of a 28-d D. magna laboratory test provided an effective
prediction of the "safe" concentration.
B.2.6 Crossland and Wolff (1985)
In this study [97] the effects of pentachlorophenol (PCP)
were examined in outdoor experimental ponds. PCP was
repeatedly applied to the subsurface water of three ponds
with the aim of maintaining an average concentration of 50
to 100 ug/L for 30 days. There were also replicate control
ponds. Actual pond concentrations of PCP averaged 19
to 21 ug/L (days 1 through 14) and 60 to 69 ug/L (days 15
through 43). No statistically significant effects were observed
on algal, zooplankton, benthic macroinvertebrate,
or fish populations. It should be noted, however, that there
was considerable within and between replicate pond variability.
The three lowest laboratory-determined PCP LC50
values gleaned from the literature were 52 ug/L (96-h rain-
bow trout), 100 ug/L (8-d snail egg production and
viability), and 130 ug/L (16-day snail egg viability). Since
most of these effect concentrations from the most sensitive
species in the database were greater than PCP
concentrations in the ponds, impacts would not be
predicted. Based on these observations, the authors
contended that a combination of single species toxicity test
results can effectively predict environmentally "safe"
concentrations for a chemical. The variability of the
treatment concentrations as well as concentration variation
within and between replicate ponds, in addition to the fact
that a pond effect concentration was not established,
renders this study inconclusive regarding the accuracy of
single species test results in predicting environmental
impacts.
B.2.7 Giddings et al. (1984)
These investigators (Giddings et al., 1984; Franco et al.,
1984) examined the impacts of phenolic compounds on
biological communities in outdoor ponds. The phenolic
compounds were administered to replicate ponds daily for
56 days; five different treatment levels were compared to
control ponds. A laboratory-generated 28-d test LOEC for
Daphnia magna was a relatively good forecaster of a
phenol effect concentration in the experimental ponds.
However, the most sensitive indices of biological commu-
nity structure/function were affected at phenol concentra-
tions lower than the laboratory chronic LOEC. A 48-h test
LC50 for Daphnia was not a good predictor of pond effects
because much lower concentrations of the phenolic com-
pounds impacted pond communities.
B.2.8 Nimmo et al. (1989)
Acute (96 h) toxicity tests with fathead minnows, Johnny
darters (Etheostoma nigrum), white suckers (Catostomus
commersoni), as well as acute and 7-d C. dubia toxicity
tests were conducted by Nimmo et al. (1989) to evaluate
whether river water (Vrain River in Colorado) ameliorated
toxicity of ammonia compared to laboratory tests in which
well water was used.
For most of the test species, ammonia LC50s were equivalent
in the river water compared to the laboratory well
water. These data illustrated that there was not an amelio-
ration of ammonia by river water; that is, the laboratory test
results did not overestimate toxicity measured instream.
Related to the above observation, laboratory single
species toxicity tests with polychlorinated biphenyls
overestimated the concentration demonstrated in field
studies to decrease diversity in invertebrate populations,
that is the field populations were more sensitive (Roberts
et al., 1978).
B.2.9 Adams et al. (1983)
The toxicities of a commercial phosphate ester product
(PEP) determined in outdoor tanks and in the laboratory
were compared. The test organisms were D. magna and
fathead minnows. Five concentrations of the PEP were
tested in the outdoor tanks, without replicates; the five concentrations
were tested in a series of tanks with and without
sediment. PEP concentrations were analytically monitored
and exposure concentrations maintained for two
months. Laboratory toxicity tests consisted of 30-d fathead
minnow and 21-d D. magna flow-through tests.
LOECs for fathead survival in the lab, outdoor no sediment
tank, and outdoor sediment tank were 410, 826, and
545 ug/L, respectively. The laboratory tests overestimated
the toxicity of PEP, but estimates were within an order of
magnitude of one another. In the Daphnia tests, the repro-
duction LOECs for the lab, outdoor no sediment tank, and
outdoor sediment tank were 100, 136, and 226 ug/L, re-
spectively. The laboratory water flea tests gave a fairly
reliable qualitative estimate of the PEP LOEC.
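The "within an order of magnitude" comparison above can be checked with a short calculation. The following is a minimal sketch that uses only the LOEC values quoted in this summary; the dictionary layout and the helper name field_to_lab_ratios are assumptions made for the illustration, not part of the Adams et al. analysis.

```python
import math

# LOECs (ug/L) as quoted in the summary of Adams et al. (1983).
loecs_ug_per_l = {
    "fathead minnow survival": {
        "lab": 410,
        "outdoor, no sediment": 826,
        "outdoor, sediment": 545,
    },
    "Daphnia reproduction": {
        "lab": 100,
        "outdoor, no sediment": 136,
        "outdoor, sediment": 226,
    },
}


def field_to_lab_ratios(values):
    """Return each outdoor LOEC divided by the laboratory LOEC."""
    lab = values["lab"]
    return {name: loec / lab for name, loec in values.items() if name != "lab"}


ratios = {
    endpoint: field_to_lab_ratios(values)
    for endpoint, values in loecs_ug_per_l.items()
}

# A ratio above 1 means the laboratory LOEC was the lower (more sensitive) one.
# "Within an order of magnitude" is taken here as |log10(ratio)| < 1.
within_order_of_magnitude = all(
    abs(math.log10(ratio)) < 1
    for per_endpoint in ratios.values()
    for ratio in per_endpoint.values()
)

for endpoint, per_endpoint in ratios.items():
    for system, ratio in per_endpoint.items():
        print(f"{endpoint}, {system}: outdoor/lab = {ratio:.2f}")
print("All within an order of magnitude:", within_order_of_magnitude)
```

Every ratio is greater than 1, which is consistent with the statement that the laboratory tests overestimated PEP toxicity relative to the outdoor tanks.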
A major limitation of this study was that the outdoor tanks
were not ecosystem surrogates; they did not contain other
biological communities, but only the test species. Small
sample sizes and the lack of replication were among the
other factors which compromise the reliability of data gen-
erated in this study.
B.3 Metals
B.3.1 Canfield et al. (1994)
This group evaluated the potential impacts of past mining
activities on the Clark Fork River (Montana) aquatic eco-
system using a benthic invertebrate community assess-
ment, chemical analyses on sediment, and laboratory
whole-sediment toxicity tests with an amphipod, Hyalella
azteca, a midge, Chironomus riparius, a cladoceran, Daph-
nia magna, and larval rainbow trout, Oncorhynchus
mykiss. The study included six sites in the Clark Fork
River watershed, one control/reference station on an
uncontaminated tributary and five sites downstream of past
mining areas.
Sediment concentrations of metals (especially copper) were
high at Clark Fork sites 1 through 4. A metals con-
centration gradient from the most upstream sites to the
most downstream site was observed. The control site on
the tributary had the lowest sediment metal concentration.
The authors cautioned that there were many confounding
factors (including a possible sampling bias) influencing the
benthic invertebrate data which rendered interpretation
difficult. The authors pointed out that many chironomids
are tolerant of degraded conditions. Furthermore, the
percentage of the Chironomidae community comprised of
Tanypodinae (considered to be relatively pollution tolerant)
was much higher at sites 1, 2, and 3 (upstream sites) than
at the control and downstream sites.
The amphipod tests revealed a gradient of toxicity, being
highest at the most upstream site and lowest at the control
site. Therefore, the results of the laboratory single species
toxicity tests were consistent with sediment metal concen-
trations and the distribution of chironomids. The investiga-
tors concluded that chemical analyses, laboratory toxicity
tests, and aquatic community evaluations all provided evi-
dence of metal-induced degradation to benthic populations
in the river.
B.3.2 Burton et al. (1987)
The Clark Fork River was also the subject of another
study. This group evaluated a battery of aquatic toxicity
tests including the 7-d Ceriodaphnia test and 12 microbial
enzyme activity assays. Results of the laboratory toxicity
tests were compared to instream parameters including
diatom diversity and density, as well as metal concentrations
(in both water column and sediment). Data were
collected at 13 sites along the river and one control site.
As in the previous study (Canfield et al., 1994), sites were
on a downstream gradient below an area with past mining
activities. Both Ceriodaphnia and microbial tests were
conducted in the laboratory with water samples from the
river sites.
Ceriodaphnia survival (r = 0.94 and 0.93) and neonate
production (r = 0.93 and 0.92) showed statistically significant
(p<0.001) positive correlations with diatom density
and diversity, respectively. Survival (r = -0.92 and -0.94)
and neonate production (r = -0.92 and -0.94) were negatively
correlated with water column copper and zinc, respectively.
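The correlation coefficients reported above are of the kind produced by a standard Pearson calculation. The sketch below is illustrative only: the per-site values are invented for the example and are NOT data from Burton et al. (1987), and the function name pearson_r is an assumption. It shows how a laboratory endpoint can correlate positively with an instream biological measure and negatively with a metal concentration across sites.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-site values, ordered from least to most contaminated site.
survival_pct = [100, 90, 70, 40, 10]    # laboratory Ceriodaphnia survival
diatom_density = [50, 45, 38, 20, 8]    # instream diatom density (arbitrary units)
copper_ug_per_l = [2, 5, 12, 30, 55]    # water column copper

r_survival_vs_diatoms = pearson_r(survival_pct, diatom_density)
r_survival_vs_copper = pearson_r(survival_pct, copper_ug_per_l)

print(f"survival vs diatom density: r = {r_survival_vs_diatoms:.2f}")
print(f"survival vs copper:         r = {r_survival_vs_copper:.2f}")
```

A strong correlation of this kind indicates that the laboratory endpoint tracks the instream measure across sites; it does not by itself establish the cause of the instream pattern.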
These data suggest the Ceriodaphnia toxicity test results
were effective predictors of instream metal contamination
and of diatom population variations. Laboratory microbial
enzyme assays for galactosidase, glucosidase, and pro-
tease activities also showed statistically significant nega-
tive correlations with diatom populations in the river (i.e.,
low diatom diversity was associated with high enzyme
activity).
B.3.3 Clements and Kiffney (1994)
These researchers attempted to assess impacts of metals
from a mining site discharging into the Arkansas River (in
Colorado). Three sites were selected: one site upstream
of the mining operation discharge and two downstream of
the discharge. Whether caution was taken in the selection
of these sites to assure similar substrates, as well as other
physical and chemical conditions is unclear. The intent
was that the upstream site would serve as a reference
point. The second site was 6 km downstream and the third
site was 45 km downstream of the mining operation input;
the third was conceived as a site to represent biological
community recovery. Unfortunately, two creeks
discharged into the Arkansas River below the input from
the mining operation, confounding interpretation of data
collected at site 2. Furthermore, site 3 was below a town
and no attempt was made to account for toxic inputs from
the town or other sources (only metal analyses were
performed on water samples).
Water samples collected at each site were screened with
the 7-d Ceriodaphnia test, neonate production being the
endpoint. Benthic macroinvertebrate bioaccumulation of
metals was assessed. Benthic invertebrate community
structure was surveyed by determining the number of taxa
and the number of individuals in each taxon. All assessments
were conducted in fall and spring, except that toxicity
testing at site 2 was performed only with a spring water
sample.
In the fall water samples, zinc concentrations were highest
at site 1, upstream of the mine input. Water sample toxicity
was highest at site 3 and lowest at site 2 (just below input
from the mining operation) during the fall. The pattern of
toxicity in these fall water samples did not correspond to
heavy metal concentrations in the same samples. The
causes of toxicity in the fall water samples are unknown
because Toxicity Identification Evaluations (TIEs) or or-
ganic chemical analyses were not performed.
In the spring, all heavy metal concentrations were lowest
in water samples from the upstream "control" site and high-
est at site 2, immediately below the mining operation input.
Cadmium, copper, and zinc concentrations at site 2 were
5, 8, and 5.5 fold higher, respectively, than at the reference
site. The Ceriodaphnia test was conducted only with a
water sample from site 2 and the results mirrored the in-
tense metal contamination.
Neither the number of invertebrate taxa nor the number of
individuals within taxa showed the site nearest to the
mining operation input to be the most impacted. Only the
bioaccumulation data and changes in the composition of
dominant macroinvertebrate groups suggested that site 2
was the most impaired compared to the other two sites.
Interpretation of data collected in this study is difficult for
several reasons. It is not clear how carefully the three sites
were matched in terms of substrate and other physical/chemical
factors. In the spring, when the other data
were more understandable, toxicity testing was incomplete
and organic chemical analyses were not performed.
Stream flow and rainfall conditions were not included in the
manuscript, and these factors could influence the
measurements and interpretation of results. The authors
counsel that the different approaches used in their study
provided divergent information regarding metal impacts,
so they recommend an integrated approach to assessing
impacts on streams. While an integrated approach to
assessing impacts on aquatic ecosystems should be
supported, design of this study was not optimal and, thus,
the results were inconclusive.
B.3.4 Niederlehner et al. (1985)
Several researchers have counseled that single species
toxicity tests lack many important interactive characteristics
of multivariate, complex ecosystems and, therefore, may
not be accurate predictors of biological community
responses. Niederlehner et al. (1985) stated that "A
multispecies or microcosm test incorporate some of the
emergent properties of communities of ecosystems and
serve as an intermediate between the simplicity of the
single species toxicity tests and the unreproducible com-
plexity of the environment."
These researchers scrutinized the responses of protozoan
communities to cadmium exposures. Effects of cadmium
were evaluated by observing colonization of the
protozoans in polyurethane foam (PF) islands for 28 d.
Exposures were in duplicate tubs using five different
cadmium concentrations.
From the experiments, NOECs for protozoan colonization
impairment ranged from 0.8 to 9.5 ug Cd/L. In the ambient
water quality criteria document for cadmium (USEPA,
1984) chronic values (ChV) adjusted for hardness range
from 0.14 (Daphnia magna) to 15.04 ug/L (fathead
minnow) (selecting values from studies with hardness
equivalent to the range seen in the microcosm study).
Cladoceran chronic values range from 3.9 ug Cd/L for
Ceriodaphnia reticulata to 0.14 ug Cd/L for D. magna.
Overall, the data in the Niederlehner et al. (1985) report
suggest that the laboratory single species toxicity test
results underestimate field effects.
It is not evident that the microcosm results were a better
predictor of a safe cadmium concentration since the
chronic values from 15 of the 16 species listed in USEPA's
criterion document were within the range of NOECs noted
in the microcosm study. Arguably, the conclusion that the
protozoan microcosm "tests were comparable to traditional
single species tests in time and expense required, but had
the advantages of utilizing indigenous organisms and
including processes characteristic of communities, but not
single species." is highly questionable. The microcosm
tests were 28 d exposures.
B.3.5 Moore and Winner (1989)
These investigators conducted a study in outdoor ponds
in Ohio to ascertain the effects of various concentrations
of copper on zooplankton and benthic macroinvertebrates.
Laboratory 7-d Ceriodaphnia toxicity tests were conducted
to evaluate the ability to predict effect levels of copper on
pond invertebrate communities.
The results of the Ceriodaphnia tests predicted the effects
of copper on pond populations of Daphnia ambigua, but
underestimated the impacts of copper on other important
species, such as rotifers, copepods, mayfly juveniles, and
chironomids.
B.3.6 Geckler et al. (1976)
The results of laboratory chronic toxicity tests in which
fathead minnows, green sunfish, and longear sunfish were
exposed, in separate tests (i.e., not a multiple species
test), to various concentrations of copper were compared
to responses of fish in a natural stream. Effects were seen
at a somewhat lower copper concentration in the stream
than predicted by the laboratory toxicity tests. The authors
concluded that, "Agreement between the predictions from
laboratory toxicity tests and the observed field effects is
surprisingly close considering the measurement errors
involved." Similarly, laboratory toxicity test results provided
reasonable estimates of the metal concentrations which
impacted crustaceans inhabiting tundra ponds (Havas and
Hutchinson, 1982).
B.3.7 Giesy et al. (1979)
Giesy et al. (1979) studied the effects of different cadmium
concentrations in outdoor experimental stream channels.
The results show that single species toxicity test results do
not predict secondary effects (cf., Crossland above) in
aquatic ecosystems. The primary direct effect of cadmium
in these channels was on crayfish; however the direct ef-
fect of cadmium on crayfish could have been measured in
the lab. In this mesocosm study, the crayfish was a "key-
stone" species. The decrease in crayfish population
greatly influenced community structure including
macrophytes, insects, and clams. These secondary effects
were not predicted by laboratory toxicity tests, which therefore
underestimated biological community impacts.
B.3.8 Marshall (1978)
Marshall (1978) compared the short-term (7 to 9 d) toxicity
of cadmium to laboratory and natural populations of Daph-
nia galeata in Lake Michigan. As well as controls, there
were four different exposure concentrations for both the
laboratory and field populations. Results of this investiga-
tion indicated that the characteristics of Lake Michigan
water did not appreciably alter the responses of Daphnia
to cadmium. Furthermore, responses to cadmium were
equivalent in the laboratory and in the lake experiments.
This is in contrast to the study by Sherman et al. (1987)
which demonstrated that laboratory toxicity test results with
cadmium on fathead minnows could not be used to extrap-
olate to field situations unless hardness and pH in the labo-
ratory tests are equivalent. In general, laboratory toxicity
test results underestimated field effects of cadmium.
B.4 Miscellaneous
B.4.1 Boelter et al. (1992)
Ambient water samples from streams receiving discharges
of coproduced brine (water that is extracted along with
petroleum products from underground deposits) from an
oil field in Wyoming were collected and tested for toxicity
(Boelter et al., 1992). The 7-d Ceriodaphnia test was one
of the testing procedures.
Exposure to water samples collected downstream, but not
upstream, of the oil field discharges significantly reduced
Ceriodaphnia survival and neonate production. Application
of TIE procedures to toxic samples signified that toxicity
could not be attributed to nonpolar organics, heavy metals,
or hydrogen sulfide. TIE results along with analytical
chemistry data established that the cause of toxicity was
sodium, potassium, bicarbonate, and carbonate ions.
Concentrations of these ions were sufficiently high to be
toxic to many aquatic organisms.
This study is one of many studies which illustrate that the
Ceriodaphnia test in combination with TIE and analytical
chemistry procedures has effectively identified causes
and sources of toxicity in surface waters, storm waters, and
effluents.
B.4.2 Gonzalez and Frost (1994)
The responses of two rotifer species, Keratella cochlea
and K. taurocephala, to low pH were compared in laboratory
toxicity tests and in a natural lake (Little Rock Lake in
Wisconsin). This lake, formed by seepage, consists of two
basins which were separated by a vinyl curtain. One of the
basins was acidified over time, whereas the other was not
modified. Populations of the two rotifer species in the two
basins were compared through time. Short term (30-d to
96-h) laboratory toxicity tests were conducted with each of
the species using water samples from the two basins of the
lake.
The authors concluded that the laboratory tests were not
predictive of results obtained in the lake component of the
study and recommended caution when extending results
from laboratory studies to natural ecosystems. More spe-
cifically, the authors suggested that laboratory tests did not
explain the population increase of K. taurocephala in the
acidified basin. However, the laboratory tests did reveal
that K. cochlea is very sensitive to low pHs, whereas K.
taurocephala was much less sensitive. K. cochlea essentially
disappeared from the acidified basin while the
population of K. taurocephala in that basin increased.
Thus, field observations were not necessarily at odds with
the laboratory toxicity tests. Furthermore, the population
of K. taurocephala in the reference basin remained very
low throughout the study. The K. taurocephala population
increase in the acidified basin appeared to be due to a
reduction of predators, this reduction being caused by the
low pH. The laboratory tests with the rotifers would not
predict effects on predators.
Other aspects of this study complicate interpretation of the
data and acceptance of the authors' conclusions. These
factors include the absence of replication in the field com-
ponent of the study and differences in the two basins. The
acidified basin underwent thermal stratification and
became anoxic, whereas the reference basin did not.
B.4.3 Other Studies
Other investigators (Hitchock, 1965; Eisle and Hartung,
1976; Weiss, 1976; Cairns et al., 1982; Crossland and
Hillaby, 1985) have examined the correspondence of labo-
ratory indicator species toxicity test results and biological
community responses; the comparisons generally support
a good qualitative adequacy of the single species test re-
sults as predictors of instream responses.
Some studies (Carlson et al., 1986; Nimmo et al., 1990)
were not specifically designed for examining the reliability
of the single species test results in predicting aquatic eco-
system responses, but provided qualitative indications of
a good correspondence.
-------
Appendix C
Single Species Tests with Ocean Water or Sediment
C.1 Swartz et al. (1994)
Sediment toxicity, as assessed with the amphipod,
Eohaustorius estuarius, sediment chemical analyses, and
the abundance of benthic amphipods were examined
along a gradient in the Lauritzen Channel and adjacent
areas of Richmond Harbor, California. Dieldrin and DDT
were formulated at a facility on Lauritzen Channel from
1945 to 1966.
Objectives included: 1) Examination of the relationship
between sediment contamination by DDT and dieldrin,
sediment toxicity to Eohaustorius, and the field abundance
of amphipods at nine sites in the Lauritzen
Channel/Richmond Harbor area; 2) Identification of the
lowest DDT and dieldrin concentrations associated with
effects on amphipod survival in laboratory toxicity tests and
effects on abundance of amphipods in the field; and 3)
evaluation of the relative contributions of DDT, dieldrin,
PAHs, PCBs, and metals to sediment toxicity, and on
amphipod abundance in the study area.
Sediment contamination by both dieldrin and the sum of
DDT and its metabolites was positively correlated with
sediment toxicity and negatively correlated with the abun-
dance of amphipods in the study area; DDT (plus its me-
tabolites) was the dominant toxicological factor. These
researchers concluded, "Correlations between toxicity,
contamination, and biology indicate that sediment toxicity
to Eohaustorius estuarius, Rhepoxynius abronius, or
Hyalella azteca in laboratory tests provide reliable evidence
of biologically adverse sediment contamination in
the field."
In five other studies (Swartz et al., 1985, 1986, 1991;
Ferraro et al., 1991; Hake et al., 1994) statistically significant
positive correlations were found between the sum of
DDT plus its metabolites in sediment and mortality of am-
phipods in laboratory sediment toxicity tests. Statistically
significant negative correlations were seen between sedi-
ment toxicity and amphipod abundance in field sediment
samples (i.e., high sediment toxicity related to low abun-
dance). Thus, the weight-of-evidence from these studies
suggests that significant toxicity in laboratory sediment
toxicity tests provides a reliable qualitative prediction of
benthic biological community responses.
C.2 Chapman et al. (1987)
These researchers conducted an investigation in the San
Francisco Bay area which involved measurements of sediment
contamination through chemical analyses; toxicity through
sediment toxicity tests (mortality of the amphipod,
Rhepoxynius abronius; larval development of the mussel,
Mytilus edulis; behavior of a clam, Macoma balthica; and
reproduction of the copepod, Tigriopus californicus); and
benthic infaunal community structure through taxonomic
analyses of macroinfauna.
Sediment samples were collected at three stations at each
of three sites in the San Francisco Bay: Islais Waterway,
Oakland, and San Pablo Bay. Chemical analyses
indicated that the Islais Waterway site was more contami-
nated by a number of potentially toxic substances than the
Oakland site, while the latter site was more contaminated
than the San Pablo Bay site.
Benthic community analyses, as well as toxicity test results
(especially the mussel larvae, amphipod, and clam behavior
tests) suggested that the rank of pollution-induced degradation
was: Islais Waterway > Oakland > San Pablo
Bay. Moreover, there was concordance among the three
synoptic measurements. The authors argue that all three
types of assessment are critical for assessing pollution-
induced degradation of aquatic biological communities.
C.3 Swartz et al. (1985)
Sediment toxicity, chemical contamination and
macrobenthic community structure were examined at
seven stations along a gradient northward from Los An-
geles County Sanitation Districts' sewage outfalls on the
Palos Verdes Shelf and compared to control conditions in
Santa Monica Bay. Sediment toxicity was assessed with
laboratory toxicity tests utilizing the amphipod,
Rhepoxynius abronius.
Significant reductions in macrobenthic species richness,
density, biomass, and infaunal indices occurred at the
three stations which also showed significant toxicity in the
laboratory tests. There was a close inverse relationship
between sediment toxicity and benthic community mea-
surements. The authors concluded that sediment toxicity
tests can be useful in predicting benthic community im-
pacts, but cautioned that the amphipod test is not particu-
larly sensitive. Moreover, absence of statistically signifi-
cant toxicity in this test should not be interpreted as evi-
dence of a healthy benthic community (i.e., the test yields
many false negatives).
C.4 Long and Chapman (1985)
To assess biological community effects of sediment con-
tamination Long and Chapman advocate the use of a Sedi-
ment Quality Triad (chemical, toxicity, and benthic infaunal
data). The authors contend that too much emphasis is
placed on the determination of distribution and concentra-
tion of chemicals in the designation of problem areas or
"hotspots." They further assert that chemical data alone
provide little or no information regarding the possible bio-
logical significance of such chemical accumulations. The
objective of this publication was to determine the corre-
spondence among measures of the three components of
the Triad; data from several studies on Puget Sound,
Washington were used.
Toxicity data were derived from six different laboratory
sediment tests (amphipod lethality, oligochaete respiration,
oyster larval abnormality, fish cell effects, and polychaete
life-cycle effects). Data from these tests were combined
into a toxicity summary index. Four indices were
concluded to be effective indicators of benthic community
health. All four indices represent percent contribution of
specific taxonomic groups to the total benthic community:
contribution of echinoderms (pollution sensitive, so high
percentage represents healthy community); contribution
of arthropods (many are pollution sensitive, so higher
percentage represents healthy community); contribution
of phoxocephalid amphipods (pollution sensitive, so higher
percentage represents healthy community); and
contribution of polychaetes and molluscs (many are
relatively pollution tolerant, so high percentage can
represent impacted community).
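The four benthic indices described above are simple percent contributions of taxonomic groups to total abundance. A minimal sketch of that arithmetic follows; the function name, the group labels, and the station counts are all hypothetical and are not taken from Long and Chapman (1985). Phoxocephalid amphipods are counted separately from other arthropods here so that the arthropod index can include them without double counting.

```python
def percent_contribution(counts, groups):
    """Percent of total benthic abundance contributed by the named groups."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("station has no organisms; index is undefined")
    return 100.0 * sum(counts.get(g, 0) for g in groups) / total


# HYPOTHETICAL abundance counts for one station (disjoint categories)
station = {
    "echinoderms": 10,
    "phoxocephalid_amphipods": 5,
    "other_arthropods": 15,
    "polychaetes": 50,
    "molluscs": 15,
    "other_taxa": 5,
}

indices = {
    # High values suggest a healthy community
    "echinoderms": percent_contribution(station, ["echinoderms"]),
    "arthropods": percent_contribution(
        station, ["phoxocephalid_amphipods", "other_arthropods"]
    ),
    "phoxocephalids": percent_contribution(station, ["phoxocephalid_amphipods"]),
    # High values can suggest an impacted community
    "polychaetes_molluscs": percent_contribution(
        station, ["polychaetes", "molluscs"]
    ),
}
```

With these placeholder counts, the station would read as polychaete and mollusc dominated, which is the pattern the text associates with an impacted community.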
Using the above indicators of benthic community health,
the toxicity tests summary index was a reliable predictor
of biological community impacts. In fact, good overall
correspondence among the three components of the Triad
was observed. On a station-by-station basis, the chemical
data alone were not always reliable indicators of biological
effects.
C.5 Becker et al. (1990)
Laboratory sediment toxicity tests and benthic
macroinvertebrate assemblage surveys were conducted
at 43 stations in Commencement Bay, Washington; there
were four reference sites in Carr Inlet. The toxicity tests
included the amphipod (Rhepoxynius abronius) mortality
test, the oyster (Crassostrea gigas) larval development
test, and the Microtox™ test. A numerical classification
analysis was applied to the benthic assemblages data.
Sediment samples were also subjected to chemical anal-
yses for organic compounds and metals.
Toxicity test results and benthic assemblages alterations
were inversely related, whereas toxicity was positively cor-
related with chemical concentrations. This suggests that
most biological effects resulted from chemical toxicity.
That is, the laboratory toxicity tests were reliable qualitative
predictors of biological community responses.
To evaluate the correspondence between toxicity test re-
sults and alterations of benthic assemblages, three types
of comparisons were made. Concordance was first deter-
mined; this is a measure of agreement between results of
toxicity tests and macroinvertebrate surveys (i.e., both
show statistically significant effects or both show no statisti-
cally significant effects). Statistical significance of concor-
dance was evaluated using a binomial test and an
expected level of concordance of 0.5 (i.e., that for random
agreement). Of the 47 stations the benthic assemblages
at 19 were deemed altered. Concordance was 60% (not
significant) with the amphipod test, 81% (p<0.001) with the
oyster larval test, and 68% (p<0.01) with the Microtox™
test. Sensitivity of the toxicity tests was represented as the
percentage of stations with altered benthic assemblages
that also revealed statistically significant toxicity. Sensitivity
was 84%, 68%, and 42%, respectively, for the
Microtox™, oyster larval, and amphipod tests. Efficiency
of the toxicity tests was determined as the percentage of
stations showing statistically significant toxicity that also
had altered benthic assemblages. Efficiency was 81%, 57%, and 50% for
the oyster larval, Microtox™, and amphipod tests, respec-
tively. The authors concluded that the laboratory sediment
tests, especially the oyster larval tests, were reasonable
predictors of altered benthic assemblages.
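The three comparisons above can be sketched from a 2x2 table of stations (benthos altered or not, toxicity significant or not). The code below is an illustration under stated assumptions, not the authors' procedure: the station counts are back-calculated from the rounded percentages reported for the oyster larval test (47 stations, 19 with altered benthos), the efficiency definition is read as the share of toxic stations that also had altered benthos, and the binomial test is assumed to be one-sided against the 0.5 level of random agreement. Function names are hypothetical.

```python
from math import comb


def agreement_metrics(both, toxic_only, altered_only, neither):
    """Concordance, sensitivity, and efficiency from a 2x2 station table.

    both         -- altered benthos AND significant toxicity
    toxic_only   -- significant toxicity, benthos not altered
    altered_only -- altered benthos, no significant toxicity
    neither      -- no effect by either measure
    """
    n = both + toxic_only + altered_only + neither
    return {
        # Share of stations where the two measures agree
        "concordance": (both + neither) / n,
        # Share of altered-benthos stations that were also toxic
        "sensitivity": both / (both + altered_only),
        # Share of toxic stations that also had altered benthos
        "efficiency": both / (both + toxic_only),
    }


def binomial_upper_tail(k, n, p=0.5):
    """Exact one-sided probability of k or more agreements out of n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# ASSUMPTION: counts inferred from the rounded percentages in the text,
# not counts reported by Becker et al. (1990).
oyster = agreement_metrics(both=13, toxic_only=3, altered_only=6, neither=25)
p_oyster = binomial_upper_tail(13 + 25, 47)
```

Under these assumptions the tail probability can be compared against the significance levels reported above; a two-sided test would give larger p-values.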
C.6 Swartz et al. (1982)
The toxicity of 175 sediment samples from Commencement
Bay was measured in the laboratory Rhepoxynius
abronius survival test. The relationship between these
toxicity test results and benthic community data from these
sites was explored. Benthic community data exhibited a
negative correlation (decreased amphipod density and
species richness with higher levels of toxicity) with
laboratory sediment toxicity. The authors concluded that
the correlation between laboratory and field results indi-
cated that the sediment toxicity tests were reliable predic-
tors of biological community responses.
C.7 Schimmel et al. (1989a,b)
Studies were conducted to assess the relationship be-
tween effluent and ocean water toxicity. The estimates of
chronic toxicity were made from 1982 to 1984 at seven
locations along the Atlantic and Gulf Coasts with effluent
and ocean water samples (USEPA, 1994b).
Effluent dilutions at various locations in receiving waters
were estimated with dye studies so that effect concentra-
tions could be compared. Data presented by these inves-
tigators reveal that effluent toxicity reliably reflected receiv-
ing water toxicity (effect concentrations in effluent and
ocean water samples with equivalent dilution corre-
sponded). The results of these studies signify that the
ocean receiving waters had little effect on the
toxicity/bioavailability of chemicals in the effluent. The "missing
link" in these investigations was establishing a connection
between marine water toxicity and biological community
responses.
C.8 Frithsen et al. (1989)
Using indicator species toxicity tests and an ecosystem
survey, a four month study was conducted to evaluate the
toxicity of a sewage effluent. Effluent discharge was into
Narragansett Bay. Effluent toxicity was evaluated with the
sea urchin (Arbacia punctulata) sperm cell test. Ecological
effects of the effluent were assessed in mesocosms con-
sidered by the authors to be functional analogs of shallow,
unstratified coastal systems such as Narragansett Bay.
The sewage effluent consistently tested toxic in the sea
urchin test, with the average EC50 being 1.1% effluent.
Little information could be gleaned from the mesocosm
data due to several problems. There were unexplained
effects on phytoplankton and organic carbon loading which
led to hypoxia. Toxicity measured in the mesocosm did
not correlate with that in the sewage effluent. There was
incomplete mixing of effluent in the mesocosms. Signifi-
cant toxicity was detected in the control mesocosms. Tox-
icity in all mesocosms was highly variable and not related
to effluent toxicity. Because of the confounding factors, a
conclusion that the mesocosm data failed to confirm
laboratory effluent toxicity data would be inappropriate.
The study was inconclusive.
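To put the reported average EC50 of 1.1% effluent in perspective, an effect concentration expressed as percent effluent is often converted to toxic units, TU = 100 / EC50. That convention is not used in the summary above and is introduced here only as a worked illustration; the function name is hypothetical.

```python
def toxic_units(ec50_percent_effluent):
    """Toxic units for an effect concentration given as percent effluent."""
    if not 0 < ec50_percent_effluent <= 100:
        raise ValueError("EC50 must be a percentage in the range (0, 100]")
    return 100.0 / ec50_percent_effluent


# EC50 taken from the text above (average for the sea urchin sperm cell test)
tu = toxic_units(1.1)
```

The same number can be read as the dilution factor at which the effluent would be expected to reach its EC50, here roughly a 91-fold dilution.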
-------
Appendix D
Strengths and Limitations of Single Species Toxicity Tests
D.1 Strengths of Single Species Tests
There is no instrument that can measure or predict how
organisms will respond to a toxic chemical(s). Furthermore,
chemical analyses of effluent or ambient water samples
do not yield information on toxicological additivity,
bioavailability, synergism, or cumulative effects. Many
wastewater and ambient waters are complex, containing
constituents that interact and that differ in toxicity; there-
fore, single chemical standards are important, but of limited
value in protecting water quality.
1) Single species tests integrate additivity and cu-
mulative interactions of chemicals.
2) Single species tests provide a direct measure of
chemical bioavailability.
3) Single species tests measure responses to toxicants
for which there are no chemical-specific
water quality standards.
4) Single species tests have provided reliable esti-
mates of concentrations (for many different types
of chemicals) which cause effects in aquatic eco-
systems.
5) Because they are highly standardized with specific
quality assurance and control requirements, single
species tests provide reliable, repeatable, and
comparable results with good precision compared
to other types of chemical and biological tests.
6) Single species tests provide an early warning signal
so that actions can be taken to minimize significant
ecosystem impacts (especially with regard to the
discharge or release of toxic chemicals).
7) Single species tests can be performed relatively
rapidly and inexpensively. This allows for the ac-
cumulation of a data set which better characterizes
the wastewater or ambient water system.
D.2 Limitations of Single Species Tests
Several definite and potential limitations of single species
toxicity tests have been identified.
1) Results of a single test do not characterize the
duration or frequency of toxicity in wastewater or
ambient waters. Instream exposure reflects ambient
water/effluent characteristics over time (days,
weeks), whereas exposure in the laboratory reflects
the characteristics of ambient water or
wastewater in a grab sample or composite sample
of one day.
2) Results of a test or tests with an effluent do not
allow for assessment of cumulative effects of
toxic substances from different sources in aquatic
ecosystems.
3) The range of sensitivities (to toxic substances) of
organisms and functions in aquatic ecosystems
may not be encompassed by single species tests.
4) Effects due to bioaccumulation/bioconcentration,
delayed effects, and secondary effects are not measured.
5) Results of single species tests may underesti-
mate ecosystem community responses because
of the multiple stressors acting on natural populations
and communities. Single species tests
include a limited range of endpoints (responses)
compared to aquatic ecosystems.
6) Results of single species tests may not be pre-
dictive of trophic interactions and ecosystem
operational processes (tests do not incorporate
aquatic ecosystem complexity).
7) Physical and chemical, as well as biotic factors,
in aquatic ecosystems could modify (increase or
decrease) bioavailability or toxicity compared to
laboratory tests. The highly controlled exposure
regimes in the laboratory may not reflect the
multivariant and complex exposure conditions in
natural settings.
8) Single species tests tend to use non-indigenous
species that may not represent local biota.
9) Single species toxicity test results fail to account
for indirect effects of contaminants.
10) Single species tests tend to use genetically
homogeneous laboratory populations.
U.S. GOVERNMENT PRINTING OFFICE: 1999 - 550-101/2OOM