EPA/600/A-96/098

CHAPTER 17: WORKGROUP SUMMARY REPORT ON METHODOLOGICAL UNCERTAINTY IN CONDUCTING SEDIMENT ECOLOGICAL RISK ASSESSMENTS WITH CONTAMINATED SEDIMENTS

Keith R. Solomon, Gerald T. Ankley, Renato Baudo, G. Allen Burton, Christopher G. Ingersoll, Wilbert Lick, Samuel N. Luoma, Donald D. MacDonald, Trefor B. Reynoldson, Richard C. Swartz, and William Warren-Hicks

17.1 INTRODUCTION

In the following chapter, a range of issues related to the uncertainty associated with sediment ecological risk assessments (SERAs) is described, including an evaluation of: (1) uncertainty associated with the overall SERA framework, (2) the effects of false positive and false negative errors associated with sediment toxicity tests, (3) spatial and temporal distributions of sediment contamination, (4) sampling errors, and (5) uncertainties associated with transport and fate models. Chapter 18 describes the uncertainty associated with specific measurement endpoints commonly used in SERAs and discusses approaches for addressing these sources of uncertainty.

The goal of any uncertainty analysis is to describe and interpret knowledge limitations that may be present in the measurement endpoints used to conduct a SERA, for the purpose of incorporating estimates of uncertainty into management decisions. A number of viewpoints for defining uncertainty were discussed at the Workshop, two of which are described below. In the first viewpoint, uncertainty is considered to be composed of two components: (1) measures of bias (i.e., consistent deviation of measured values from the true value) and (2) measures of precision (i.e., measures of agreement among replicate analyses of a sample). Accuracy is the combination of bias and precision for a procedure and reflects the closeness of a measured value to the true value. Figure 17.1 presents a visual interpretation of these components of uncertainty (Jessen 1978). Note that bias and precision are independent. For example, a method could have low bias and low precision (Figure 17.1b), or high bias and low precision (Figure 17.1c). Either combination leads to a decline in the overall confidence in the measurement.

Strictly defined estimates of accuracy are limited to formal experiments such as inter-laboratory testing of a blind (but known) chemical concentration. In contrast, many sediment surveys are conducted without the benefit of knowing the "true" value (e.g., the accuracy of sediment toxicity tests with field-collected sediments). In these cases, estimates of field precision are limited and a weight-of-evidence approach is used as a surrogate for estimating bias. Strictly defined, precision is the observed variance of repeated measurements conducted under the same conditions (e.g., the variance associated with repeated ponar grabs at the same location). In practice, however, biological and chemical properties are very dynamic, making rigorous estimates of bias and precision difficult to obtain (see Section 17.2.4.3).

In the second viewpoint, uncertainty is evaluated in the context of expert judgement and opinion. While determination of the accuracy and precision of management tools provides direct information for evaluating uncertainty, many methods are not amenable to this type of assessment. For this reason, less quantitative methods are often used to evaluate uncertainty, such as expert judgement (see Chapter 18).
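One formal route for combining expert judgement with measured data is Bayesian updating, taken up again below and in Chapter 13. The following is a minimal sketch with invented numbers: the Beta prior standing in for an elicited expert opinion and the survey counts are hypothetical, not data from this chapter.

```python
# Expert elicitation suggests roughly 30% of sites in an area are toxic,
# held with modest confidence, encoded here as a Beta(3, 7) prior.
a_prior, b_prior = 3.0, 7.0

# A hypothetical survey then finds 6 toxic samples out of 10.
toxic, n = 6, 10

# Beta-binomial conjugate update: add successes and failures to the prior.
a_post = a_prior + toxic
b_post = b_prior + (n - toxic)

print(f"prior mean     = {a_prior / (a_prior + b_prior):.2f}")  # 0.30
print(f"posterior mean = {a_post / (a_post + b_post):.2f}")     # 0.45
```

The posterior blends the subjective prior with the quantitative survey result; as more samples accumulate, the data increasingly dominate the expert opinion.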
Although we may not have definitive numerical measurements of the ecological relevance of a specific measurement endpoint, a well-designed expert opinion survey can be used to generate knowledge relevant to the issue. As with numerical analysis, the larger the opinion survey, the more information is available to assess uncertainty. In practice, expert opinion may be more readily available than well-conducted numerical analyses of uncertainty and can be a useful source of information. A large statistical literature is available on methods for incorporating expert opinion into a formal analysis of uncertainty (e.g., Bayes theory; Chapter 13). Bayes theory can be used to combine both subjective and quantitative sources of information in a decision-making process.

In the following sections, we use both of the viewpoints described above to discuss the sources of uncertainty and the implications of uncertainty in SERAs. We encourage scientists and policy makers to consider uncertainty in risk-based decisions. We hope that by addressing these uncertainty issues, decision makers will have valuable information available for weighing the various options available for risk reduction.

Figure 17.1. Visual interpretation of bias and precision as components of uncertainty (Jessen 1978).

17.2 UNCERTAINTY IN THE RISK ASSESSMENT FRAMEWORK

Guidelines have been developed for ecological risk assessments to promote consistency in analysis (Chapter 1). These guidelines also allow for the establishment of quality standards and consistent terminology for assessments. Consistent use of guidelines can help inform all stakeholders as to the relative degree of confidence and scientific knowledge under which a decision was made (Russell and Gruber 1987). Several guidelines are in use with varying degrees of consistency. Many of these methods are based on similar procedures and principles; therefore, the USEPA Framework for Ecological Risk Assessment (USEPA 1992a) was used in this chapter as a guideline (Chapter 1; Figure 17.2). The ecological risk assessment framework as used in this chapter (Figure 17.2) has five major areas: (1) problem formulation, (2) exposure characterization, (3) effects characterization, (4) risk characterization, and (5) risk management. Each of these areas is discussed in more detail below.

17.2.1 Problem Formulation

Problem formulation is the planning or experimental design stage of the overall risk assessment process. In this sense it is similar to posing a question such as: "Is the mean number of species in the benthic community at the exposed area different from that at the reference area?" Uncertainty in the formal statistical sense is of lesser importance at this stage of the process; however, there is qualitative uncertainty in the appropriate choice of assessment endpoints (objectives or purposes of the risk assessment) and measurement endpoints (indicators or tools used to evaluate risk or effects). The best way to reduce these initial uncertainties in the problem formulation is to involve all interested parties through stakeholder input. This involves asking all interested parties (including the risk managers, the scientific community, and the public) to define the problem in the form of a concise narrative. Once the problem has been identified, appropriate assessment and measurement endpoints can be selected (Figure 17.2). Discussions at the Workshop focused primarily on uncertainty in relation to evaluating effects of chemical stressors.
However, non-chemical stressors may be a dominant process influencing the system (e.g., habitat disturbance), or other chemical or non-chemical stressors may interact to produce perturbations (e.g., ammonia and dissolved oxygen in the lower Fox River, Ankley et al. 1992; temperature and metals in the Clark Fork River, Kemble et al. 1994). In the case of retrospective risk assessments (in some instances termed impact assessments), identification of the stressor(s) is a potential source of uncertainty. Identification of stressors should be part of the problem formulation stage and typically consists of: (1) a survey of the natural and anthropogenic stressors which may be associated with the test area, (2) an assessment of the available data relative to quality control and quality assurance, and (3) hypothesizing potential stressors. In some cases, it may be necessary to make use of physical and chemical separation techniques (e.g., toxicity-based fractionation methods) to identify specific classes of contaminant stressors. For example, extraction of pore water from sediment may allow partial identification of potential chemical stressors through toxicity identification evaluation (TIE) methods, which use physico-chemical manipulations to affect the toxicity of specific contaminants of concern (Chapter 18).

17.2.2 Exposure Characterization

The analysis phase of the risk assessment framework consists of two activities: characterization of exposure and characterization of effects (Figure 17.2). The purpose of characterization of exposure is to predict or measure the spatial and temporal distribution of a stressor and its co-occurrence or contact with the ecological components of concern (USEPA 1992a). Uncertainties may influence the planning, execution, or interpretation stages of the exposure characterization. Primary sources of uncertainty in characterizing exposure include: (1) laboratory imprecision, (2) matrix interference errors, (3) sample location biases, (4) sample collection and handling errors, (5) phase distribution of the stressor, (6) contamination of the sample with other stressors, (7) spatial and temporal heterogeneity of the stressor, (8) references for comparison of stressor levels, (9) substrate type and interactions with the stressor, (10) life history of the organism, (11) non-equilibrium of the chemical stressor between sediment and water, (12) response model prediction error, (13) data transformation and normalization errors, (14) exposure pathway analysis errors, and (15) fate analysis errors.

17.2.3 Effects Characterization

The purpose of characterization of effects is to identify and quantify the adverse effects resulting from exposure to a stressor (USEPA 1989). The 15 areas of uncertainty listed above for exposure characterization may also influence the planning, execution, or interpretation stage of the effects characterization. Additional sources of uncertainty in effects characterization may include: (1) effects of non-contaminant stressors in toxicity tests, (2) reference comparisons to toxicity or receptor distributions, (3) laboratory-to-field extrapolations, (4) interpretation and definition of natural variability, (5) differences in receptor species sensitivity, (6) differences in physical alterations of the sediment, and (7) differences in stressor-response relationships.
17.2.4 Risk Characterization

Risk characterization may either be prospective (e.g., product hazard assessment as described in Chapters 3 and 4) or retrospective (e.g., impact hazard assessment as described in Chapters 6 and 7). Risk characterization at the organismal level has traditionally been done by comparing the concentration of the stressor(s) found in the environment to the responses reported for that stressor(s) in the laboratory, in the field, or in the literature. This risk characterization can be performed as described in the following sections.

17.2.4.1 Use of Quotients for Risk Assessment

Risk quotients are simple ratios of exposure and effects. For example:

    Risk quotient = exposure concentration / effect concentration

Traditionally, the quotient method has been used to compare the effect concentrations for the most sensitive species of concern to the average, median, mean, or highest exposure concentration. In addition, these exposure concentrations may be compared to an effect concentration derived from toxicity tests. This assessment can be made more conservative by the use of safety (application) factors, such as division of the effect level by a number such as 20 (CWQG 1987). Use of safety factors allows for unquantified uncertainty in the effect and exposure estimations or measurements. Because this uncertainty is unknown and unquantifiable, substantial errors are possible in either underestimating or overestimating the risk. In the absence of sufficient information from toxicity tests, these risk assessments may be underprotective. Conversely, where a wide range of toxicity data is available, the variation in receptor response may be well defined, and further use of safety factors may be overprotective. Use of the quotient approach is acceptable for early tiers of the risk assessment, but the approach fails to consider the range of variation which may exist in terms of exposures and susceptibility (see Chapter 5 on dredging assessments). Recently, a method was proposed for using quotients in Tier I risk assessments which incorporates uncertainty in both the numerator and denominator of the quotient equation (Parkhurst et al. 1995).

17.2.4.2 Probabilistic Risk Assessment

A second approach for evaluating risk is to express the results of a refined risk characterization analysis as a distribution of toxicity values rather than a single point estimate (see Chapters 3 and 4 on product assessment). For example, this approach has been proposed or is now being used by the Dutch government (Health Council of the Netherlands 1993); Cardwell et al. (1993); Graney et al. (1994); Solomon et al. (1996); and Klaine et al. (1996). A major advantage of the probabilistic approach is that it uses all relevant single-species toxicity data and, when combined with exposure distributions, allows for quantitative estimation of the risks to receptors. However, the approach is only valid if the endpoints used in the assessment are similar. For example, survival data would not be expected to be protective of reproductive effects. The degree of overlap of the exposure curve (drawn as a log-Pearson Type III distribution; McBean and Rovers 1992) with the effects curve can be used to estimate the probability that a certain percentage of receptors may be adversely affected on a certain percentage of occasions (Figure 17.3). A similar approach has been used in the derivation of USEPA Water Quality Criteria (Stephan et al. 1985).
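To make the overlap calculation concrete, the sketch below fits a species sensitivity distribution (SSD) to a set of invented LC50 values and estimates the probability that exposure at a site exceeds the SSD's 10th percentile (the concentration intended to protect 90% of species). For simplicity both curves are treated as log-normal; the chapter draws the exposure curve as a log-Pearson Type III distribution, so this is an approximation, and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

# Hypothetical LC50s (ug/L) for benthic taxa.
lc50s = [12.0, 25.0, 40.0, 55.0, 90.0, 150.0, 210.0, 400.0]

# Fit a log-normal SSD and take its 10th percentile (HC10).
logs = [math.log10(x) for x in lc50s]
mu = sum(logs) / len(logs)
sd = math.sqrt(sum((v - mu) ** 2 for v in logs) / (len(logs) - 1))
hc10 = 10 ** NormalDist(mu, sd).inv_cdf(0.10)

# Hypothetical exposure distribution: log-normal, median 8 ug/L.
exposure = NormalDist(math.log10(8.0), 0.5)
p_exceed = 1.0 - exposure.cdf(math.log10(hc10))

print(f"HC10 (10th percentile of SSD): {hc10:.1f} ug/L")
print(f"P(exposure > HC10):            {p_exceed:.1%}")
```

The degree of overlap reported by such a calculation is only as good as the distributional assumptions and the number of toxicity values used to fit the SSD, a point taken up below.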
With the use of overlapping distributions, there is an implicit assumption that protecting a certain percentage of species on a certain proportion of occasions will also preserve ecosystem structure and function. Although this approach to risk characterization takes into account much of the variability with regard to the range of susceptibility in receptor species, it still embodies several uncertainties and limitations. For example, the choice of protection level (e.g., 90% of species) may not be socially acceptable. Some may view 90% as being overprotective, whereas others may find this level of risk unacceptable, especially if the 10% of potentially affected species includes endangered species or other organisms of ecological, commercial, or recreational importance (see Chapter 11). Additionally, risks of persistent, bioaccumulative chemicals to species at the top of the food chain may not be sufficiently addressed by this approach (Graney et al. 1994; Section 18.5). Where there is a desire to protect more sensitive receptors, these species could be identified and appropriate mitigation measures taken. A further issue requiring consideration in probabilistic risk assessments is the number of data points required to define the distribution of receptor species for either acute or chronic effects. Additional test species and endpoints beyond those now applied for SERAs may be needed (Burton and Ingersoll 1994). In addition, there is a need for methods, such as those proposed by Parkhurst et al. (1995), for calculating the degree of risk associated with exposures to multiple chemicals.

17.2.4.3 Retrospective Risk Assessment

Risk assessments based on measurement of current conditions are considered retrospective and typically do not forecast the expected change in risk due to remedial or mitigatory options or to future changes in the ecosystem. Retrospective risk assessments rely on a number of techniques discussed in more detail in Chapter 5 (dredging assessment) and Chapters 6 and 7 (site clean-up assessment). These assessments may include measurement endpoints such as sediment toxicity tests, assessments of structural or functional changes in the benthic communities, cellular and molecular effects in the receptor species, or the presence of tissue residues of contaminants of concern (Chapter 18). The use of multiple lines of evidence (weight-of-evidence) is particularly important in retrospective risk assessment (USEPA 1992b) and may also be useful in prospective analysis. For example, the conclusion that an effect on benthic community structure is the result of exposure to a chemical stressor is made more certain if the concentration of the stressor in the area is high enough to have caused the observed effect and also produces overt toxicity in laboratory toxicity tests.

Figure 17.3 Graphical representation of the use of probabilistic risk assessment with sediments. Cumulative frequency distributions of concentrations of stressors in sediments are compared with distributions of sensitive benthic organisms. Arrows show probabilities of not exceeding 10th percentile sensitivity concentrations for acute and chronic endpoints at three sites (adapted from Solomon et al. 1996).

17.2.5 Risk Management

The outcome of all risk management actions should be either the acceptance or the reduction of the risk. Risk reduction involves many potential actions which range from the technical through the socio-economic to the political.
In undertaking risk management, it is necessary to:

• Decide which risks must be managed and in what priority. This requires that some method for measuring and comparing risks be available (e.g., Chapter 11 on ecological relevance).
• Maximize the reduction of risk for the available resources. This implies that a system must be in place for assessing the degree of risk reduction and for measuring its cost.

17.2.5.1 Uncertainty in Prioritizing the Risks

In general, the first step in ranking risks for management involves evaluation of the harmful effects of the action associated with the production or release of the stressor. In the case of human health, this response may be expressed as a numerical risk. Even though the risk assessment process may have limitations, estimates of relative risk may be comparable if similar processes are used to derive the risks. An additional difficulty is presented by unquantifiable risks. This applies particularly to environmental risks which may have measurement endpoints of an aesthetic nature, such as reduced days of recreational fishing or a degraded view. Endpoints of this type cannot be quantified in the same terms as, for example, fish mortality.

Harwell et al. (1992) proposed a method for evaluating and prioritizing risk to human health and the environment. The system is based on recognition of the issues raised earlier in this document, including:

• Acknowledging that ecosystems are diverse;
• Knowing that ecosystems respond to stress differently and that this response is governed by the type of ecosystem and the type of stressor;
• Recognizing that a wide range of temporal, organizational, and spatial scales are involved;
• Knowing that the measurement endpoints are relevant to the selected assessment endpoints;
• Knowing the normal baseline behavior of the ecosystem;
• Having good techniques for extrapolating from laboratory and field measurement endpoints to the selected assessment endpoints; and
• Considering uncertainty in all of these issues.

The risks to be prioritized are then separated into a series of components which are ranked as follows:

• The potential magnitude of the risk. Magnitude is ranked on a 5-level ordinal scale: Low < Medium < High < Very High < Extremely High.
• The geographic extent of the risk. Extent is ranked on a 3-level ordinal scale: Local < Regional < Biosphere.
• The recovery time. Recovery time is ranked on a 3-level ordinal scale: Short (years) < Medium (decades) < Long (centuries).

These scores can then be combined and used for ranking purposes. However, as these ranks are based on expert assessment, they are subject to uncertainty and bias. As suggested above for problem formulation, uncertainty in qualitative assessments may be reduced by involving expert opinion polls and the stakeholders in the process.

17.2.5.2 Uncertainty in Assessing Risk Reduction Strategies

Many options for risk reduction may be available to the risk manager; however, there are generally two types of tools: technological and regulatory. Technological tools for risk mitigation include a wide range of procedures, many of which are specific to the situation. In the case of sediments which are contaminated by effluent discharges, further treatment of the effluent before release is commonly applied in industrial settings.
In the case of in situ contamination of sediments, many cleanup and disposal options are available (Francinques et al. 1985; IJC 1988) once sediment has been identified as containing chemicals at concentrations posing a problem. The sediment can be removed, stabilized, capped, or treated in situ, or "no action" may be taken (Lynam 1987; Grigalunas and Opaluch 1989). The remediation procedure or combination of procedures chosen is specific to the study area and depends on ecological, chemical, physical, engineering, economic, human health, and political considerations. Furthermore, source control and continued monitoring must be included with any remediation effort to avoid creation of new problem areas.

The regulatory tools which may be used for risk mitigation, in increasing order of effectiveness, are as follows:

• Provide better information and communication to prevent misuse of stressors that may contaminate sediments.
• Control discharges and releases of stressors to levels which are judged to present a tolerable risk to the benthic community.
• Restrict the use and application of the stressor.
• Impose a total manufacturing ban on the stressor.

Uncertainties affect the selection of the technological or regulatory tools that should be used to mitigate risks. Uncertainty of knowledge (i.e., which technological options are available) is best addressed through expert opinion surveys and stakeholder consultations. Uncertainties in the degree of risk reduction are best assessed by reiterating the risk assessment procedure for all the appropriate exposure reduction strategies and then ranking these in terms of both the reductions in risk and the uncertainty in achieving these reductions. This matrix will allow informed choices to be made and the "trading off" of costs against risk reductions and the uncertainties of achieving these reductions.

17.2.5.3 Uncertainty in Assessing Societal Values

If ecosystems are viewed as providing services to society, these services can be assessed to have an economic value. All components of the ecosystem can be assigned an economic value; however, this view has been criticized, particularly in the assigning of value to concepts such as species richness and diversity. In assigning economic value to ecological services, the implication is that these services, like physical capital (equipment and technology) or human capital (knowledge and skills), are interchangeable and can be traded in the same way as these commodities, for example, writing off the loss of a species for an increase in corn production. In addition, assignment of economic value is often restricted to only a few components of the system at risk and may ignore the temporal and spatial interconnectedness of organisms, populations, and ecosystems. Economic uncertainties which should be considered are as follows (Harwell et al. 1992):

• Sustainability: Irreversible resource damage will undermine the sustainability of ecosystems (and, by extension, human society). Thus, irreversible damage to an ecosystem should not be economically discounted over a period of years (as in the amortization of equipment and capital resources), as this devalues the importance of long-term environmental problems. Discounting may lead regulatory agencies or politicians to relegate a problem to a lower level of importance because the effect will only be felt at some time in the future (e.g., global warming).
• Willingness to pay: This assumes that market prices can be used to assess the tastes and preferences of society. The problem with this approach is that individuals and society may enjoy the services provided by ecosystems (e.g., clean air, water, weather control, food chain maintenance, genetic diversity) without understanding them or even having knowledge of their existence. Thus, their willingness to pay for these services, or the values they assign, may be incorrect and inappropriate relative to the real ecological value.

• Multipliers: Economic analysis of benefits always includes multipliers (e.g., developing a subdivision results in jobs for construction workers and a demand for building materials). Multipliers should also be used on the risk side of the risk:benefit equation. For example, the loss of a benthic community may result in losses to fisheries, transportation, fishing equipment manufacturers, the accommodation industry, and marina operations.

17.3 SPECIAL ISSUES OF UNCERTAINTY IN SEDIMENT ECOLOGICAL RISK ASSESSMENT

17.3.1 Decision Making With Sediment Toxicity Measurement Endpoints: Exploring the Effects of False Positives and False Negatives

Sediment risk assessments can be used in a variety of applications, including the assessment of relative risk between an impacted site and a reference site or of the reduction in risk associated with a remediation action. A variety of chemical and biological measurement endpoints can be used in the assessment, including laboratory sediment toxicity tests and sediment quality guidelines (see Sections 18.2, 18.3, and 18.7). For example, sediment toxicity tests are now used to evaluate the relative difference in organism survival or growth between sediments from reference areas and dredged material (Chapter 5 on dredging assessment; USEPA-USCOE 1991, 1994). Test endpoints such as mortality or growth of organisms exposed in the laboratory to field-collected sediments are assumed to reflect the response of organisms in the field exposed to dredged material.

A key issue in the use of sediment endpoints within a regulatory or programmatic environment is the level of confidence in the results of this assessment. Using the above example, in dredging there is uncertainty in determining: (1) the probability of stating that the reference area and the dredged material are different with respect to toxicity when in fact they are the same (false positive), and (2) the probability of stating that the reference area and the dredged material are the same when in fact they are different (false negative). The power of the test is the probability of stating that the reference area and dredged material are different when they truly are different. In most regulatory applications we are interested only in a single-directional test: whether the toxicity of the dredged material is greater than the reference area toxicity (i.e., we ignore any information showing that the reference area toxicity is greater than that of the dredged material). This type of statistical approach to environmental decision-making achieves the goal of environmental protection. However, some problems do arise. For example, if enough samples are taken, the reference and dredged areas can always be shown to be statistically separable. The degree of separation in toxicity can be very small, but statistically evident. We could frame an alternative null hypothesis to detect a difference of an ecologically significant magnitude.
While this has scientific appeal, the degree of difference representing an ecologically significant result could be long debated.

In classical statistical terms, the chance of a false positive decision is termed a Type I error (α), the chance of a false negative decision is termed a Type II error (β), and power is 1 − β. Investigations typically focus on controlling Type I errors. For example, risk managers are often willing to accept a 5% chance that a Type I error occurs. However, Type II errors are often ignored, or no definitive Type II decision criteria are established. In classical hypothesis testing, balancing the potential occurrence of false positive and false negative results is a function of the number of samples collected and the variance of the sample mean. Type I and Type II errors are mathematically linked for a fixed sample size and variance, so establishing α in the decision criteria determines β (see Steel and Torrie 1980 for a discussion of this topic).

The interrelationship of Type I and Type II errors requires consideration of the relative importance of the risks associated with making false positive and false negative decisions. Risk assessment usually focuses on reducing the risk of false positive results associated with statistical analysis (by establishing a low α level). However, from an environmental protection perspective, emphasis should be placed on reducing the risk of making false negative decisions (i.e., falsely concluding that an area is not contaminated when it actually is). For example, an environmentally conservative approach would emphasize identifying small differences between the reference and test areas. Therefore, it would be desirable to have a high chance of classifying a site as contaminated when it is contaminated (high power), and a small chance of falsely concluding that the reference and test sites are the same when they are different (small β). In this conservative approach, one would rather err by judging a clean site as contaminated than misclassify a contaminated site as clean. In contrast, an alternative approach would be to classify the test site as different from the reference site only when the data provide a large degree of confidence in the decision. In that case, one would want to reduce the error of stating that the reference site and test site are different when they are not. This is accomplished by establishing a small α, with a higher probability of false negative results.

17.3.1.1 Case Study: Inter-laboratory Variability

The chance of false positive and false negative results is a function of the number of samples and the variance of the test endpoint. As an example, we will examine variability in sediment toxicity tests. While many sources of error are associated with these tests (see Sections 18.2 and 18.3), the following example focuses only on uncertainty associated with inter-laboratory variability. Inter-laboratory variance (i.e., round-robin or ring testing) has been extensively studied in whole-effluent toxicity testing (Warren-Hicks and Parkhurst 1992; Parkhurst et al. 1992), and we draw on this earlier work in this analysis. Data for the analysis are obtained from a round-robin study of whole-sediment toxicity tests (USEPA 1994a; ASTM 1995a-e; Burton et al. 1996). A key issue in the use of any method is the number of tests required to meet specified decision criteria. In this example, Burton et al. (1996) reported mean survival of Chironomus tentans in 10-day whole-sediment toxicity tests.
Data were generated from eight laboratories, each of which tested split samples of field-collected sediment using the toxicity test method described in USEPA (1994a). For one of the sediments evaluated in the study, the mean survival among the eight laboratories was 76% with a standard deviation of 27% (resulting in a coefficient of variation of 37%, which is considered acceptable inter-laboratory precision; Burton et al. 1996). The data consisted of survival measurements reported by each of eight laboratories. From this information, the number of laboratories needed to meet specified decision criteria can be estimated based on pre-specified probabilities of either false positive or false negative results. For any one laboratory, the reported survival response was the mean of eight replicate tests, each replicate test consisting of 10 organisms. If the replicate data were available, we could have calculated the number of replicates required by each of the eight participating laboratories for pre-specified decision criteria (the data were not available at the time of this analysis). For discussion purposes only, we use the laboratory mean data and present an analysis of inter-laboratory variability. The method for estimating the number of intra-laboratory replicates is consistent with the following discussion.

Suppose that an investigator is faced with determining the sediment toxicity of a potentially impacted site. Also, assume that 90% survival has been established as the acceptable control response. Given that the investigator has no prior knowledge of the site toxicity, assume that the sediment is about as toxic as that in the Burton et al. (1996) study referenced above. Alternatively, the investigator could conduct a pilot study of the site instead of making this assumption. Given these data, the investigator wishes to determine the number of laboratories necessary to meet specified decision criteria based on the chance of false positive and false negative results. [Note: A somewhat related concept is the minimum detectable difference (MDD; USEPA-USCOE 1991, 1994). The MDD is generally used to establish the minimum difference detectable between a control response and a treatment response for a single toxicity test, given a fixed sample size and variance. This concept may be adaptable to our example by evaluating the MDD between a reference site and a dredge site for a fixed number of laboratories with known inter-laboratory variance.]

The following equation provides a means of determining the desired number of laboratories while balancing the chance of false negative and false positive test results:

    N = [(Z(1−α) + Z(1−β)) σ / (Cs − μ1)]²

where:
    N = the number of laboratories required to meet specified levels of Type I and Type II errors,
    Z(1−β) = the critical value of 1 − β for the normal distribution (e.g., 1.64 for a 1-sided test with a β = 0.05 error probability),
    Z(1−α) = the critical value of 1 − α for the normal distribution (e.g., 1.64 for a 1-sided test with an α = 0.05 error probability),
    σ = the inter-laboratory standard deviation of percent survival,
    Cs = the specified standard (i.e., 90% survival), and
    μ1 = the average percent survival across laboratories.

Figure 17.4 presents a plot of the power of a 1-sided test of the null hypothesis H0: Cs = μ1 against the alternative hypothesis H1: Cs > μ1. [Note: because we are concerned only with whether survival at the site is less than the standard, a 1-sided test of the hypothesis is appropriate.] Notice that, for a fixed number of laboratories, fixing either the power of the test (1 − β) or the Type I error rate (α) establishes the other.
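A minimal sketch of this calculation follows. It assumes normal-theory critical values, and the inter-laboratory standard deviation is an input supplied by the investigator; the required N is very sensitive to that value, and with the 27% standard deviation quoted above the formula yields somewhat larger counts than the worked values reported in the next paragraph, which imply a slightly smaller standard deviation in the original analysis.

```python
import math
from statistics import NormalDist

def labs_required(alpha, beta, sd, standard, mean_survival):
    """N = [(Z(1-alpha) + Z(1-beta)) * sd / (Cs - mu1)]^2, rounded up."""
    z_a = NormalDist().inv_cdf(1.0 - alpha)  # 1-sided critical value
    z_b = NormalDist().inv_cdf(1.0 - beta)
    return math.ceil(((z_a + z_b) * sd / (standard - mean_survival)) ** 2)

# Cs = 90% survival standard, mu1 = 76% mean inter-laboratory survival.
for a in (0.05, 0.10):
    n = labs_required(alpha=a, beta=a, sd=27.0,
                      standard=90.0, mean_survival=76.0)
    print(f"alpha = beta = {a:.2f}: {n} laboratories")
```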
For example, with false positive and false negative error probabilities of 5%, 33 laboratories are required for testing. With false positive and false negative error rates of 10%, 20 laboratories are required.

17.3.1.2 Summary

The above example demonstrates a method for estimating the number of laboratories required for given Type I and Type II error rates. The choice of how much error is acceptable is up to the investigator. The investigator should carefully consider the relative merits and interpretations of Type I and Type II errors when evaluating the results from any sediment measurements used to establish the possibility of contamination.

Figure 17.4. Power of the 1-sided test as a function of the number of laboratories.

17.3.2 Uncertainty in Estimating Spatial and Temporal Distributions of Contaminants in Sediment

Sediments may be highly variable on both a spatial and a temporal basis. Therefore, replicate samples need to be collected at each site to determine variance in sediment characteristics. Sediment should be collected with as little disruption as possible; however, subsampling, compositing, or homogenization of sediment samples may be required for some experimental designs (e.g., USEPA 1994a,b; ASTM 1995a-e; Environment Canada 1996a,b). Sampling locations might be distributed along a known pollution gradient or in relation to the boundary of a disposal area, or locations may be identified as contaminated in a reconnaissance survey. These comparisons can be made in both space and time. In pre-dredging studies, a sampling design can be developed to assess the contamination of samples representative of the project area to be dredged (Chapter 5 on dredging assessment). Such a design should include subsampling of cores taken to the project depth.

When dealing with a given sampling area (e.g., a river, lake, or estuary), the appropriate sampling design is important since the goal might be to describe existing conditions for the entire area by collecting a discrete number of samples. The choice of the sampling scheme also matters for the intended data manipulation. Fewer samples are needed if the objective of the study is simply to describe the average conditions over the entire area. On the other hand, if the sampling is to be used to draw a map of distribution (i.e., to highlight point sources, trends of distribution, or the location and area of contamination), the choice of the sampling net (regular, random, or fixed grid) may dictate which type of mapping system has to be used (Baudo 1990). In addition, if temporal variability is expected, sampling should be repeated as many times as possible to reduce this source of uncertainty. Sampling frequency is particularly critical since the timing of the successive samples determines the time scale on which change can be evaluated.

17.3.2.1 Estimation and Measurement of Magnitude of Uncertainty

The overall uncertainty of sampling depends on several factors, including sample: (1) type, (2) volume, (3) equipment, (4) handling, (5) number, and (6) replicates.

Sample type: Sediment is a complex mixture of solid, aqueous, and gaseous phases, in addition to biotic compartments. Hence, study objectives must clearly indicate which type of medium is to be sampled. For example, different methods may be needed to sub-sample pore water in sediment or to sample benthic organisms in sediment.
On the other hand, the objective of the study may be to sample the whole "active" layer (e.g., to calculate a diversity index) or to sample the vertical micro-structure of some sediment characteristic (e.g., the vertical profiles of redox potential or oxygen). Once the type of sample required has been identified, the choice of the sampling gear must be made accordingly. It is often difficult to determine the appropriate depth of sediment to sample (e.g., "How deep must a core be? How deep is the bioturbation? Where is the boundary between the oxic and the anoxic layers?"). A common mistake is to sample to the maximum depth in the sediment, which may provide an unrealistic estimate of exposure (i.e., sampling below the biologically active zone). Finally, the performance of the selected sampling gear is not constant; it depends on the kind of substrate and the operating conditions, including the skill of the operators. In most cases, when sampling is done without actually seeing what is being collected, the uncertainty can be assessed only after the sample is recovered (e.g., via visual observation of core length, texture, or color) or processed in the laboratory. As a consequence, the degree of uncertainty differs depending on whether sampling is done blind or is done visually by checking the performance of the sampling gear (Tables 17.1 and 17.2).

Sample volume: Some variables may exhibit a pseudo-continuous distribution in space (e.g., grain size, where particles are sorted according to the hydraulics of the system), whereas other variables often have a pronounced patchiness (e.g., benthic invertebrate distributions). Hence, the sample volume should account for the known or estimated local micro-spatial heterogeneity (both horizontal and vertical) at the scale of the sampling tool. Larger samplers provide an "averaged" sample, with an increased uncertainty in measuring smaller-scale heterogeneity. If variables with substantially different distributions have to be measured, repeated sampling with different samplers should be considered. Sub-sampling of sediments is often required; therefore, samples should be thoroughly homogenized before splits are made (USEPA 1994a; ASTM 1995a). The amount of sediment required for analyses can range from a few micrograms (e.g., CHN analysis) to several kilograms (e.g., radioactive isotopes, laboratory toxicity tests). Hence, the minimum amount of sample needed to perform all analyses must be estimated before choosing the sampling equipment and may limit the number of measurements performed on a sample. For example, if a coring tube is used to sample sediment, the diameter of the tube and the thickness of the sections need to provide enough material for all of the planned analyses. The same sample is typically used for more than one analysis. Hence, a compromise would be to use the original sections for measurements requiring small aliquots and to use pooled sections for measurements requiring larger aliquots. It should be noted that this procedure potentially limits subsequent statistical analysis of the data (e.g., correlations, principal component analysis, cluster analysis), which can be done only with paired data. A special case where the sample volume is particularly critical is evaluating the pore-water composition of sediment. In this case, the water content of the sediment must be estimated in advance to be sure enough water can be extracted from each sample.
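As a simple illustration of this planning arithmetic, the sketch below estimates how much wet sediment must be collected to yield a target pore-water volume. The water content and recovery fraction are hypothetical and method dependent.

```python
def wet_sediment_needed(porewater_ml, water_fraction, recovery):
    """Wet sediment mass (g) needed to yield a target pore-water volume
    (mL), assuming a pore-water density of ~1 g/mL.

    water_fraction: water content as a fraction of wet sediment mass.
    recovery: fraction of pore water actually recoverable by the chosen
              extraction method (e.g., centrifugation or squeezing).
    """
    return porewater_ml / (water_fraction * recovery)

# Hypothetical planning numbers: 25 mL of pore water needed for analyses,
# sediment of 40% water content, 50% extraction recovery.
print(f"{wet_sediment_needed(25.0, 0.40, 0.50):.0f} g wet sediment")  # 125 g
```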
In summary, to increase confidence and lower uncertainty, the sample volume must be large enough to provide a representative sample for each measurement endpoint of interest and enough material to perform all analyses plus any further measurements that may be needed (e.g., toxicity identification evaluation; Chapter 16). Often these two requirements are difficult to satisfy because of lack of knowledge and the need to minimize the sampling effort. However, it should be kept in mind that later collection of additional material will result in a different sample.

Sampling equipment (dredges, grabs, corers): All types of sampling equipment vary in performance, and each device has specific advantages and disadvantages (e.g., size, weight, triggering mechanism; ASTM 1995b). Few comparisons of sampling efficiency between types of equipment have been reported (Baudo 1990); however, different types of dredges, grabs, and corers provide unique types of samples. Thus, the choice of equipment for both whole-sediment and pore-water sampling depends on the study objectives, the measurement endpoint of interest, the characteristics of the study area, the sediment type and compactness, and the presence of interfering flora or fauna (e.g., roots, shells) (Baudo 1990; Mudroch and MacKnight 1991; Adams 1991; ASTM 1995b). Moreover, allowance should be made for the different performance of the same sampler depending on the environment in which it is used (e.g., soft bottom or sand). The equipment can alter or contaminate the sample (e.g., metal or plastic in contact with the sample; cleaning of the sampler), or the equipment may produce artifacts due to structural limitations (e.g., washing out of finer material, gas or temperature changes). Mudroch and MacKnight (1991) and ASTM (1995b) provide additional details regarding sampling equipment. In addition to dredges, grabs, and corers, a number of non-conventional sampling tools have been applied for specific purposes, including "peepers" (dialysis chambers for collection of pore water), sedimentation traps, and artificial substrates (for benthic colonization studies). Although these and other non-conventional tools may provide useful information, lack of standardization may lead to high uncertainty in their use. To summarize, any sampling equipment may introduce marked uncertainty since it is largely unknown whether a representative sample has been collected.

Sample handling: Techniques of sample conservation and manipulation should be carefully examined, and specific equipment used, to prevent not only contamination of the samples but also possible alterations (Mudroch and Bourbonniere 1991; ASTM 1995b). Alteration of the sediment usually remains undetected unless specific study designs are used (e.g., repeated measures at different times or after each sampling step). Hence, identification of new handling artifacts may make the results of previous studies questionable. In order to minimize uncertainty, a consensus method should be established and followed; however, periodic revisions may be required.

Number of samples: Switzer (1979) pointed out a common statistical problem: the required number of sampling points can be determined statistically only after the data have been gathered, or if estimates of crucial sediment characteristics are obtained in preliminary studies. A number of approaches can be used to estimate the minimum number of samples needed to estimate an average value for the measurement of interest (Baudo 1990).
Information needed to determine the number of samples includes the heterogeneity in physical and chemical data (Kratochvil and Taylor 1981; Sokal and Rohlf 1981; Hakanson and Jansson 1983). The required number of samples also depends on the distribution of the data (e.g., normal, Poisson, or negative binomial distributions). A detailed description of the statistical properties of these distributions can be found elsewhere (Bliss and Fisher 1953; Sokal and Rohlf 1981).

For the definitive sampling of an area, additional considerations, including directionality and point sources, may dictate the sampling plan. The sampling strategies in these cases can be classified into three main types (Hakanson and Jansson 1983; Baudo 1990): (1) a deterministic system, with a sampling design based on previous information and varying density; (2) a stochastic system, in which the sampling stations are randomly selected; and (3) a regular grid system, with the sampling stations randomly or deterministically selected. Advantages and disadvantages of each method are discussed in Baudo (1990).

Repetitive sampling: More often than not, only one sample per station is collected. Repetitive sampling allows for an estimate of the local spatial or temporal heterogeneity (USEPA 1994a,b; ASTM 1995a-e). An accurate estimate of the sampling variability (within the station and among stations) is needed to avoid false positive or false negative decisions, and it assumes even greater importance when assessing spatial or temporal variability. The obvious disadvantages of repeated sampling are the increased cost and time, although the added cost of collecting two or more samples from the same station can be quite low, especially if a multi-sampler can be used. It will, however, be much more expensive to perform all of the analyses on each sample. Alternatively, a representative measure could be made on all sample pairs to evaluate local heterogeneity (e.g., CHN analysis, which may be related to distributions of metals and organic contaminants). Extrapolation to biological variables may be more difficult. Samples for biological measures are often pooled to provide an average of the local variability. For example, if the purpose of the study is to conduct a reconnaissance survey to identify contaminated areas for further investigation, the experimental design might include collection of just one composited sample from each area to allow a larger area to be sampled. The lack of replication at an area usually precludes statistical comparisons (e.g., ANOVA), but these surveys can be used to identify contaminated areas for further study, or the data can be evaluated using statistical regressions (ASTM 1995a; USEPA 1994a,b). In other instances, the purpose of the study might be to conduct a quantitative sediment survey to determine statistically significant differences between control (or reference) sediments and test sediments from one or more areas. The number of replicates per site should be based on the need for sensitivity or power. In a quantitative survey, field replicates (separate samples from different grabs collected at the same area) would need to be taken at each site. Separate subsamples from the field replicates might be used to determine within-sample variability or for comparisons of test procedures (e.g., comparative sensitivity among test organisms), but these subsamples cannot be considered true field replicates for statistical comparisons among areas.
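Returning to the question of how many samples are needed, one common approach, sketched below under a normal approximation, sizes the survey so that the estimated mean falls within a stated margin of the true value. The pilot-study standard deviation and the margin are hypothetical inputs, and patchy (e.g., negative binomial) variables will require more samples than this suggests.

```python
import math
from statistics import NormalDist

def min_samples(sd, margin, confidence=0.95):
    """Minimum n so the sample mean lies within +/- margin of the true
    mean at the given confidence, via n = (Z * sd / margin)^2.

    sd: standard deviation among samples, e.g., from a pilot study.
    margin: acceptable error in the estimated mean (same units as sd).
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil((z * sd / margin) ** 2)

# Hypothetical pilot data: total PCB sd of 40 ug/kg among grabs; we want
# the site mean within +/- 20 ug/kg with 95% confidence.
print(min_samples(sd=40.0, margin=20.0))  # 16 samples
```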
17.3.2.2 Accounting for and Reducing Uncertainty

MacKnight (1991) identified the following factors as the most important in identifying sampling options: (1) purpose of sampling, (2) study objectives, (3) historical data and other available information, (4) bottom dynamics at the sampling area, (5) size of the sampling area, and (6) available funds vs. the estimated (real) cost of the project. Factors 1 and 2 are obviously critical and must be agreed upon in advance between the managers and scientists involved in the project. This could easily be the most important source of uncertainty, since "there is no one formula for design of a sediment sampling pattern which would be applicable to all sediment sampling programs" (MacKnight 1991). Since inadequate strategies and unclear goals of sediment sampling are among the most important sources of uncertainty, the early involvement of managers in the sampling plan should be sought. In addition, to reduce sampling bias, the proposed project should be peer reviewed by scientists with expertise in each of the fields of study covered by the project. This peer review should evaluate the adequacy of the sample media, sample volume, sampling equipment, sample handling, number of samples, and repetitions.

Information related to Factors 3 (historical data) and 4 (dynamics of bottom sediment) listed above is often not available to assist in the selection of sampling areas (number and location) with the required degree of confidence. This information is needed to assure an unbiased assessment of Factor 5 (size of the sampling area) and Factor 6 (costs). In any case, the choice of sampling plan should support a statistical evaluation of the sampling variability (both spatial and temporal). Hence, a pilot study is usually needed to establish local spatial and temporal heterogeneity before the definitive sampling is performed. Whenever feasible, in situ tools should be used in the pilot study to estimate distributions of relevant physical, chemical, and biological measurements (e.g., sediment compactness, echo-sounding, pH, oxygen, redox, biotic communities). The extrapolation to field conditions from data gathered on samples transferred to the laboratory is always subject to uncertainty. For this reason, there is a need to develop in situ techniques for measuring chemistry, conducting toxicity tests, and assessing the benthic community. Alternatively, uncertainty associated with sampling could be substantially lowered by visually checking the sampling sites (e.g., by using divers, submersible cameras, or manned vehicles; Tables 17.1 and 17.2).

17.3.2.3 Interpretation of Uncertainty In Relation to Decision Making

The number of samples determines the cost of the study (assuming all samples are used for analysis). Hence, there is a need to limit the number as much as possible while retaining confidence that the final results will be both sound and defensible. Too few samples will result in large variability and may create the need for additional sampling, whereas too many samples will result in a waste of resources. On the other hand, the uncertainty associated with the measurement endpoints outlined in Chapter 18 should be weighed relative to the cost-benefit analysis for the potential remedial options. An overestimate of the actual contamination, in terms of either concentration or distribution in the study area, may lead to an inflated cost for remediation. An underestimate could result in a wrong choice (no remedial action) or in limited intervention.
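One way to weigh these asymmetric errors, not prescribed by this chapter but consistent with its reasoning, is a simple expected-cost comparison; the probabilities and costs below are entirely hypothetical.

```python
def expected_cost(p_contaminated, cost_remediate, cost_miss):
    """Expected cost of two decisions given the probability that the site
    is truly contaminated (taken from the risk characterization).

    cost_remediate: cost of remediating, incurred whether or not the
                    site is truly contaminated (covers false positives).
    cost_miss: ecological/economic cost of leaving a truly contaminated
               site untreated (the false negative outcome).
    """
    act = cost_remediate
    no_action = p_contaminated * cost_miss
    return act, no_action

# Hypothetical numbers: 30% chance the site is contaminated, $2M to
# remediate, $10M in damages if contamination is real and untreated.
act, wait = expected_cost(0.30, 2e6, 10e6)
print(f"remediate: ${act/1e6:.1f}M expected; no action: ${wait/1e6:.1f}M expected")
```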
17.3.2.4 Summary

Uncertainty in sampling is typically unaccounted for in most risk assessments. This uncertainty can result either from poor knowledge of the real performance of the samplers or from an inadequately designed sampling plan. In addition, both systematic and random errors can occur and usually remain unknown or undetected. Furthermore, the uncertainty associated with sampling is much higher when the actual sampling is done under "blind" conditions (e.g., sampling from a boat; Table 17.1) than when the operators can see the sampling medium (shallow areas sampled by hand or with visual aids; Table 17.2). The uncertainty in describing temporal variability is usually greater still, since it compounds the uncertainty in spatial heterogeneity with the uncertainty in repeated sampling. Tables 17.1 and 17.2 provide an indication of overall uncertainty in sampling and assume that standardized procedures are followed, that the reliability of the selected sampling method is considered, and that allowances are made for systematic or random errors. Sampling uncertainty has components which can be reduced by: (1) planning the sampling according to the existing information, (2) using appropriate collection and handling methods, (3) conducting pilot studies, and (4) measuring the different sources of variability in the definitive study.

Table 17.1 Degree of uncertainty in "blind" sampling

                          Knowledge   Systematic errors   Random errors
  Sample type                 L               L                 L
  Sample volume               L               L                 M
  Sample equipment            L               H                 H
    (non-conventional)        H               H                 H
  Sample handling             L               L                 M
  Sample number               H               H                 H
  Sample repetitions          H               L                 H

  L = low, M = medium, H = high

Table 17.2 Degree of uncertainty in visual sampling

                          Knowledge   Systematic errors   Random errors
  Sample type                 L               L                 L
  Sample volume               L               L                 L
  Sample equipment            L               L                 L
    (non-conventional)        H               H                 H
  Sample handling             L               L                 M
  Sample number               H               L                 H
  Sample repetitions          H               L                 L

  L = low, M = medium, H = high

17.3.3 Error and Uncertainty in Models Applied in Sediment Ecological Risk Assessments

Exposure models are used to predict concentrations of contaminants in the physical environment (sediments, interstitial water, and overlying water) and concentrations in organisms as a function of space and time (Chapter 15). Two types of system-level models are typically applied to evaluate contaminated sediments: (1) contaminant transport and fate models and (2) food chain models. Sections 8.5 and 8.6 discuss uncertainty associated with specific models used to evaluate the bioavailability of contaminants associated with sediments. System-level models should be able to predict contaminant concentrations as affected by natural or anthropogenic events. The goal of modeling should be to develop a predictive model (i.e., a model based on parameters which can be measured accurately in the laboratory or by means of simple field tests). Computations should then be based on these parameters with, ideally, no calibration or fine-tuning. In this way, one can have confidence in the predictions and can also use the model to evaluate different environmental conditions and different systems, again with little or no additional calibration.

17.3.3.1 Model Calibration

When a model is calibrated, the calibration is only valid for the data used in the calibration. To illustrate this point, consider predicting contaminant concentrations in a lake over 25 years. During this time, a few large storms can occur, and these storms may be responsible for most of the sediment and contaminant transport.
If the model is calibrated to data taken in an "average" year (i.e., when large storms did not occur), then the parameters and results will be incorrect because they do not include extreme events. If, on the other hand, the model is calibrated to data taken in a very stormy period, the parameters and extrapolated results will also be incorrect. A predictive model needs to be based on the concept that the future is statistically the same as the past. What is known is only that events of a certain magnitude have a certain probability of occurring. Predictive models should then be able to predict the most probable result and the probabilities of the results of different sequences of events.

As an example of model variability, consider the predictions of PCB half-life in Lake Ontario made by three independent groups (Limnotech, Manhattan College, and the University of Toronto). All of these models are based on the concept of a well-mixed sediment layer (Table 17.3; Ziegler and Connolly 1995). The effect of this layer on sediment fluxes depends on the thickness of the layer (typically a poorly defined characteristic). Because of this, estimates of half-life differ by almost one order of magnitude, from 3 years to 25 years. Therefore... [INFORMATION PENDING FROM LICK]

Table 17.3 Results of three different predictions of the half-life of PCBs in Lake Ontario

  Investigators            Assumed thickness of layer   PCB half-life
  Limnotech                         15 cm                   25 yrs
  Manhattan College                  8 cm                   15 yrs
  University of Toronto            0.5 cm                    3 yrs

17.3.3.2 Errors and Uncertainties in Contaminant Transport and Fate Process Models

In order to quantitatively understand and predict the environmental effects of contaminated sediments, especially as influenced by large episodic natural events or by remedial actions, a knowledge of the transport and fate of the sediments and their associated contaminants is necessary (Chapter 15). Some of the more significant processes that need to be understood and quantified include: (1) resuspension, erosion, and sediment bed dynamics; (2) sorption of contaminants to particles and colloids; (3) flocculation, settling speeds, and deposition rates of particles and flocs; (4) hydrodynamics, including currents and wave action; (5) air-water exchange of contaminants; (6) biochemical reactions and degradation of contaminants; and (7) the inputs of contaminants from the surrounding land, the atmosphere, and point discharges. The processes most relevant to estimates of contaminant fluxes at the sediment-water interface are processes 1 to 4 above. These sediment-water exchange processes are important because they control phase distributions, bioavailability, and contaminant concentrations in the sediments and overlying water (Chapter 15). The flux of contaminants to surface waters from the surrounding land resulting from non-point discharges is also not well quantified. Point discharges are generally better known and controlled.

17.3.3.3 Relevance to Decision Making

In some form or another, models are always used in organizing and interpreting data and, therefore, in decision making. These models may be simple conceptual models, or they may be complex models involving many physical, chemical, and biological processes described by large numbers of differential equations. The solutions to these models may be simple estimates or large arrays of numbers. Models should help in making decisions (e.g., selection of remedial options and understanding the effects of these remedial actions).
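The sensitivity shown in Table 17.3 can be illustrated with a toy one-box model in which contaminant is lost from a well-mixed surface layer of thickness h at an effective loss velocity v, so the half-life scales with h. The loss velocity below is assumed, and this sketch does not reproduce the three models, which differ in more than the assumed thickness.

```python
import math

def half_life_years(layer_cm, loss_velocity_cm_per_yr):
    """Half-life in a well-mixed surface sediment layer, one-box model:
    dC/dt = -(v/h) * C  =>  t1/2 = h * ln(2) / v.
    """
    return layer_cm * math.log(2) / loss_velocity_cm_per_yr

# Assumed effective loss velocity of 0.4 cm/yr; thicknesses from Table 17.3.
for h in (15.0, 8.0, 0.5):
    print(f"h = {h:4.1f} cm -> t1/2 = {half_life_years(h, 0.4):5.1f} yr")
```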
17.4 CONCLUSIONS AND RECOMMENDATIONS

• Guidelines have been developed for conducting ecological risk assessments to promote consistency in the design, analysis, and interpretation of data. These guidelines also allow for the establishment of quality standards and consistent terminology for assessments. Consistency in the use of guidelines can help inform all stakeholders as to the relative degree of confidence and scientific knowledge under which a decision was made. Additionally, the best way to reduce initial uncertainties in the risk assessment is to involve all interested parties through stakeholder input and expert opinion surveys. This involves asking all interested parties (including the risk managers, the scientific community, and the public) to define the problem in the form of a concise narrative. Once the problem has been identified, appropriate assessment and measurement endpoints can be selected and applied.

• Sources of uncertainty unique to characterizing exposure or effects in SERAs include: (1) sample location, collection, and handling errors; (2) spatial and temporal heterogeneity of the stressor; (3) references for comparison of stressor levels; (4) substrate type and its interactions with the stressor; (5) non-equilibrium of the chemical stressor between sediment and water; (6) effects of non-contaminant stressors in toxicity tests; and (7) laboratory-to-field extrapolations.

• Risk characterization requires consideration of the relative importance of the errors associated with making both false positive and false negative determinations. Risk assessment usually focuses on reducing the risk of false positive results associated with statistical analysis (by establishing a low α level). However, from an environmental protection perspective, emphasis should also be placed on reducing the risk of making false negative decisions (i.e., falsely concluding that an area is not contaminated when it actually is); the sketch following this list illustrates the trade-off.

• Sediment sampling has uncertainty components which can be reduced by planning the sampling program according to the existing information, using appropriate collection and handling methods, conducting pilot studies, and measuring the different sources of variability in the definitive study. Additionally, the uncertainty associated with sediment sampling can be substantially lowered by visually inspecting the sampling sites.

• Exposure models are used to predict concentrations of contaminants in the physical environment and concentrations in organisms as a function of space and time. These models should be able to predict contaminant concentrations as affected by natural or anthropogenic events. The goal of modeling should be to develop a predictive model (i.e., a model based on parameters which can be measured accurately in the laboratory or by means of simple field tests). Computations should then be based on these parameters with, ideally, no calibration or fine-tuning. In this way, one can have confidence in the predictions and can also use the model to evaluate different environmental conditions and different systems, again with little or no additional calibration.

• The outcome of all risk management actions should be either the acceptance or the reduction of the risk. Risk reduction involves many potential actions, ranging from the technical through the socio-economic to the political. In undertaking risk management, it is necessary to decide which risks must be managed and in what priority.

• Strategic actions for addressing uncertainty in SERAs are listed in Section 18.8.
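The false positive/false negative trade-off noted above can be made concrete with a small, hypothetical power calculation (a normal approximation with invented inputs, offered as a sketch rather than a prescribed procedure). With the false positive rate fixed at α = 0.05, the false negative rate β for detecting a true difference of one standard deviation between an exposed and a reference area depends strongly on replication:

```python
from statistics import NormalDist

def false_negative_rate(effect_sd, n, alpha=0.05):
    """Approximate false negative rate (beta) for a one-sided,
    two-sample comparison using a normal approximation.
    effect_sd: true difference in units of the common SD (assumed).
    n: replicate samples per area (exposed and reference)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha)        # critical value set by alpha
    shift = effect_sd * (n / 2.0) ** 0.5   # standardized true effect
    return 1.0 - z.cdf(shift - z_crit)     # P(missing a real effect)

for n in (3, 5, 10, 20):
    print(f"n = {n:2d} per area -> "
          f"false negative rate ~ {false_negative_rate(1.0, n):.2f}")
```

The point of the sketch is only that fixing α says nothing by itself about β: with the few replicates typical of sediment surveys, a truly contaminated area can have a better-than-even chance of being passed as clean.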
17.5 REFERENCES

Adams, D.D. 1991. Sampling sediment pore water. In A. Mudroch and S.C. MacKnight, eds., Handbook of Techniques for Aquatic Sediments Sampling. CRC Press, Ann Arbor, MI, pp. 171-202.

American Society for Testing and Materials. 1995a. Standard test methods for measuring the toxicity of sediment-associated contaminants with freshwater invertebrates. E1706-95b. In Annual Book of ASTM Standards, Vol. 11.05. Philadelphia, PA, pp. 1204-1285.

American Society for Testing and Materials. 1995b. Standard guide for collection, storage, characterization, and manipulation of sediments for toxicological testing. E1391-94. In Annual Book of ASTM Standards, Vol. 11.05. Philadelphia, PA, pp. 835-855.

American Society for Testing and Materials. 1995c. Standard guide for conducting 10-day static sediment toxicity tests with marine and estuarine amphipods. E1367-92. In Annual Book of ASTM Standards, Vol. 11.05. Philadelphia, PA, pp. 767-792.

American Society for Testing and Materials. 1995d. Standard guide for designing biological tests with sediments. E1525-94a. In Annual Book of ASTM Standards, Vol. 11.05. Philadelphia, PA, pp. 972-989.

American Society for Testing and Materials. 1995e. Standard guide for determination of bioaccumulation of sediment-associated contaminants by benthic invertebrates. E1688-95. In Annual Book of ASTM Standards, Vol. 11.05. Philadelphia, PA, pp. 1140-1189.

Ankley, G.T., K. Lodge, D.J. Call, M.D. Balcer, L.T. Brooke, P.M. Cook, R.G. Kreis, A.R. Carlson, R.D. Johnson, G.J. Niemi, R.A. Hoke, C.W. West, J.P. Giesy, P.D. Jones, and Z.C. Fuying. 1992. Integrated assessment of contaminated sediments in the lower Fox River and Green Bay, Wisconsin. Ecotoxicol. Environ. Safety 23:46-63.

Baudo, R. 1990. Sediment sampling, mapping, and data analysis. In R. Baudo, J.P. Giesy, and H. Muntau, eds., Sediments: Chemistry and Toxicity of In-Place Pollutants. Lewis Publishers, Chelsea, MI, pp. 15-60.

Bliss, C.I. and R.A. Fisher. 1953. Fitting the negative binomial distribution to biological data. Biometrics 9:176-200.

Burton, G.A., Jr. and C.G. Ingersoll. 1994. Evaluating the toxicity of sediments. The ARCS assessment guidance document. EPA/905-B94/002, Chicago, IL.

Burton, G.A., T.J. Norberg-King, C.G. Ingersoll, G.T. Ankley, P.V. Winger, J. Kubitz, J.M. Lazorchak, M.E. Smith, I.E. Greer, F.J. Dwyer, D.J. Call, K.E. Day, P. Kennedy, and M. Stinson. 1996. Interlaboratory study of precision: Hyalella azteca and Chironomus tentans freshwater sediment toxicity assays. Environ. Toxicol. Chem.: In press.

Canadian Water Quality Guidelines. 1987.
Task Force on Water Quality Guidelines of the Canadian Council of Resource and Environment Ministers, Ottawa, ON.

Cardwell, R.D., B.R. Parkhurst, W. Warren-Hicks, and J.S. Volosin. 1993. Aquatic ecological risk. Water Environ. Technol. 5:47-51.

Environment Canada. 1996a. Biological test method: Test for growth and survival in sediment using the freshwater amphipod Hyalella azteca. Environment Canada, Ottawa, ON. Technical report number pending: In press.

Environment Canada. 1996b. Biological test method: Test for growth and survival in sediment using larvae of freshwater midges (Chironomus tentans or Chironomus riparius). Environment Canada, Ottawa, ON. Technical report number pending: In press.

Francinques, N.R., Jr., M.R. Palermo, C.R. Lee, and R.K. Peddicord. 1985. Management strategy for disposal of dredged material: Contaminant testing and controls. Miscellaneous Paper D-85-1, U.S. Army Engineer Waterways Experiment Station, Vicksburg, MS.

Graney, R.L., A. Maciorowski, K.R. Solomon, H. Nelson, D. Laskowski, and J.L. Baker. 1994. Report of the aquatic risk assessment and mitigation dialogue group. SETAC Foundation for Education, Pensacola, FL.

Grigalunas, T.A. and J.J. Opaluch. 1989. Economic considerations of managing contaminated marine sediments. In Contaminated Marine Sediments - Assessment and Remediation. National Research Council, National Academy Press, Washington, DC, pp. 291-310.

Hakanson, L. and M. Jansson. 1983. Principles of Lake Sedimentology. Springer-Verlag, Berlin, 316 p.

Harwell, M.A., W. Cooper, and R. Flaak. 1992. Prioritizing ecological and human welfare risks from environmental stresses. Environmental Management 16:451-464.

Health Council of the Netherlands. 1993. Ecotoxicological risk assessment and policy-making in the Netherlands - dealing with uncertainties. Network 6(3)/7(1):8-11.

International Joint Commission. 1988. Options for the remediation of contaminated sediments in the Great Lakes: Report of the Sediment Subcommittee and its Remedial Options Work Group to the Great Lakes Water Quality Board. International Joint Commission, Windsor, ON.

Jessen, R.J. 1978. Statistical Survey Techniques. Wiley, New York, NY.

Kemble, N.E., W.G. Brumbaugh, E.L. Brunson, F.J. Dwyer, C.G. Ingersoll, D.P. Monda, and D.F. Woodward. 1994. Toxicity of metal-contaminated sediments from the upper Clark Fork River, MT to aquatic invertebrates in laboratory exposures. Environ. Toxicol. Chem. 13:1985-1997.

Klaine, S.J., G.P. Cobb, R.L. Dickerson, K.R. Dixon, R.J. Kendall, E.E. Smith, and K.R. Solomon. 1996. An ecological risk assessment for the use of the biocide, dibromonitrilopropionamide (DBNPA), in industrial cooling systems. Environ. Toxicol. Chem. 15:21-30.

Kratochvil, B. and J.K. Taylor. 1981. Sampling for chemical analysis. Anal. Chem. 53:924A-938A.

Lyman, W.J., A.E. Glazer, J.H. Ong, and S.F. Coons. 1987. An overview of sediment quality in the United States. EPA-905/9-88-002, Washington, DC.

MacKnight, S.D. 1991. Selection of bottom sediment sampling stations. In A. Mudroch and S.C. MacKnight, eds., Handbook of Techniques for Aquatic Sediments Sampling. CRC Press, Boca Raton, FL, pp. 17-28.

McBean, E.A. and F.A. Rovers. 1992. Estimation of the probability of exceedance of a contaminant concentration. Ground Water Monitoring Review 12:115-119.

Mudroch, A. and S.D. MacKnight. 1991. Bottom sediment sampling. In A. Mudroch and S.C. MacKnight, eds., Handbook of Techniques for Aquatic Sediments Sampling. CRC Press, Boca Raton, FL, pp. 29-95.
Mudroch, A. and R.A. Bourbonniere. 1991. Sediment sample handling and processing. In A. Mudroch and S.C. MacKnight, eds., Handbook of Techniques for Aquatic Sediments Sampling. CRC Press, Boca Raton, FL, pp. 131-169.

Parkhurst, B.R., W. Warren-Hicks, and L.E. Noel. 1992. Performance characteristics of effluent toxicity tests: Summarization and evaluation of data. Environ. Toxicol. Chem. 11:771-791.

Parkhurst, B.R., W. Warren-Hicks, T. Etchison, J.B. Butcher, R.D. Cardwell, and J. Voloson. 1995. Methodology for aquatic ecological risk assessment. Report RP91-AER-1, prepared for the Water Environment Research Foundation, Alexandria, VA.

Russell, M. and M. Gruber. 1987. Risk assessment in environmental policy-making. Science 236:286-290.

Sokal, R.R. and F.J. Rohlf. 1981. Biometry. Freeman and Co., New York, NY, 859 p.

Solomon, K.R., D.B. Baker, P. Richards, K.R. Dixon, S.J. Klaine, T.W. La Point, R.J. Kendall, J.M. Giddings, J.P. Giesy, L.W. Hall, Jr., C.P. Weisskopf, and M. Williams. 1996. Ecological risk assessment of atrazine in North American surface waters. Environ. Toxicol. Chem. 15:31-76.

Steel, R.G. and J.H. Torrie. 1980. Principles and Procedures of Statistics. McGraw-Hill, New York, NY.

Stephan, C.E., D.I. Mount, D.J. Hansen, J.H. Gentile, G.A. Chapman, and W.A. Brungs. 1985. Guidelines for deriving numerical national water quality criteria for the protection of aquatic organisms and their uses. PB85-227049, National Technical Information Service, Springfield, VA.

Switzer, P. 1975. Statistical considerations in network design. Water Resour. Res. 15:1512-1516.

U.S. Environmental Protection Agency. 1989. Assessing human health risks from chemically contaminated fish and shellfish: A guidance manual. EPA-503/8-89-002, Washington, DC.

U.S. Environmental Protection Agency. 1992a. Framework for ecological risk assessment. EPA/630/R-92/001, Washington, DC.

U.S. Environmental Protection Agency. 1992b. Sediment classification methods compendium. Sediment Oversight Technical Committee. EPA-813-R-92-006, Washington, DC.

U.S. Environmental Protection Agency. 1994a. Methods for measuring the toxicity and bioaccumulation of sediment-associated contaminants with freshwater invertebrates. EPA/600/R-94/024, Duluth, MN.

U.S. Environmental Protection Agency. 1994b. Methods for measuring the toxicity of sediment-associated contaminants with estuarine and marine amphipods. EPA/600/R-94/025, Duluth, MN.

U.S. Environmental Protection Agency and U.S. Army Corps of Engineers. 1991. Evaluation of dredged material proposed for ocean disposal. EPA-503/8-91/001, Washington, DC.

U.S. Environmental Protection Agency and U.S. Army Corps of Engineers. 1994. Evaluation of dredged material proposed for discharge in inland and near coastal waters (draft). EPA-823-B-94-002, Washington, DC.

Warren-Hicks, W. and B.R. Parkhurst. 1992. Performance characteristics of effluent toxicity tests: Variability and its implications for regulatory policy. Environ. Toxicol. Chem. 11:793-804.

Ziegler and Connolly. 1995. (Citation pending from Lick)