EPA/600/R-12/603
April 2013
Web-based Interspecies Correlation Estimation
(Web-ICE) for Acute Toxicity: User Manual
Version 3.2
http://www.epa.gov/ceampubl/fchain/webice/
Sandy Raimondo, Crystal R. Jackson, and Mace G. Barron
U.S. Environmental Protection Agency
Office of Research and Development
National Health and Environmental Effects Research Laboratory
Gulf Ecology Division
Gulf Breeze, FL 32561
-------
Reference Web-ICE as:
Raimondo, S., C.R. Jackson, and M.G. Barron. 2013. Web-based Interspecies
Correlation Estimation (Web-ICE) for Acute Toxicity: User Manual. Version 3.2,
EPA/600/R-12/603, U.S. Environmental Protection Agency, Office of Research and
Development, Gulf Ecology Division. Gulf Breeze, FL.
Disclaimer:
The information in this document has been reviewed in accordance with U.S.
Environmental Protection Agency policy and approved for publication. Approval does
not signify that the content reflects the views of the Agency, nor does mention of trade
names or products constitute endorsement or recommendation for use.
Note about Version Control:
Web-ICE updates may vary among versions by website structure and function, models
resulting from updated databases, or both. A change in the first integer in the website
version signifies a major architectural change of the site. A change in the second integer
represents an update to an existing module. A change in the third integer represents a
change to models in one or more modules. The user manual will represent all versions
represented by the first two integers.
-------
Contents
Introduction
Model Development
I. Toxicity Databases
II. Model Development
III. Model Validation
Using the Web-ICE Program
I. Working with Web-ICE Aquatic or Web-ICE Wildlife Modules
Selecting Model Taxa
Estimating Toxicity
II. The Species Sensitivity Distribution (SSD) Module
III. The Endangered Species Module
Producing an Endangered Species Toxicity Report
IV. Accessing Model Data & Chemical Information
Mode of Action (MOA)-specific models
Guidance for Model Selection and Use
I. Statistical Definitions
II. Selecting a Model with Low Uncertainty
Rules of Thumb
Surrogate Species Selection: An Example
III. Evaluating Model Predictions
IV. Selecting Predicted Toxicity Values for SSDs
V. Applying Web-ICE in Ecological Risk Assessment (ERA)
Acknowledgements
References
Appendix 1. Number of Models by Version
-------
Introduction
Information on the acute toxicity to multiple species is needed for the assessment
of the risks to, and the protection of, individuals, populations, and ecological
communities. However, toxicity data are limited for the majority of species, while
standard test species are generally data rich. To address data gaps in species
sensitivity, the Interspecies Correlation Estimations (ICE) application was developed by
the U.S. Environmental Protection Agency (US EPA) and collaborators to extrapolate
acute toxicity to taxa with little or no acute toxicity data for a chemical of interest,
including threatened and endangered species (Asfaw et al. 2003). Web-based
Interspecies Correlation Estimation (Web-ICE) provides interspecies extrapolation
models for acute toxicity in a user-friendly internet platform.
ICE models estimate the acute toxicity (LC50/LD50) of a chemical to a species,
genus, or family with no test data (the predicted taxon) from the known toxicity of the
chemical to a species with test data (the surrogate species). ICE models are least
squares regressions of the relationship between surrogate and predicted taxon based on
a database of acute toxicity values: median effect or lethal water concentrations for
aquatic species (EC/LC50; µg/L) and median lethal oral doses for wildlife species
(LD50; mg/kg bodyweight). Experimental or estimated (e.g., QSAR) acute toxicity for a
surrogate species may be used to estimate toxicity when there is an existing ICE model
between the surrogate and taxa of interest (e.g., species-species; species-genus;
species-family).
In addition to direct toxicity estimation, Web-ICE develops Species Sensitivity
Distributions (SSDs) from multiple surrogate and predicted species. SSDs are
cumulative distribution functions of toxicity values for multiple species and are used to
estimate a hazard level [hazardous concentration (HC) or hazardous dose (HD)] that is
protective of most test species (e.g., 95%) by estimating the concentration or dose at a
corresponding percentile (e.g., 5th) of the distribution (de Zwart 2002). SSDs generated
in Web-ICE are log-logistic cumulative distribution functions of toxicity developed from
simultaneously estimated toxicity values to all predicted species available using up to 25
surrogates. ICE-generated SSD hazard levels have been shown to be within an order of
magnitude of measured HC5s (Dyer et al. 2006, Dyer et al. 2008) and HD5s (Awkerman
et al. 2008, 2009) and provide additional information for ecological risk assessment.
The Web-ICE Endangered Species module simultaneously estimates toxicity to
taxa representing threatened or endangered species using up to 25 surrogates. This
module batch processes toxicity values for endangered species from all species, genus,
and family level models available for the selected endangered species or taxa and
entered surrogates. The list of threatened and endangered species was obtained from
the US Fish and Wildlife Service Threatened and Endangered Species module of
Environmental Conservation Online System (http://ecos.fws.gov/tess_public; Accessed
August 2007), which was linked to Web-ICE species, genus, and family model
databases for aquatic organisms (not currently available for algae) and wildlife. Users
may predict to all available endangered species within a broad taxonomic group (e.g.,
Fishes) or a particular species (e.g., Atlantic Salmon, Salmo salar).
-------
This manual provides step-by-step instructions for using Web-ICE, as well as
information on the databases, model development, model validation, and user guidance
on model selection and interpretation. User guidelines outlined in the Guidance for
Model Selection and Use section of this manual are recommended to ensure high
confidence and low uncertainty in model predictions used in risk assessment.
Model Development
I. Toxicity Databases
Separate acute toxicity databases are maintained for aquatic animals
(vertebrates and invertebrates), aquatic plants (algae), and wildlife (birds and
mammals). Open-ended toxicity values (e.g., > 100 mg/kg or < 100 mg/kg) and duplicate
records among multiple sources are not included in any of the databases. Attributes for
and the number of models developed from each database are listed for the different
versions in Appendix 1.
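As an illustration of this screening step, the following Python sketch drops open-ended values and duplicate records before any modeling. The record layout, the function name screen_records, and the rule that a duplicate is a repeated species, chemical, and value combination are assumptions made for this example; the manual does not specify how duplicates are identified.

```python
from typing import NamedTuple


class ToxRecord(NamedTuple):
    species: str
    chemical: str
    value: str   # reported value as text, e.g. "12.5" or ">100"
    source: str


def screen_records(records):
    """Return records that are neither open-ended nor duplicates.

    Assumption: a duplicate is the same species, chemical, and numeric
    value reported by more than one source.
    """
    seen = set()
    kept = []
    for rec in records:
        raw = rec.value.strip()
        if raw.startswith((">", "<")):
            continue  # open-ended value, excluded from the database
        key = (rec.species, rec.chemical, float(raw))
        if key in seen:
            continue  # already kept from another source
        seen.add(key)
        kept.append(rec)
    return kept


# Hypothetical records, for illustration only
records = [
    ToxRecord("Rainbow trout", "chemical A", "12.5", "source 1"),
    ToxRecord("Rainbow trout", "chemical A", "12.5", "source 2"),
    ToxRecord("Rainbow trout", "chemical B", ">100", "source 1"),
    ToxRecord("Bluegill", "chemical A", "30", "source 1"),
]
screened = screen_records(records)
```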
The aquatic animal database is composed of 48- or 96-hr EC/LC50 values based
on death or immobility. This database is described in detail in the Aquatic Database
Documentation found on the Download Model Data page of Web-ICE, which covers the
data sources, normalization (US EPA 1986), and quality and standardization criteria for
data used in the models. Data used in model development adhered to standard acute
toxicity test condition requirements of the American Society for Testing and Materials
(ASTM 2007, and earlier editions) and the US EPA Office of Prevention, Pesticides, and
Toxic Substances (e.g., US EPA 1996a).
The algal toxicity database is described in Appendix B of the Aquatic Database
Documentation. Algae data are 72- or 96-hr EC50 values. Validity of each record was evaluated
based on adherence to data quality criteria found within standard methods guidelines
(ASTM 2011, OECD 1996, US EPA 1996b). Models derived from this database predict
toxicity to a species or genus from a surrogate species or genus. Family level models
were not developed for algae because few families had two or more
species, which is a requirement for development of higher taxa models.
The wildlife database includes 96-hr LD50 values for terrestrial birds and
mammals collected from the open literature (Hudson et al. 1984; Shafer and Bowles
1985, 2004; Shafer et al. 1983; Smith 1987) and from datasets compiled by
governmental agencies of the United States (US EPA) and Canada (Environment
Canada; Baril et al. 1994; Mineau et al. 2001). Data were standardized by using only
data for adult animals and chemicals of technical grade or formulations with > 90%
active ingredient. Models derived from this database predict toxicity to a species or
family from a surrogate species. Genus level models were not developed for wildlife
because few genera had two or more species, which is a requirement
for development of higher taxa models.
-------
II. Model Development
Models are only developed for species within the same database (i.e. there are
no fish to algae models or algae to bird models, etc.). Where more than one toxicity
value is available for a species and chemical, the geometric mean of the values is used
in model development. In cases where the range of minimum and maximum values for a
chemical and species is greater than 10-fold, all data records for that chemical are
removed for that species due to its high variability. Models are least squares regressions
such that:
log10(predicted toxicity) = a + b * log10(surrogate toxicity)
where a and b are the intercept and slope of the line, respectively. Within a database, all
species are paired with each other by common chemical. Three or more common
chemicals per pair are required to develop a model. Genus and family-level models are
similarly developed by pairing each surrogate species or genus (algae only) with each
genus or family by common chemical. A genus or family requires unique toxicity values
for two or more species within the taxon. In cases where a surrogate species is
compared to its own genus or family, toxicity values of the surrogate are excluded from
the values used to represent the higher taxonomic level. Only models with a significant
relationship (p-value < 0.05) are included in Web-ICE. More details of model
development and validation are found in Raimondo et al. (2007, 2010).
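The data preparation and regression steps described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to build Web-ICE: the function names and the input layout are hypothetical, the example values are invented, and the p-value < 0.05 screen is omitted because it requires a t distribution that is not in the standard library.

```python
import math
import statistics
from collections import defaultdict


def species_chemical_means(records, max_fold=10.0):
    """records: iterable of (species, chemical, toxicity value).

    Returns {species: {chemical: geometric mean}}, dropping any
    species-chemical pair whose max/min range exceeds max_fold.
    """
    grouped = defaultdict(list)
    for species, chemical, value in records:
        grouped[(species, chemical)].append(value)
    means = defaultdict(dict)
    for (species, chemical), values in grouped.items():
        if max(values) / min(values) > max_fold:
            continue  # too variable: remove all records for this pair
        means[species][chemical] = statistics.geometric_mean(values)
    return means


def fit_ice_model(surrogate, predicted, min_chemicals=3):
    """Least squares fit of log10(predicted) on log10(surrogate).

    surrogate, predicted: {chemical: toxicity}. Returns
    (intercept, slope, n), or None if too few common chemicals.
    """
    common = sorted(set(surrogate) & set(predicted))
    n = len(common)
    if n < min_chemicals:
        return None
    x = [math.log10(surrogate[c]) for c in common]
    y = [math.log10(predicted[c]) for c in common]
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        return None  # all surrogate values identical, slope undefined
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope, n


# Invented example: species B is twice as tolerant as species A
example = [
    ("A", "c1", 1.0), ("A", "c2", 10.0), ("A", "c3", 100.0),
    ("B", "c1", 2.0), ("B", "c2", 20.0), ("B", "c3", 200.0),
    ("B", "c4", 1.0), ("B", "c4", 50.0),  # 50-fold range, dropped
]
means = species_chemical_means(example)
model = fit_ice_model(means["A"], means["B"])
```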
III. Model Validation
The uncertainty of each model is assessed using leave-one-out cross-validation
(Insightful 2001). In this method, each pair of acute toxicity values for surrogate and
predicted taxa is systematically removed from the original model. The remaining data
are used to rebuild the model and to estimate the removed predicted taxon toxicity
value from the respective surrogate toxicity value. This method is only used
for models with sample size > 4. To maintain uniformity among the large number of
models contained within Web-ICE, the "N-fold" difference of each estimated and actual
value is used to determine the accuracy of the estimated toxicity value. For aquatic
species, inter-laboratory variation of acute toxicity test data for a given species and
chemical can be as great as a 5-fold difference (Fairbrother 2008). For wildlife species,
the average variability of toxicity measurements for a specific chemical and species is
between 4.0 and 6.4-fold (Raimondo et al. 2007). Thus, a 5-fold difference is considered
a good fit of predicted ICE values.
The cross-validation success rate for each model is the proportion of removed
data points that are predicted within 5-fold of the actual value. If the removal of an xy
data pair results in a model that is not significant at the p < 0.05 level, the replicate is not
included in calculating the cross-validation success rate. This is typically only the case
for models with low degrees of freedom (<8) and a p-value between 0.01 and 0.05 in the
original model. There is a strong relationship between taxonomic distance and cross-
validation success rate, with uncertainty increasing with larger taxonomic distance
(Raimondo et al. 2007, 2010). For aquatic fish and invertebrates, models predict within
5-fold and 10-fold of the actual value with 91 and 96% certainty for surrogate and
predicted taxa within the same family, and with 86 and 96% certainty within the same order
(Raimondo et al. 2010). In wildlife species, models predict within 5-fold and 10-fold of
the actual value with 90 and 97% certainty for surrogate and predicted taxa within the
same order (Raimondo et al. 2007). Uncertainty analysis of algal ICE models is ongoing.
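The leave-one-out procedure and the N-fold success criterion can be illustrated with a short Python sketch. It is a simplified reading of the method described above: the function names and example data are hypothetical, and the step that discards replicates whose refitted model is not significant at p < 0.05 is left out.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit; returns (intercept, slope)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    return y_mean - slope * x_mean, slope


def fold_difference(estimated, actual):
    """N-fold difference between two positive toxicity values."""
    return max(estimated / actual, actual / estimated)


def cross_validation_success(surrogate, predicted, max_fold=5.0):
    """Fraction of left-out points estimated within max_fold of actual."""
    x = [math.log10(v) for v in surrogate]
    y = [math.log10(v) for v in predicted]
    hits = 0
    for i in range(len(x)):
        # Rebuild the model without data pair i
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        estimate = 10 ** (intercept + slope * x[i])
        if fold_difference(estimate, predicted[i]) <= max_fold:
            hits += 1
    return hits / len(x)


# Invented data: a perfect relationship, then one with a 100-fold outlier
surrogate = [1.0, 10.0, 100.0, 1000.0, 10000.0]
perfect = cross_validation_success(surrogate, surrogate)
with_outlier = cross_validation_success(
    surrogate, [1.0, 10.0, 10000.0, 1000.0, 10000.0]
)
```

Using the larger of the two ratios keeps the fold difference symmetric, so over- and underestimates are judged by the same threshold.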
Using the Web-ICE Program
The Web-ICE Modules contain models that predict single acute toxicity values to
aquatic vertebrate and invertebrate species, genera, and families; aquatic algae species
and genera; and wildlife (terrestrial birds and mammals) species and families. A Species
Sensitivity Distribution Module uses data for either terrestrial wildlife species or aquatic
species. The aquatic module can combine toxicity values for vertebrates, invertebrates
and algae. These modules batch process species-level toxicity from all entered
surrogates to develop a cumulative probability distribution of toxicity data and generate
a prescribed hazard level. The Endangered Species Modules predict multiple toxicity
values to represent listed species using all available species, genus, or family level
models for the entered surrogates. Modules are accessible either from the home page
or from the blue navigation bar along the left side of the page (Figure 1).
Figure 1. Home page of Web-ICE program.
-------
I. Working with Web-ICE Aquatic or Wildlife Modules
Selecting Model Taxa
1. From either the home page or the blue navigation bar, click the link for the
module with which you will be working (Aquatic species, genus, or family; Algae
species or genus; Wildlife species or family).
2. You will then be directed to a Taxa Selection Page (Figure 2) which will allow you
to select your surrogate and predicted taxa for the model you want to use.
3. You may search for your surrogate and predicted taxa by either common name or
scientific name by selecting the appropriate option in the Sort by: drop down
menu. The default is set to common name (NOTE: Algae modules contain
scientific names only).
4. From the drop down menus, select the surrogate species and predicted taxon. It
does not matter which you select first; however, the second choice is limited to
the models available for the taxon chosen first.
5. To change any of your selections, press Reset and start again.
6. Click Continue to be directed to the calculator page for toxicity estimation.
If there is not a model for your predicted species of interest, you will need to use
a genus or family-level model to predict toxicity. The available models may be
determined by browsing through the genus and family level modules, or by searching
through the spreadsheets of model information available through the Download Model
Data option on the blue navigation bar. The downloadable Microsoft Excel®
spreadsheets provided for each Web-ICE module may be sorted by surrogate species
or predicted taxa to identify available models.
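For users who prefer to search the downloaded model information programmatically, a lookup of this kind can be sketched in Python after saving a spreadsheet as CSV. The column names (surrogate, predicted_taxon, level) and the rows below are hypothetical placeholders; the actual spreadsheet headings should be checked before adapting this.

```python
import csv
import io

# Hypothetical CSV export; real column names and rows may differ.
MODEL_CSV = """surrogate,predicted_taxon,level
Rainbow trout,Brown trout,species
Fathead minnow,Shortnose sturgeon,species
Rainbow trout,Example family,family
"""


def available_models(csv_text, surrogate=None, predicted=None):
    """Return model rows matching the given surrogate and/or predicted taxon."""
    rows = csv.DictReader(io.StringIO(csv_text))
    matches = [
        row for row in rows
        if (surrogate is None or row["surrogate"] == surrogate)
        and (predicted is None or row["predicted_taxon"] == predicted)
    ]
    # Sort so that models for the same surrogate are listed together
    return sorted(matches, key=lambda r: (r["surrogate"], r["predicted_taxon"]))


trout_models = available_models(MODEL_CSV, surrogate="Rainbow trout")
```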
-------
Figure 2. Taxa selection page for aquatic species.
Estimating Toxicity
The surrogate and predicted taxa selected from the previous page are listed at
the top of a calculator page (Figure 3). This page is divided into four parts: input (Figure
3A), calculated results (Figure 3B), model statistics (Figure 3C), and model graphic
(Figure 3D; not available for Algae in version 3.2.1). Please refer to the Statistical
Definitions section of this manual for more information on model statistics. The graph
shows the data (e.g., log10(LC50) values) used to develop the model, the regression line
(straight inner line), and 95% confidence intervals (curved outer lines). The surrogate
and predicted taxa are labeled on the X and Y axes, respectively.
1. Enter the surrogate toxicity value in the box located under Surrogate Acute
Toxicity.
2. Select your desired confidence interval (90, 95, or 99%) from the drop down
menu located under Select Confidence Interval (Default is 95%).
3. Press Calculate
4. The calculated values will appear in the three boxes labeled Predicted Acute
Toxicity, Lower Limit, and Upper Limit.
5. Log-transformed values of the surrogate and predicted toxicity values appear in
parentheses in their respective boxes.
6. If the entered surrogate toxicity value is outside the range of values used to
develop the model, a pop-up with the warning "This value is outside the x-axis
range for this model. Continue?" will appear. The user may select "OK" to
proceed to calculate the toxicity value or select "Cancel" to enter another value.
7. To select a different model, select the link to the desired module in the blue
navigation bar.
[Figure 3 screenshot: Calculator page for Aquatic Species, with panels A (input),
B (calculated results), C (model statistics), and D (model graphic). Values shown
in the example:]
Surrogate Species: Rainbow trout (Oncorhynchus mykiss)
Predicted Species: Brown trout (Salmo trutta)
Surrogate Acute Toxicity (log value): 150 ug/L (2.17)
Predicted Acute Toxicity (log value): 142.71 ug/L (2.15)
Lower Limit: 104.10 ug/L
Upper Limit: 195.65 ug/L
Model information
Intercept: 0.042271
Slope: 0.970642
Degrees of Freedom (N-2): 17
R2: 0.964248
p-value: 0.000000
Average value of surrogate (log value): 119.80 (2.07)
Minimum value of surrogate (log value): 0.163864 (-0.785515)
Maximum value of surrogate (log value): 17808.08 (4.25)
Mean Square Error (MSE): 0.079728
Sum of Squares (Sxx): 38.80
Cross-validation Success (%): 94.73
Taxonomic Distance: 2
Figure 3. Calculator Page.
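As a check on the calculator output, the point estimate can be reproduced by hand from the regression equation given under Model Development. The Python sketch below uses the intercept, slope, and surrogate range as read from the Figure 3 example (rainbow trout to brown trout); the function names are illustrative and are not part of Web-ICE.

```python
import math


def predict_toxicity(surrogate_value, intercept, slope):
    """Back-transformed ICE estimate for one surrogate toxicity value."""
    return 10 ** (intercept + slope * math.log10(surrogate_value))


def outside_model_range(surrogate_value, minimum, maximum):
    """True when the input would trigger the x-axis range warning."""
    return not (minimum <= surrogate_value <= maximum)


# Model statistics as read from the Figure 3 example
INTERCEPT = 0.042271
SLOPE = 0.970642
SURROGATE_MIN = 0.163864    # ug/L
SURROGATE_MAX = 17808.08    # ug/L

estimate = predict_toxicity(150.0, INTERCEPT, SLOPE)
needs_warning = outside_model_range(150.0, SURROGATE_MIN, SURROGATE_MAX)
print(f"Predicted acute toxicity: {estimate:.1f} ug/L")
```

With a surrogate value of 150 ug/L, the estimate agrees with the 142.71 ug/L shown on the calculator page to within rounding of the displayed coefficients.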
-------
II. The Species Sensitivity Distribution (SSD) Module
The SSD modules generate SSDs from Web-ICE toxicity values estimated from
one or more surrogate species (up to 25), which simultaneously estimate toxicity to all
possible predicted species with existing Web-ICE models. The SSD is initially generated
using all estimated toxicity values and the entered toxicity of the surrogate species. If
multiple surrogates are used and a predicted value is estimated for one of the surrogate
species, Web-ICE uses the entered value for that species and excludes the predicted
value(s) from the SSD. If more than one surrogate predicts a toxicity value to the same
species, Web-ICE includes only one in the SSD. The default is to use the predicted
value with the smallest confidence interval, but the user may select a different value, as
described below.
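The default selection rules in this paragraph can be expressed as a short Python sketch. The function name, the dictionary layout, and the numbers are hypothetical, and measuring the "smallest confidence interval" as upper limit minus lower limit on the untransformed scale is an assumption; the manual does not state how interval size is compared.

```python
def select_ssd_values(entered, predictions):
    """Choose one toxicity value per species for the SSD.

    entered: {species: toxicity entered by the user for a surrogate}
    predictions: list of dicts with keys species, surrogate, value,
    lower, upper (confidence limits of the predicted value).
    """
    selected = dict(entered)  # entered surrogate values always take priority
    best_width = {}
    for pred in predictions:
        species = pred["species"]
        if species in entered:
            continue  # predicted values for a surrogate species are excluded
        width = pred["upper"] - pred["lower"]
        if species not in best_width or width < best_width[species]:
            best_width[species] = width
            selected[species] = pred["value"]
    return selected


# Invented values, for illustration only
entered = {"Rainbow trout": 75.0, "Bluegill": 100.0}
predictions = [
    {"species": "Brown trout", "surrogate": "Rainbow trout",
     "value": 80.0, "lower": 60.0, "upper": 110.0},
    {"species": "Brown trout", "surrogate": "Bluegill",
     "value": 95.0, "lower": 40.0, "upper": 230.0},
    {"species": "Bluegill", "surrogate": "Rainbow trout",
     "value": 120.0, "lower": 90.0, "upper": 160.0},
]
ssd_values = select_ssd_values(entered, predictions)
```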
Web-ICE uses the SSD described by the logistic distribution function of de Zwart
(2002):
F(C) = 1 / (1 + exp((α - C) / β))
The log10-transformed environmental concentration (or dose) of the evaluated chemical
is represented by C, the parameter α is the sample mean of the log10-transformed
toxicity values, and β is defined as (√3/π) * σ, where σ is the standard deviation of the
log10-transformed toxicity values. The HC/HD level is the percentile of interest (e.g., 5th) of
the described distribution.
Corresponding SSDs are also developed from the upper and lower confidence
limits of the predicted toxicity values and are used to calculate the upper and lower
bounds of the HC/HD value at a given percentile (see footnote 1). For example, the lower bound of the
HC5 is calculated as the 5th percentile of the SSD developed from the estimated lower
confidence limit of each predicted toxicity value. Similarly, the upper bound of an HC5 is
calculated as the 5th percentile of the SSD developed from the estimated upper limit of
each predicted toxicity value.
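Solving the logistic distribution function for the concentration at cumulative probability p gives C = α + β * ln(p / (1 - p)), which is then back-transformed from the log10 scale. The Python sketch below applies this to a list of toxicity values and to the corresponding lower and upper confidence limits. The function name and the example numbers are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the manual does not say which form is used.

```python
import math
import statistics


def hazard_level(toxicity_values, percentile=5.0):
    """HC/HD at the given percentile of a log-logistic SSD."""
    logs = [math.log10(v) for v in toxicity_values]
    alpha = statistics.mean(logs)
    # Assumption: sample standard deviation (n - 1 denominator)
    beta = math.sqrt(3) / math.pi * statistics.stdev(logs)
    p = percentile / 100.0
    return 10 ** (alpha + beta * math.log(p / (1.0 - p)))


# Invented toxicity values (ug/L) with their confidence limits
values = [10.0, 100.0, 1000.0]
lower_limits = [5.0, 60.0, 400.0]
upper_limits = [20.0, 170.0, 2500.0]

hc5 = hazard_level(values)
hc5_lower = hazard_level(lower_limits)   # lower bound of the HC5
hc5_upper = hazard_level(upper_limits)   # upper bound of the HC5
hc50 = hazard_level(values, percentile=50.0)
```

At the 50th percentile the logarithm term is zero, so the hazard level reduces to the geometric mean of the toxicity values, which is a convenient sanity check.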
Generating an SSD:
1. Under the SSD module, select either Aquatic or Wildlife.
2. On the SSD taxa selection page, select your surrogate species from the drop
down menu and click Add to add the species as a surrogate. There are separate
drop down menus for vertebrates/invertebrates and algae; however if both are
selected they will be combined into the same SSD. A maximum of 25 total
surrogates can be selected (Figure 4).
Footnote 1: The standard approach to calculating confidence intervals of the logistic distribution (de Zwart
2002) could not be computed using the JavaScript-based platform of Web-ICE. The approach applied here uses the
uncertainty in model predictions to estimate the uncertainty of the HC5 estimation. Large or unrealistic confidence
intervals may result if toxicity values with exceptionally large confidence intervals are included in the SSD.
-------
3. To remove a surrogate species from the list after it is added, click Remove next to
the species name.
4. Enter the known toxicity for each surrogate species, then click Calculate SSD.
Figure 4. SSD taxa selection page.
Working with the SSD Output
1. On the SSD output page, the HC/HD level may be changed from the drop down
box. The hazard level is automatically recalculated if the level is changed. The
default is the HC/HD5 (Figure 5A).
2. If multiple surrogates predict to the same species, all predicted values are shown,
but only one can be included in the SSD. By default, Web-ICE includes the one
with the smallest confidence interval, but this can be changed by the user by
selecting the radio button of the desired value (Figure 5B). The HC/HD value is
automatically recalculated.
3. The warning "Input toxicity is greater (less) than model maximum (minimum)"
indicates if a predicted value was generated from a surrogate species toxicity
value that was outside the range of toxicity values used to generate that model.
-------
4. The user can unmark the box to the left of a predicted species to exclude it
entirely from the SSD, which is automatically recalculated. (NOTE: See Selecting
Predicted Toxicity Values for SSDs in the Guidance for Model Selection and Use
section below for guidance on removing estimated toxicity values).
5. The estimated toxicity values may be sorted by a column of interest by selecting
the sort tab below the heading.
6. Predicted values can be filtered by entering desired ranges for the lower and
upper bounds for degrees of freedom, R2, p-value, mean square error, cross-
validation success rate, taxonomic distance, slope, or intercept in the Data Filters
box (Figure 5C). Open ended ranges are allowed by only inputting a lower or
upper limit.
7. The user can generate an Excel-friendly output for either all predicted toxicity
values or just the data selected by the radio buttons for inclusion in the SSD by
selecting the desired Provide Copy-friendly Output tab (Figure 5D).
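The Data Filters behavior, including open-ended ranges, can be mimicked on exported values with a few lines of Python. This sketch is not the Web-ICE implementation: the function name and dictionary keys are hypothetical, the statistics shown are invented, and treating the limits as inclusive is an assumption.

```python
def passes_filters(model, filters):
    """Check one predicted value's model statistics against range filters.

    filters: {statistic name: (lower, upper)}; use None for an
    open-ended limit. Limits are treated as inclusive (assumption).
    """
    for statistic, (lower, upper) in filters.items():
        value = model[statistic]
        if lower is not None and value < lower:
            return False
        if upper is not None and value > upper:
            return False
    return True


# Invented model statistics, for illustration only
models = [
    {"taxon": "Species A", "df": 17, "r2": 0.96, "tax_distance": 2},
    {"taxon": "Species B", "df": 2, "r2": 0.96, "tax_distance": 4},
    {"taxon": "Species C", "df": 9, "r2": 0.55, "tax_distance": 3},
]
filters = {
    "df": (8, None),            # lower limit only
    "tax_distance": (None, 3),  # upper limit only
}
kept = [m["taxon"] for m in models if passes_filters(m, filters)]
```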
[Figure 5 screenshot: SSD output page. Panels referenced in the steps above: A, hazard
level (HC/HD) selection; B, radio buttons for choosing among multiple predicted values
for a species; C, Data Filters box; D, Provide Copy-friendly Output tabs.]
-------
III. The Endangered Species Module
Producing an Endangered Species Toxicity Report
1. Under the Endangered Species module, select either Aquatic or Wildlife.
2. On the taxa selection page, select either the broad taxa of interest (e.g., all
species, fishes) or a particular species of interest from the drop down menu
(Figure 6).
3. Select your surrogate species from the drop down menu and click Add to add the
species as a surrogate. A maximum of 25 species can be selected.
4. To remove a surrogate species from the list after it is added, click Remove next to
the species name.
5. Enter the toxicity for each surrogate species, then click Calculate.
6. The output page provides the estimated toxicity for each predicted taxon, the
model level (e.g., species), surrogate, and model information (Figure 7).
7. The user may sort the ICE-estimated toxicity values by each column by selecting
the sort tab below the column heading.
8. Predicted values can be filtered by entering desired ranges for the lower and
upper bounds for degrees of freedom, R2, p-value, mean square error, cross-
validation success rate, taxonomic distance, slope, or intercept in the Data Filters
box. Open ended ranges are allowed by only inputting a lower or upper limit.
9. The user can generate an Excel-friendly output by clicking on the Provide Copy-
friendly Output option.
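The batch processing performed by this module amounts to applying every available model whose surrogate has an entered toxicity value. The Python sketch below illustrates the idea with a single model whose rounded slope and intercept are taken, as best they can be read, from the example output page for this module (Fathead minnow to Shortnose sturgeon, input 50 ug/L). The function name and data layout are hypothetical.

```python
import math


def endangered_species_report(models, surrogate_toxicity):
    """Estimate toxicity for every model whose surrogate has an entered value.

    models: list of dicts with keys taxon, level, surrogate, intercept, slope
    surrogate_toxicity: {surrogate name: entered toxicity}
    """
    report = []
    for model in models:
        entered = surrogate_toxicity.get(model["surrogate"])
        if entered is None:
            continue  # no toxicity value entered for this surrogate
        estimate = 10 ** (model["intercept"]
                          + model["slope"] * math.log10(entered))
        report.append({
            "taxon": model["taxon"],
            "level": model["level"],
            "surrogate": model["surrogate"],
            "estimate": estimate,
        })
    # Most sensitive predicted taxon first
    return sorted(report, key=lambda row: row["estimate"])


# Coefficients are the rounded values displayed on the example output page
models = [
    {"taxon": "Shortnose sturgeon", "level": "species",
     "surrogate": "Fathead minnow", "intercept": -0.77, "slope": 1.05},
]
entered = {"Fathead minnow": 50.0, "Rainbow trout": 75.0}
report = endangered_species_report(models, entered)
```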
-------
Figure 6. Taxa selection page of endangered species module.
[Figure 7 screenshot: Endangered Species output page for aquatic species, with
surrogates Fathead minnow (Pimephales promelas) and Rainbow trout (Oncorhynchus
mykiss) and input toxicities of 50 and 75 ug/L. The page shows the filter box (upper
and lower limits), the Provide Copy-friendly Output option, and a sortable table with
columns Predicted Taxa, Model Level, Surrogate, Estimated Toxicity, 95% Confidence
Intervals, df (N-2), R2, p-value, Mean Square Error (MSE), Cross-validation Success (%),
Tax. Distance, Slope, and Intercept. The legible first row reads: Shortnose sturgeon
(Acipenser brevirostrum); species; Fathead minnow (Pimephales promelas); about
10.3 ug/L; 1.85 - 57.46; df 2; R2 0.96; p-value 0.0196; MSE 0.05; cross-validation
success 75.00; taxonomic distance 4; slope 1.05; intercept -0.77.]
-------
IV. Accessing Model Data & Chemical Information
A list of chemicals in the aquatic and wildlife databases is available for download
using the Chemicals in Aquatic and Chemicals in Wildlife links. In the Chemicals in
Aquatic file, the chemical CAS number and associated toxicity values used in each
model are provided. The Chemicals in Wildlife file contains the number of species
present for each chemical. The acute data used to develop the ICE models for wildlife
and algae are not available due to proprietary rights of some information.
Models for all Web-ICE aquatic and wildlife modules are available as a
downloadable Microsoft Excel® spreadsheet on the Download Model Data page. The
data spreadsheets include model parameters (R2, p-value, df, intercept, slope, standard
error of the slope, Sxx, and MSE), general model information (taxonomic distance,
cross-validation success rate), descriptive statistics (average, minimum, and maximum
values of the surrogate species), and critical t-values used to calculate 90, 95, and 99%
confidence intervals (t90, t95, t99). These spreadsheets provide all of the information
that is needed to generate Web-ICE toxicity estimates and confidence intervals, as well
as facilitate the selection of the most robust models.
Using model data provided, users may calculate toxicity as:
Predicted toxicity = 10^(intercept + slope * log10(surrogate toxicity))
And confidence intervals as:
Lower bound = 10^(log10(predicted) - t(1-α) * sqrt[MSE * (1/n + (log10(x) - x.ave)^2 / Sxx)])
Upper bound = 10^(log10(predicted) + t(1-α) * sqrt[MSE * (1/n + (log10(x) - x.ave)^2 / Sxx)])
where x is the untransformed value of surrogate toxicity, x.ave is the average value of
log-transformed surrogate toxicity values, Sxx is the sum of squared deviations of the
log-transformed surrogate values from x.ave, MSE is the mean square error, n is the
sample size (df + 2), and t(1-α) is the value of
the t distribution corresponding to the desired level of confidence (i.e., 90, 95, 99%).
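These equations can be checked against the Calculator page example in Figure 3 with a short Python sketch. The function name is illustrative. Two inputs are assumptions rather than values taken directly from the spreadsheets: the critical value 2.110 is the standard two-sided 95% t value for 17 degrees of freedom, and x.ave is taken as log10 of the displayed surrogate average (119.80).

```python
import math


def confidence_limits(x, intercept, slope, mse, n, x_ave, sxx, t_crit):
    """Return (predicted, lower, upper) on the untransformed scale."""
    log_x = math.log10(x)
    log_pred = intercept + slope * log_x
    half_width = t_crit * math.sqrt(mse * (1.0 / n + (log_x - x_ave) ** 2 / sxx))
    return (10 ** log_pred,
            10 ** (log_pred - half_width),
            10 ** (log_pred + half_width))


# Model statistics as read from the Figure 3 example
predicted, lower, upper = confidence_limits(
    x=150.0,                    # surrogate toxicity, ug/L
    intercept=0.042271,
    slope=0.970642,
    mse=0.079728,
    n=19,                       # degrees of freedom (17) + 2
    x_ave=math.log10(119.80),   # assumption: log10 of displayed average
    sxx=38.80,
    t_crit=2.110,               # t table value, 95% confidence, df = 17
)
print(f"{predicted:.1f} ug/L ({lower:.1f} to {upper:.1f})")
```

Under these assumptions the result agrees with the predicted value and the lower and upper limits shown in Figure 3 (142.71, 104.10, and 195.65 ug/L) to within rounding of the displayed statistics.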
Mode of Action (MOA)-specific models
ICE models have been developed using chemicals of just one MOA and are
provided on the Download Model Data page. These models may be used to improve
predictions of models with large taxonomic distance, but may offer limited improvement
of predictions for species pairs that are closely related (Raimondo et al. 2010). The suite
of MOA-specific models differs from models developed using all MOAs and may include
some models for species pairs that were not significant using all data, or may not
include models for species pairs that were developed using all chemicals. Currently,
MOA-specific models are not accessible with the Web-ICE user interface, limiting the
use of these models to calculations performed from data in the spreadsheets external to
Web-ICE.
-------
Guidance for Model Selection and Use
I. Statistical Definitions
Several statistics are provided with the models and may be used to evaluate the
accuracy and precision of the estimated value. The following provides basic definitions
of model statistics:
Intercept - The log10 value of the predicted taxon toxicity when the log10 of the
surrogate species toxicity is zero.
Slope - The regression coefficient represents the change in log10 value of the
predicted taxon toxicity for every unit change in log10 value of the surrogate species
toxicity.
Degrees of Freedom (df, N - 2) - The number of data points used to build the
model minus two. The df is related to statistical power; in general, the higher the
df, the more information is used to develop the model.
R2 - The proportion of the data variance that is explained by the model. The
closer the R2 value is to one, the more robust the model is in describing the
relationship between the predicted and surrogate taxa.
p-value - The significance level of the linear association: the probability of
observing an association at least this strong if there were no true linear
relationship. Models with lower p-values
are more significant. Model p-values of < 0.0001 are reported as 0.00000.
Average value of the surrogate - The average of all toxicity values for the
surrogate species used to develop the model. The first number is the actual value
and the number in parentheses is the log-transformed value.
Minimum value of the surrogate - The lowest toxicity value for the surrogate
species used to develop the model. The first number is the actual value and the
number in parentheses is the log-transformed value.
Maximum value of the surrogate - The largest toxicity value for the surrogate
species used to develop the model. The first number is the actual value and the
number in parentheses is the log-transformed value.
Mean Square Error (MSE) - An unbiased estimator of the variance of the
regression line.
Sum of Squares (Sxx) - Sum of squared deviations of the log-transformed surrogate
toxicity values from their mean.
Cross-validation Success - The percentage of removed data points that were
predicted within 5-fold of the actual value in the leave-one-out cross-validation.
Models with a cross-validation success of "na" are those that either had df = 1 or
where no significant models were developed when data points were removed.
Higher cross-validation success is an indication of greater model robustness.
Taxonomic Distance - The taxonomic relationship between the surrogate and
predicted taxa. Two taxa within the same genus have taxonomic distance of 1;
within the same family = 2; within the same order = 3; within the same class = 4;
within the same phylum = 5; within the same kingdom = 6; across kingdoms = 7
(algal models only, plants vs. cyanobacteria).
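To make the regression definitions above concrete, the following Python sketch computes several of these statistics from paired toxicity values using ordinary least squares on log10-transformed data. The function name and the input values are hypothetical, and the code is a minimal illustration rather than the procedure used to build the Web-ICE models; the p-value is omitted because it requires the F or t distribution.

```python
import math

def ice_model_statistics(surrogate_tox, predicted_tox):
    """Least-squares statistics for log10(predicted) vs. log10(surrogate)."""
    x = [math.log10(v) for v in surrogate_tox]
    y = [math.log10(v) for v in predicted_tox]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)   # Sum of Squares (Sxx)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    syy = sum((yi - y_mean) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares about the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2                                   # Degrees of Freedom (N - 2)
    return {
        "intercept": intercept,
        "slope": slope,
        "df": df,
        "r2": 1.0 - sse / syy,
        "mse": sse / df,
        "sxx": sxx,
        "surrogate_min": min(surrogate_tox),
        "surrogate_max": max(surrogate_tox),
    }

# Hypothetical paired toxicity values for six chemicals (same units).
stats = ice_model_statistics(
    [1.2, 8.5, 40.0, 150.0, 900.0, 4000.0],
    [2.0, 11.0, 95.0, 210.0, 1500.0, 5200.0],
)
for name, value in stats.items():
    print(f"{name}: {value:.4g}")
```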
II. Selecting a Model with Low Uncertainty
Rules of Thumb
Model attributes, such as taxonomic distance of the predicted and surrogate
species, model parameters, and cross-validation success rate, should be used to select
models with low uncertainty. The following criteria should be used as a guide to select
more accurate predictions. These values are intended to be for guidance purposes only;
predicted values should be evaluated holistically using best professional judgement.
1. Relatively low mean square error (MSE) (<0.22)
2. Close taxonomic distance (≤ 3)
3. High cross-validation success rate (> 85%)
4. Narrow confidence intervals
5. High degrees of freedom (df > 8, N > 10)
6. High R2 value (> 0.6)
7. Low p-values (< 0.01)
The best estimations generally occur for surrogate and predicted taxa that are
within the same genus, family, or order and for models with MSE < 0.22 (Raimondo et
al. 2007, 2010). In general, models with more df have greater statistical power and
choosing a model with df greater than 8 is recommended to reduce model uncertainty. A
priori power analysis determined that linear models with df > 8 have enough statistical
power (1-β > 0.8) to sufficiently increase the chance of finding a significant relationship
within the data. It is also recommended to choose models with p-values < 0.01 to further
reduce the chance of Type I errors in the toxicity estimations.
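The numeric Rules of Thumb can be screened programmatically, as in the following Python sketch. The function name and the example model values are hypothetical. Taxonomic distance is treated as acceptable when the taxa share at least an order (distance of 3 or less), following the paragraph above, and confidence interval width is left out because it depends on the entered surrogate value. The result is a screening aid only and does not replace professional judgement.

```python
def rules_of_thumb(mse, taxonomic_distance, cv_success, df, r2, p_value):
    """Return a dict of pass/fail flags for the numeric guidance criteria.

    cv_success is a percentage, or None when it is reported as "na".
    """
    return {
        "mse": mse < 0.22,
        "taxonomic_distance": taxonomic_distance <= 3,
        "cross_validation": cv_success is not None and cv_success > 85,
        "df": df > 8,
        "r2": r2 > 0.6,
        "p_value": p_value < 0.01,
    }

# Hypothetical model attributes.
checks = rules_of_thumb(
    mse=0.15, taxonomic_distance=4, cv_success=91.0,
    df=135, r2=0.79, p_value=0.0001,
)
failed = [name for name, ok in checks.items() if not ok]
print("criteria not met:", failed or "none")
```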
Cross-validation success rate is a conservative estimate of model uncertainty and
should not be interpreted as an exact estimate of model error. Cross-validation removes
a data point from the original model, potentially causing a large change in models
developed from small datasets. Due to changes in a model (i.e., reduced df, altered
slope/intercept) during this validation process, cross-validation success rate should be
considered only an estimate of generalization error. Particularly for models built from
small datasets, actual error can be expected to be lower than cross-validation error.
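The leave-one-out procedure can be sketched in Python as follows. This is a simplified illustration: the function names are hypothetical, the significance test applied to each refitted model is omitted, and the surrogate values are assumed not to be all identical. A removed data point counts as a success when the refitted model predicts it within 5-fold of its actual value.

```python
import math

def fit_line(x, y):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def cross_validation_success(surrogate_tox, predicted_tox, fold=5.0):
    """Percentage of held-out points predicted within `fold` of the actual value.

    Returns None (reported as "na") when the full model has df = 1 or less,
    because the refitted models would have no degrees of freedom left.
    """
    x = [math.log10(v) for v in surrogate_tox]
    y = [math.log10(v) for v in predicted_tox]
    n = len(x)
    if n - 2 <= 1:
        return None
    hits = 0
    for i in range(n):
        # Refit the model without data point i, then predict point i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        y_hat = intercept + slope * x[i]
        if abs(y_hat - y[i]) <= math.log10(fold):
            hits += 1
    return 100.0 * hits / n

# Hypothetical paired toxicity values.
rate = cross_validation_success(
    [1.2, 8.5, 40.0, 150.0, 900.0, 4000.0],
    [2.0, 11.0, 95.0, 210.0, 1500.0, 5200.0],
)
print(f"cross-validation success: {rate:.0f}%")
```

Each refit uses one fewer data point than the full model, which is why the success rate for small datasets can understate the accuracy of the full model.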
-------
Surrogate Species Selection: An Example
In an example of how to select a suitable model, Raimondo et al. (2007) outlined
a selection procedure to find an appropriate surrogate species for estimating the toxicity
of a chemical to red-winged blackbird. In the example, toxicity for the chemical of
interest was available for northern bobwhite, mallard, Japanese quail, fulvous whistling
duck, common grackle, and house sparrow, making them all potential surrogates. The
common grackle and house sparrow have the closest taxonomic distances (2, same
family, and 3, same order, respectively); the other potential surrogates in this example
have a taxonomic distance of 4 (same class). The grackle and house sparrow models have
similar MSE (~0.13); however, the house sparrow model has a higher R2 (0.84), higher
cross-validation success rate (95%), and greater degrees of freedom (107), making house
sparrow the best surrogate for red-winged blackbird in this example. The grackle would also provide good surrogacy,
with high R2 (0.65), high cross-validation success rate (93%) and good degrees of
freedom (54). If neither of these species were available surrogates, Japanese quail (R2
= 0.79, MSE = 0.15, df = 135, cross-validation success rate = 91%) would be the next
best surrogate, followed by northern bobwhite (R2 = 0.63, MSE = 0.23, df = 45,
cross-validation success rate = 85%) and mallard (R2 = 0.48, MSE = 0.34, df = 80,
cross-validation success rate = 79%). Although fulvous whistling duck has the highest model
R2, its low degrees of freedom (df = 2) and comparatively higher MSE (0.30) make it a
less suitable surrogate than the other species.
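The taxonomic distances quoted in this example follow directly from the rank at which two lineages first match. The Python sketch below illustrates the rule using commonly accepted classifications for three of the species; the function name and data layout are hypothetical, and the classification used within Web-ICE may differ in detail.

```python
# Ranks ordered from most to least specific; the distance is the position
# (starting at 1) of the first rank the two taxa share.
RANKS = ["genus", "family", "order", "class", "phylum", "kingdom"]

def taxonomic_distance(taxon_a, taxon_b):
    """Return 1 (same genus) through 6 (same kingdom), or 7 across kingdoms."""
    for distance, rank in enumerate(RANKS, start=1):
        if taxon_a[rank] == taxon_b[rank]:
            return distance
    return 7

def bird(genus, family, order):
    """Helper for this example: all species here are birds."""
    return {"genus": genus, "family": family, "order": order,
            "class": "Aves", "phylum": "Chordata", "kingdom": "Animalia"}

red_winged_blackbird = bird("Agelaius", "Icteridae", "Passeriformes")
candidates = {
    "common grackle": bird("Quiscalus", "Icteridae", "Passeriformes"),
    "house sparrow": bird("Passer", "Passeridae", "Passeriformes"),
    "mallard": bird("Anas", "Anatidae", "Anseriformes"),
}

distances = {
    name: taxonomic_distance(red_winged_blackbird, lineage)
    for name, lineage in candidates.items()
}
print(distances)
```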
III. Evaluating Model Predictions
Uncertainty of model predictions may be evaluated by assessing (1) the
characteristics of the model used in the predictions, and (2) the value of the input data
relative to the data used to generate the model. The former was discussed in the
previous section and the Rules of Thumb should be followed to ensure high confidence
in model selection. Even for robust models, however, model uncertainty increases
outside the range of surrogate species toxicity values that were used to develop the
model.
Uncertainty may be evaluated by reviewing the confidence intervals calculated
with the predicted value. Narrow confidence intervals represent higher confidence that
the model fits through the range of datapoints for the entered surrogate species toxicity.
If the surrogate toxicity value entered into an ICE model is outside the range of
surrogate toxicity data used to generate the model, the warning "This value is outside
the x-axis range for this model. Continue?" will appear to alert the user. This warning
alone does not indicate low confidence in the model estimate, but should be used in
conjunction with the calculated confidence intervals to evaluate the model prediction.
For example, if the upper and lower bounds of the confidence interval are several orders
of magnitude from the predicted value, caution should be used in applying the ICE
estimate in risk assessment.
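The two checks described above can be combined in a short screening routine, sketched here in Python. The function name, the input values, and the `max_orders` threshold are hypothetical; the manual does not define a numeric cutoff for "several orders of magnitude", so the default of 2 is an arbitrary illustration.

```python
import math

def evaluate_prediction(input_tox, surrogate_min, surrogate_max,
                        predicted, ci_lower, ci_upper, max_orders=2.0):
    """Flag out-of-range inputs and wide confidence intervals."""
    outside_range = not (surrogate_min <= input_tox <= surrogate_max)
    # Distance of each confidence bound from the prediction, in orders
    # of magnitude (log10 units).
    orders_below = math.log10(predicted / ci_lower)
    orders_above = math.log10(ci_upper / predicted)
    return {
        "outside_range": outside_range,
        "orders_below": orders_below,
        "orders_above": orders_above,
        "wide_interval": max(orders_below, orders_above) >= max_orders,
    }

# Hypothetical prediction: the entered value exceeds the model maximum and
# the interval spans three orders of magnitude on either side.
result = evaluate_prediction(
    input_tox=5000.0, surrogate_min=1.0, surrogate_max=1000.0,
    predicted=100.0, ci_lower=0.1, ci_upper=100000.0,
)
if result["outside_range"] and result["wide_interval"]:
    print("Use caution: input is outside the model range and the interval is wide.")
```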
-------
IV. Selecting Predicted Toxicity Values for SSDs
The SSD modules of Web-ICE automatically predict toxicity values from all
available models for the selected surrogate species simultaneously. The user has the
discretion to remove predicted toxicity values from the SSD, either to customize the SSD
for particular taxa (e.g., birds only, fish only) or to exclude predicted toxicity values
with large confidence intervals. If an estimated toxicity value was derived from an input
value that was outside of the range of surrogate species data used to generate the
model from which it was predicted, a warning appears next to the value indicating the
maximum or minimum value of the model. This warning alone does not indicate low
confidence in the model estimate, but should be used in conjunction with the calculated
confidence intervals to evaluate the model prediction.
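The kind of screening described above could be applied to exported predictions as in the following Python sketch. The record layout, species names, values, and the confidence-interval ratio used as a cutoff are all hypothetical; Web-ICE itself leaves the choice of which values to remove to the user.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    species: str
    group: str        # e.g., "bird", "mammal", "fish"
    value: float      # predicted toxicity
    ci_lower: float
    ci_upper: float

def filter_for_ssd(predictions, groups=None, max_ci_ratio=None):
    """Keep predictions in the chosen groups whose upper/lower CI ratio
    does not exceed max_ci_ratio. A None argument disables that filter."""
    kept = []
    for p in predictions:
        if groups is not None and p.group not in groups:
            continue
        if max_ci_ratio is not None and p.ci_upper / p.ci_lower > max_ci_ratio:
            continue
        kept.append(p)
    return kept

# Hypothetical predictions for a single surrogate input.
predictions = [
    Prediction("Species A", "bird", 12.0, 6.0, 24.0),
    Prediction("Species B", "bird", 30.0, 0.03, 30000.0),
    Prediction("Species C", "mammal", 85.0, 40.0, 180.0),
]

birds_only = filter_for_ssd(predictions, groups={"bird"})
narrow_only = filter_for_ssd(predictions, max_ci_ratio=100.0)
print([p.species for p in birds_only], [p.species for p in narrow_only])
```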
V. Applying Web-ICE in Ecological Risk Assessment (ERA)
Web-ICE was developed to support both chemical hazard assessment and
ecological risk assessment (ERA) by providing a method to estimate acute toxicity to
specific taxa, such as endangered species, or to a larger number of taxa (species,
genera, families) with known uncertainty. Potential applications of acute toxicity values
generated by Web-ICE include the problem formulation phase of an ERA to screen for
contaminants of potential concern and in the analysis phase to characterize effects to a
larger number of species. The estimation of species-specific toxicity values using Web-
ICE is recommended as an alternative to safety factors typically applied when
extrapolating toxicity or risks to taxa without chemical and species-specific toxicity data.
Another potential application of the chemical and taxon-specific acute toxicity estimates
generated from ICE models includes input into existing exposure and risk models (e.g.,
TREX; EPA 2005). Web-ICE generated toxicity values may also be used in the analysis
of uncertainty and variability in toxicity to ecological receptors in both screening level
and baseline or Tier II ERAs.
In addition to taxa-specific ICE models, Web-ICE can be used to generate SSDs
and estimated 1st, 5th or 10th percentile values of the cumulative distribution of species-
specific toxicity values. These percentile values, expressed as the hazard concentration
(e.g., HC5) or hazardous dose (e.g., HD5), provide an estimate of toxicity at a
prescribed level of species protection with known uncertainty. Hazard concentrations
could be used in ERA in place of species-specific toxicity values or as a component of
the uncertainty analysis (Dyer et al. 2008, Awkerman et al. 2008, 2009, Barron et al.
2012).
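As a rough illustration of how a percentile of an SSD yields a hazard concentration, the Python sketch below fits a normal distribution to log10-transformed toxicity values and reads off the chosen percentile. The log-normal form is an assumption made for this example and is not necessarily the distribution fitted by the Web-ICE SSD module; the function name and the toxicity values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def hazard_concentration(toxicity_values, percentile=5.0):
    """Estimate the given percentile (e.g., HC5) of a log-normal SSD."""
    logs = [math.log10(v) for v in toxicity_values]
    # Fit by the sample mean and sample standard deviation of log10 values.
    ssd = NormalDist(mean(logs), stdev(logs))
    return 10 ** ssd.inv_cdf(percentile / 100.0)

# Hypothetical species-specific toxicity values (same units).
values = [3.2, 8.0, 15.0, 42.0, 110.0, 260.0, 900.0]
hc1 = hazard_concentration(values, 1.0)
hc5 = hazard_concentration(values, 5.0)
hc10 = hazard_concentration(values, 10.0)
print(f"HC1 = {hc1:.2f}, HC5 = {hc5:.2f}, HC10 = {hc10:.2f}")
```

Lower percentiles correspond to more protective values, since a smaller fraction of species is expected to have toxicity values below the estimate.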
-------
Acknowledgements
For database development, the authors would like to thank Deborah Vivian (US EPA,
GED), Sonny Mayer (US EPA, retired), Thomas Steeger and Brian Montague (US EPA,
Office of Pesticide Programs), Don Rodier (US EPA, retired), Pierre Mineau, Alain Baril
and Brian Collins (National Wildlife Research Centre, Environment Canada), Chris
Russom and Teresa Norberg-King (US EPA, Mid-Continent Ecology Division),
Christopher Ingersoll and Ning Wang (Columbia Environmental Research Center, U.S.
Geological Survey), and Scott Dyer, Scott Belanger, and Jessica Brill (Procter and
Gamble). Special thanks to Wally Schwab and Derek Lane (Computer Sciences
Corporation) for constructing the website, and to Carl Litzinger (US EPA, Gulf Ecology
Division) and David Owens (Computer Sciences Corporation) for their facilitation of
website development. Also, thanks to our support personnel: Marion Marchetto, Alice
Watts, Nicole Allard, Christel Chancy, Anthony DiGirolamo, Laura Dobbins, Brandon
Jarvis, Sarah Kell, Nathan Lemoine, Cheryl McGill, Michael Norberg, and Hannah Rutter.
Peer review and beta testing of the website were contributed by Larry Goodman,
Michael Murrell, Raymond Wilhour, Susan Yee, Jill Awkerman, and Kimberly Nelson
(US EPA, Gulf Ecology Division), Rick Bennet and Dale Hoff (US EPA, Mid-Continent
Ecology Division), Glen Thursby (US EPA, Atlantic Ecology Division), and Anne
Fairbrother (Exponent).
-------
References
ASTM (American Society for Testing and Materials). 2007. Standard guide for
conducting acute toxicity tests with fishes, macroinvertebrates, and amphibians. E
729-96(2007). Philadelphia PA.
ASTM. 2011. Standard Guide for Conducting Static Toxicity Tests with Microalgae.
ASTM E1218 - 04e1. ASTM International, West Conshohocken, PA, 2006, DOI:
10.1520/E1218-04E01.
Asfaw, A., M. R. Ellersieck, and F. L. Mayer. 2003. Interspecies Correlation Estimations
(ICE) for acute toxicity to aquatic organisms and wildlife. II. User Manual and
Software. EPA/600/R-03/106. U.S. Environmental Protection Agency, National
Health and Environmental Effects Research Laboratory, Gulf Ecology Division, Gulf
Breeze, FL. 14 p.
Awkerman, J., S. Raimondo, and M.G. Barron. 2008. Development of Species
Sensitivity Distributions for wildlife using interspecies toxicity correlation models.
Environ. Sci. Technol. 42 (9): 3447-3452.
Awkerman, J., S. Raimondo, and M.G. Barron. 2009. Estimation of wildlife hazard levels
using interspecies correlation models and standard laboratory rodent toxicity data. J
Toxicol Environ Health, Part A. 72: 1604-1609.
Baril, A., B. Jobin, P. Mineau, and B. T. Collins. 1994. A consideration of inter-species
variability in the use of the median lethal dose (LD50) in avian risk assessment.
Technical Report No. 216. Canada Wildlife Service, Headquarters.
Barron, M. G., C. R. Jackson, and J. A. Awkerman. 2012. Evaluation of in silico development
of aquatic toxicity species sensitivity distributions. Aquat Toxicol. 116-117: 1-7.
De Zwart, D. 2002. Observed regularities in species sensitivity distributions for aquatic
species. In Species Sensitivity Distributions in Ecotoxicology, L. Posthuma, G.W.
Suter, T.P. Traas, Eds. Lewis Publishers, Boca Raton, FL. pp. 133-154.
Dyer, S. D., D. J. Versteeg, S. E. Belanger, J. G. Chaney, and F. L. Mayer. 2006.
Interspecies correlation estimates predict protective environmental concentrations.
Environ. Sci. Technol. 40: 3102-3111.
Dyer, S. D., D. J. Versteeg, S. E. Belanger, J. G. Chaney, S. Raimondo, and M. G.
Barron. 2008. Comparison of Species Sensitivity Distributions Derived from
Interspecies Correlation Models to Distributions used to Derive Water Quality
Criteria. Environ. Sci. Technol. 42: 3076-3083.
Fairbrother, A. 2008. Risk Management Safety Factor. In Encyclopedia of Ecology, vol.
4. S. E. Jorgensen and B. D. Fath (eds.). Elsevier publishing, pp. 3062-3068.
Hudson, R. H., R. K. Tucker, and M. A. Haegele. 1984. Handbook of toxicity of
pesticides to wildlife. U.S. Fish and Wildlife Service, Resource Publ. 153,
Washington D.C. 90 p.
Insightful. 2001. S-plus 6 Guide to Statistics. Volume 1. Insightful Corporation, Seattle,
WA.
Mineau, P., A. Baril, B. T. Collins, J. Duffe, G. Joerman, and R. Luttik. 2001. Pesticide
acute toxicity reference values for birds. Rev. Environ. Contam. Toxicol. 170: 13-74.
OECD (Organization for Economic Cooperation and Development). 1996. OECD
Guidelines for the Testing of Chemicals. Freshwater Alga and Cyanobacteria,
Growth Inhibition Test. Paris, France 26p.
Raimondo, S., P. Mineau, and M. G. Barron. 2007. Estimation of chemical toxicity in
wildlife species using interspecies correlation models. Environ. Sci. Technol. 41:
5888-5894.
Raimondo, S., C.R. Jackson, and M.G. Barron. 2010. Influence of taxonomic relatedness
and chemical mode of action in acute interspecies estimation models for aquatic
species. Environ Sci Technol. 44: 7711-7716.
Shafer, E. W. Jr. and W. A. Bowles Jr. 1985. Acute oral toxicity and repellency of 933
chemicals to house and deer mice. Arch. Environ. Contam. Toxicol. 14: 111-129.
Shafer, E. W. Jr. and W. A. Bowles Jr. 2004. Toxicity, repellency or phytotoxicity of 979
chemicals to birds, mammals and plants. Research Report No. 04-01. United States
Department of Agriculture, Fort Collins, CO. 118 p.
Shafer, E. W. Jr., W. A. Bowles Jr. and J. Hurlbut. 1983. The acute oral toxicity,
repellency and hazard potential of 998 chemicals to one or more species of wild and
domestic birds. Arch. Environ. Contam. Toxicol. 12: 355-382.
Smith, G. J. 1987. Pesticide use and toxicology in relation to wildlife: organophosphorus
and carbamate compounds. Resource Publication 170. United States Department of
the Interior, Washington, DC. 171 p.
US EPA (Environmental Protection Agency). 1986. Quality criteria for water. EPA 440/5-
86-001. Washington, DC.
US EPA. 1996a. Ecological Effects Test Guidelines. OPPTS 850.1075 Fish Acute
Toxicity Test, Freshwater and Marine. EPA 712-C-96-118. Washington DC.
US EPA. 1996b. Ecological Effects Test Guidelines OPPTS 850.5400, Algal Toxicity,
Tiers I and II. EPA712-C-96-164, 11p.
US EPA. 2005. TREX: Terrestrial Residue Exposure model. Office of Pesticide
Programs. U.S. Environmental Protection Agency.
http://www.epa.gov/oppefed1/models/terrestrial/trex_usersguide.htm#content4
-------
Appendix 1. Number of Models by Version
Web-ICE 3.2.1 - Release April 2013:
Database              Aquatic animals    Algae    Wildlife
Attributes
  Records                        5501     1647        4329
  Species                         180       69         156
  Chemicals                      1266      457         951
Number of models
  Species                         780       58         560
  Genus                           289       44           0
  Family                          374        0         292
Web-ICE 3.1 - Release January 2010:
Database              Aquatic animals*    Wildlife
Attributes
  Records                         5501        4329
  Species                          180         156
  Chemicals                       1266         951
Number of models
  Species                          780         560
  Genus                            289           0
  Family                           374         292
*Aquatic models were reduced between versions 2.0 and 3.1 due to increased data standardization
criteria between versions. Data standardization was increased to ensure model relationships were
reflective of inherent species sensitivity with minimal influence of extraneous variables (e.g. life stage,
test conditions). See the database documentation for details on standardization
(http://v26265ncay507.aa.ad.epa.gov/weblCE/ICE%20Aquatic%20DB%20documentation.pdf)
Web-ICE 2.0 - Release August 2007:
Database              Aquatic animals    Wildlife
Attributes
  Records                        4706        4329
  Species                         217         156
  Chemicals                       695         951
Number of models
  Species                        1074         560
  Genus                           481           0
  Family                          526         292
Web-ICE 1.1 - Release July 2007:
Database              Wildlife
Attributes
  Records                 4329
  Species                  156
  Chemicals                951
Number of models
  Species                  560
  Genus                      0
  Family                   292