EPA/600/R-07/071
August 2007

Web-based Interspecies Correlation Estimation (Web-ICE) for Acute Toxicity: User Manual
Version 2.0

http://www.epa.gov/ceampubl/fchain/webice/

Sandy Raimondo, Deborah N. Vivian, and Mace G. Barron

U.S. Environmental Protection Agency
Office of Research and Development
National Health and Environmental Effects Research Laboratory
Gulf Ecology Division
Gulf Breeze, FL 32561

Reference Web-ICE as: Raimondo, S., D.N. Vivian, and M.G. Barron. 2007. Web-based Interspecies Correlation Estimation (Web-ICE) for Acute Toxicity: User Manual. Version 2.0. EPA/600/R-07/071. Gulf Breeze, FL.

Disclaimer: The information in this document has been reviewed in accordance with U.S. Environmental Protection Agency policy and approved for publication. Approval does not signify that the content reflects the views of the Agency, nor does mention of trade names or products constitute endorsement or recommendation for use.

Contents

Abstract
Introduction
Methods
  I. Database Development
    Aquatic (Fish and Invertebrates)
    Wildlife (Birds and Mammals)
  II. Model Development
  III. Model Validation
Using the Web-ICE Program
  I. Working with Web-ICE Aquatic or Web-ICE Wildlife
    Selecting Model Taxa
    Estimating Toxicity
  II. The Species Sensitivity Distribution (SSD) Module
    Generating an SSD
  III. Accessing Model Data
Guidance for Model Selection and Use
  I. Statistical Definitions
  II. Selecting a Model with Low Uncertainty
    Rules of Thumb
    Surrogate Species Selection: An Example
  III. Evaluating Model Predictions
  IV. Selecting Predicted Toxicity Values for SSDs
  V. Applying Web-ICE in Ecological Risk Assessment (ERA)
Acknowledgements
References
Appendix
  I. List of Species in Aquatic Database
  II. List of Species in Wildlife Database

Abstract

Predictive toxicological models are integral to environmental risk assessment where data for most species are limited.
Web-based Interspecies Correlation Estimation (Web-ICE) models are least squares regressions that predict the acute toxicity (LC50/LD50) of a chemical to a species, genus, or family based on estimates of relative sensitivity between the taxon of interest and a surrogate species. Web-ICE includes a total of 2081 models for aquatic taxa and 852 models for wildlife taxa. For aquatic species within the same order, Web-ICE models predict within 5-fold and 10-fold of the actual value with 90% and 95% certainty, respectively. Overall for wildlife species, Web-ICE predicts toxicity within 5-fold of the actual value with 85% certainty and within 10-fold of the actual value with 95% certainty. For wildlife surrogate and predicted taxa within the same order, models predict within 5-fold and 10-fold of the actual value with 90% and 97% certainty, respectively. For both aquatic and wildlife taxa, model certainty decreases with increasing taxonomic distance.

Introduction

Information on acute toxicity to multiple species is needed to assess the risks to, and ensure the protection of, individuals, populations, and ecological communities. However, toxicity data are limited for the majority of species, while standard test species are generally data rich. To address data gaps in species sensitivity, the Interspecies Correlation Estimations (ICE) software package (version 1.0) was developed by the U.S. Environmental Protection Agency (US EPA) and collaborators to extrapolate acute toxicity to taxa with little or no acute toxicity data, including threatened and endangered species (Asfaw et al. 2003). Web-based Interspecies Correlation Estimations (Web-ICE) builds on the groundwork of the original ICE program (Asfaw et al. 2003) as an internet application that includes additional chemical and species toxicity data and therefore provides an increased number of interspecies correlations.
ICE models estimate the acute toxicity (LC50/LD50) of a chemical to a species, genus, or family with no test data (the predicted taxon) from the known toxicity of the chemical to a species with test data (the surrogate species). ICE models are least squares regressions of the relationship between surrogate and predicted taxon based on a database of acute toxicity values: median lethal water concentrations for aquatic species (LC50; µg/L) and median lethal oral doses for wildlife species (LD50; mg/kg body weight). ICE models can be used to estimate acute toxicity when a toxicity value for a specific chemical is available for a selected surrogate species, or can be estimated (e.g., by QSAR), and an ICE model exists between the taxa of interest (e.g., species-species; species-genus; species-family).

In addition to direct toxicity estimation from a surrogate species to predicted taxa, Web-ICE contains a Species Sensitivity Distribution (SSD) module that estimates the toxicity of all predicted species available for a common surrogate. In the SSD module, acute toxicity values generated by Web-ICE are expressed as a logistic cumulative probability distribution function to estimate an associated Hazardous Concentration (HC) or Hazardous Dose (HD) (Dyer et al. 2006). For example, the HC5 corresponds to the 5th percentile of the log-logistic species sensitivity distribution and is assumed to be protective of 95% of tested species. ICE-generated SSD hazard levels have been shown to be within an order of magnitude of measured HC5s (Dyer et al. 2006, Dyer et al. 2008) and HD5s (Awkerman et al. 2008) and provide additional information for ecological risk assessment.

This manual provides step-by-step instructions for using Web-ICE, as well as information on the expanded databases, model development, model validation, and user guidance on model selection and interpretation.
User guidelines outlined in the Guidance for Model Selection and Use section of this manual should be followed to ensure high confidence and low uncertainty in model predictions used in risk assessment.

Methods

I. Database Development

Aquatic (Fish and Invertebrates)

The aquatic database is in development and is currently composed of 4706 LC50/EC50 values for 217 species and 695 chemicals. The data were compiled from the published peer-reviewed literature and from databases compiled by the U.S. Geological Survey (Mayer and Ellersieck 1986) and the US EPA, including Mayer (1987), ECOTOX/AQUIRE (US EPA 2006), and the Office of Pesticide Programs (OPP) registrant database. All confidential data were censored before inclusion in the public domain database. Data were used only for tests that adhered to the standard acute toxicity test condition requirements of the American Society for Testing and Materials (ASTM 2002, and earlier editions). Data were standardized to similar test conditions and organism life stage to reduce variability. Selection criteria for aquatic test data were as follows:

- 96-hr LC50 data for fish and most invertebrates;
- 96-hr EC50 data for most molluscs;
- 48-hr EC50 data for daphnids;
- Technical chemicals or formulations > 90% active ingredient; and
- Water quality in accordance with ASTM standards (ASTM 2002).

Open-ended toxicity values (i.e., > 100 µg/L or < 100 µg/L), toxicity values for larval fish and shrimp, adult ("mature") fish, oysters, shrimp, and blue crabs, and duplicate records among multiple sources were not included in model development. When there was more than one toxicity value from multiple sources for a species and chemical, the geometric mean of the values was used. In cases where the minimum and maximum values for a chemical and species differed by more than 10-fold, all data records for that chemical were removed for that species due to their high variability. Toxicity test values for metals (e.g.
copper) and pentachlorophenol were adjusted to 50 mg/L hardness and pH 6.8, respectively (US EPA 1986). The resulting aquatic database was used to develop models to predict toxicity to a species, genus, or family from a surrogate species (see Appendix I).

Wildlife (Birds and Mammals)

The wildlife database comprised 4329 acute, single, oral dose LD50 values (mg/kg body weight) for 156 species and 951 chemicals. The data were collected from the open literature (Hudson et al. 1984; Shafer and Bowles 1985, 2004; Shafer et al. 1983; Smith 1987) and from datasets compiled by governmental agencies of the United States (US EPA) and Canada (Environment Canada) (Baril et al. 1994; Mineau et al. 2001). Data were standardized by using only data for adult animals and data for chemicals of technical grade or formulations with > 90% active ingredient. Open-ended toxicity values (i.e., > 100 mg/kg or < 100 mg/kg) and duplicate records among multiple sources were not included in model development. When data were reported as a range (e.g., 100-200 mg/kg; Hudson et al. 1984) or data were collected from multiple sources for a species and chemical, the geometric mean of the values was used. In cases where the minimum and maximum values for a chemical and species differed by more than 10-fold, all data records for that chemical were removed for that species due to their high variability. Models derived from this wildlife database may be used to predict toxicity to a species or family from a surrogate species. Genus-level models were not developed from the wildlife database because few genera had two or more species (see Appendix II), which is a requirement for the development of higher taxa models.

II. Model Development

Models were developed using least squares methodology in which both variables are independent and subject to measurement error (Asfaw et al. 2003).
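The data handling described under Database Development (geometric mean of replicate values, and removal of species-chemical combinations whose values span more than 10-fold) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `summarize_records`, the tuple layout of the records, and the example species and chemical names are hypothetical and are not part of Web-ICE.

```python
import math
from collections import defaultdict


def summarize_records(records, max_fold_range=10.0):
    """Collapse replicate toxicity values to one value per (species, chemical).

    records: iterable of (species, chemical, toxicity_value) tuples
             (hypothetical layout; values must be positive).
    Returns a dict mapping (species, chemical) to the geometric mean.
    Combinations whose max/min ratio exceeds max_fold_range are dropped.
    """
    grouped = defaultdict(list)
    for species, chemical, value in records:
        grouped[(species, chemical)].append(value)

    summary = {}
    for key, values in grouped.items():
        # Drop highly variable records (range greater than 10-fold).
        if max(values) / min(values) > max_fold_range:
            continue
        # Geometric mean computed on the log10 scale.
        log_mean = sum(math.log10(v) for v in values) / len(values)
        summary[key] = 10 ** log_mean
    return summary


# Hypothetical example records (mg/kg body weight).
example = [
    ("mallard", "chemical A", 100.0),
    ("mallard", "chemical A", 400.0),
    ("mallard", "chemical B", 1.0),
    ("mallard", "chemical B", 50.0),  # 50-fold range: combination is dropped
]
cleaned = summarize_records(example)
```

In this sketch the two values for chemical A are averaged geometrically, while chemical B is excluded because its values differ by more than 10-fold.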
For species-level models developed from the aquatic and wildlife databases, an algorithm was written in S-plus (Insightful 2001) to pair every species with every other species by common chemical. Three or more common chemicals per pair were required for inclusion in the analysis. For each species pair, a linear model was used to calculate the regression equation

$$\log_{10}(\text{predicted toxicity}) = a + b \cdot \log_{10}(\text{surrogate toxicity}),$$

where a and b are the intercept and slope of the line, respectively. Genus-level (aquatic only) and family-level models were similarly developed by pairing each surrogate species with each genus or family by common chemical. Predicted genera and families required unique toxicity values for two or more species within the taxon. Toxicity values for the surrogate species were removed in cases where it was compared to its own genus or family. ICE models were only developed between two aquatic taxa or two wildlife taxa; there are no models to predict toxicity to an aquatic taxon from a wildlife species, or vice versa. Only models that had a significant relationship (p-value < 0.05) are included in Web-ICE. The following summarizes the number of significant models developed from the aquatic and wildlife databases for different taxonomic levels:

1) Aquatic species: 1074 models comparing 105 species to 105 species;
2) Aquatic genera: 481 models comparing 96 species to 33 genera;
3) Aquatic family: 526 models comparing 97 species to 32 families;
4) Wildlife species: 560 models comparing 49 species to 49 species;
5) Wildlife family: 292 models comparing 49 species to 16 families.

III. Model Validation

The uncertainty of each model was assessed using leave-one-out cross-validation (Insightful 2001). In this method, each pair of acute toxicity values for surrogate and predicted taxa was systematically removed from the original model.
The remaining data were used to rebuild the model, which was then used to estimate the removed predicted-taxon toxicity value from the corresponding surrogate species toxicity value. This method could only be used for models with degrees of freedom equal to or greater than 2 (N ≥ 4). To maintain uniformity among the large number of models contained within Web-ICE, the "N-fold" difference between each estimated and actual value was calculated and used to determine the fitness of the estimated toxicity value. For aquatic species, interlaboratory variation of acute toxicity test data for a given species and chemical can be as great as a 5-fold difference (Fairbrother 2008). For wildlife species, the average range of multiple toxicity measurements for a specific chemical and species was determined to be between 4.0 and 6.4 (Raimondo et al. 2007). Thus, a 5-fold difference was deemed a good fit in the validation analysis of both aquatic and wildlife models.

The cross-validation success rate was calculated for each model as the proportion of removed data points that were predicted within 5-fold of the actual value from models that were statistically significant. In cases where the removal of an x-y data pair resulted in a model that was not significant at the p < 0.05 level, these replicates were not included in the cross-validation success rate. This is because models that are not significant at the p < 0.05 level have a greater risk of Type I error. This was only the case for models with low degrees of freedom (< 8) and a p-value between 0.01 and 0.05 in the original model.

For wildlife species, cross-validation of models showed predicted toxicity values within 5-fold and 10-fold of the actual values with 85% and 95% certainty, respectively. There was a strong relationship between taxonomic distance and cross-validation success rate, with uncertainty increasing with larger taxonomic distance (Raimondo et al. 2007).
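The leave-one-out procedure and the 5-fold criterion described above can be illustrated with a short sketch. It is a simplified, assumption-laden version rather than the authors' S-plus implementation: the function names are hypothetical, ordinary least squares on log10 values is used, "within 5-fold" is taken to include exactly 5-fold, and the step that discards replicates whose rebuilt model is not significant at p < 0.05 is omitted for brevity.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (intercept, slope)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def fold_difference(estimated, actual):
    """N-fold difference between two positive values (always >= 1)."""
    return max(estimated / actual, actual / estimated)


def loo_success_rate(surrogate, predicted, max_fold=5.0):
    """Proportion of held-out points estimated within max_fold of the actual value.

    surrogate, predicted: paired untransformed toxicity values by common chemical.
    NOTE: the significance check on each rebuilt model is NOT implemented here.
    """
    if len(surrogate) < 4:
        raise ValueError("cross-validation requires df >= 2 (N >= 4)")
    log_x = [math.log10(v) for v in surrogate]
    log_y = [math.log10(v) for v in predicted]
    hits = 0
    for i in range(len(log_x)):
        # Rebuild the model without the i-th pair.
        xs = log_x[:i] + log_x[i + 1:]
        ys = log_y[:i] + log_y[i + 1:]
        intercept, slope = fit_line(xs, ys)
        estimate = 10 ** (intercept + slope * log_x[i])
        if fold_difference(estimate, predicted[i]) <= max_fold:
            hits += 1
    return hits / len(log_x)
```

Calling `loo_success_rate` with the paired toxicity values of one model returns a proportion between 0 and 1, which corresponds to the cross-validation success rate reported as a percentage in Web-ICE.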
Models predict within 5-fold and 10-fold of the actual value with 90% and 97% certainty for surrogate and predicted taxa within the same order; model certainty decreases with increasing taxonomic distance. A more detailed account of model uncertainty as it relates to chemical mode of action/class is given in Raimondo et al. (2007).

Using the Web-ICE Program

The primary component of Web-ICE (Web-ICE Modules) contains separate modules that predict acute toxicity to aquatic (vertebrates and invertebrates) species, genera, or families (ICE Aquatic) and wildlife (terrestrial birds and mammals) species or families (ICE Wildlife) (Figure 1). A secondary component, the Species Sensitivity Distribution Module, is available for aquatic and wildlife species (Figure 1). Each module is accessible from either the home page or the blue navigation bar along the left side of the page. Before working with a Web-ICE module, you must first decide whether you are going to work with aquatic or wildlife taxa; the program does not contain models that estimate wildlife toxicity from an aquatic surrogate, or vice versa.
Figure 1. Home page of the Web-ICE program, showing the ICE Aquatic (species, genus, family), ICE Wildlife (species, family), and Species Sensitivity Distribution modules.

I. Working with Web-ICE Aquatic or Web-ICE Wildlife

Selecting Model Taxa

1. From either the home page or the blue navigation bar, click the link for the module with which you will be working (Aquatic species, genus, or family; Wildlife species or family).
2. You will then be directed to a Taxa Selection Page (Figure 2), which will allow you to select your surrogate and predicted taxa for the model.
3. You may search for your surrogate and predicted taxa by either common name or scientific name by selecting the appropriate option in the Sort by: drop down menu. The default is set to common name.
4. From the drop down menus, select the surrogate species and predicted taxon. It does not matter which you select first; however, the second choice is limited to the models available for the taxon chosen first.
5. To change any of your selections, press Reset and start again.
6. Click Continue to be directed to the calculator page for toxicity estimation.

If there is not a model for your predicted species of interest, you will need to use a genus or family-level model to predict toxicity.
The available models may be determined by browsing through the genus (aquatics only) and family level modules, or by searching through the spreadsheets of model information available through the Download Model Data option on the blue navigation bar. The downloadable Microsoft Excel® spreadsheets provided for each Web-ICE module may be sorted by surrogate species or predicted taxa to identify available models.

Figure 2. Taxa selection page (Wildlife Species module shown).

Estimating Toxicity

The surrogate and predicted species selected from the previous page are listed at the top of a calculator page (Figure 3). This page is divided into four parts: input, calculated results, model statistics, and model graphic. The known toxicity for the surrogate species is entered under Surrogate Acute Toxicity, below which the desired confidence limits can be selected (Figure 3A). Predicted toxicity estimates are displayed under Predicted Acute Toxicity (Figure 3B). The bottom left side of the page contains the model statistics (Figure 3C). Please refer to the Statistical Definitions section of this manual for more specific information. The graph shows the data (LC50/LD50 values) used to develop the model (dots), the regression line (straight inner line), and 95% confidence intervals (curved outer lines) (Figure 3D). The surrogate and predicted taxa are labeled on the X and Y axes, respectively. Both the model statistics and the graph are unique for each model and will change for each surrogate species and predicted taxon.

1. Enter the acute toxicity value in the box located under Surrogate Acute Toxicity (Figure 3A).
2. Select your desired confidence interval (90, 95, or 99%) from the drop down menu located under Select Confidence Interval (Figure 3A). The default for the confidence intervals is 95%.
3. Press Calculate.
4.
The calculated values will appear in the three boxes labeled Predicted Acute Toxicity, Lower Limit, and Upper Limit (Figure 3B).
5. Log-transformed values of the surrogate and predicted toxicity values appear in parentheses next to the values.
6. If the entered surrogate toxicity value is outside the range of toxicity values used to develop the model, a pop-up with the warning "This value is outside the x-axis range for this model. Continue?" will appear. The user may select "OK" to proceed to calculate the toxicity value or select Cancel to enter another value.
7. To select a different model, either select the BACK button on the browser or select the link to the desired module in the blue navigation bar on the left side of the page.

Figure 3. Calculator page (example shown: surrogate species Japanese quail (Coturnix japonica); predicted species northern bobwhite (Colinus virginianus)).

II. The Species Sensitivity Distribution (SSD) Module

Species sensitivity distributions (SSDs) are probabilistic models that describe the sensitivity of biological species to a chemical. SSDs generated in Web-ICE are log-logistic cumulative distribution functions of toxicity values for multiple species (de Zwart 2002) and are used to estimate a hazard level (hazardous concentration (HC) or hazardous dose (HD)) that is protective of most test species (e.g., 95%) by estimating the concentration or dose at a corresponding percentile (e.g., 5th) of the distribution (Dyer et al. 2006).

The SSD modules for aquatic and wildlife species generate SSDs from Web-ICE estimated toxicity values and one or more known toxicity values for surrogate species. Toxicity values for one or more surrogate species are used to simultaneously estimate toxicity to all possible predicted species with existing Web-ICE models. The SSD is then generated using all estimated toxicity values and the entered toxicity of the surrogate species. Toxicity values for multiple surrogate species may be entered (Figure 4).
If more than one surrogate species estimates toxicity to the same predicted species, Web-ICE selects the toxicity value with the smallest confidence intervals. If multiple surrogates are used and a predicted value is estimated for one of the surrogate species, Web-ICE uses the entered value for that species and excludes the predicted value. An HC/HD level is automatically calculated from the distribution. The user can deselect toxicity values for predicted species that they wish to exclude from the SSD by clicking on the box to the left of the predicted species (Figure 5), and the associated HC/HD value is automatically recalculated. An HC/HD drop down menu on the output page allows the user to specify the hazard level to calculate. HC1/HD1 corresponds to the 1st percentile, HC5/HD5 corresponds to the 5th percentile, and HC10/HD10 corresponds to the 10th percentile. The default is set to HC5 for aquatic species and HD5 for wildlife species.

Web-ICE uses the SSD described by the logistic distribution function of de Zwart (2002):

$$F(C) = \frac{1}{1 + \exp\left((\alpha - C)/\beta\right)}$$

The log10-transformed environmental concentration (or dose) of the evaluated chemical is represented by $C$; the parameter $\alpha$ is the sample mean of the log10-transformed toxicity values; and $\beta$ is defined as $(\sqrt{3}/\pi)\,\sigma$, where $\sigma$ is the standard deviation of the log10-transformed toxicity values (de Zwart 2002). The HC/HD level is determined as the percentile of interest (e.g., 5th) of the described distribution. Corresponding SSDs are also developed from the upper and lower confidence limits of the predicted toxicity values and are used to calculate the upper and lower bounds of the HC/HD value at a given percentile. For example, the lower bound of the HC5 is calculated as the 5th percentile of the SSD developed from the estimated lower confidence limit of each predicted toxicity value.
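To make the hazard-level calculation concrete, the logistic SSD of de Zwart (2002) can be inverted for a chosen percentile p, which gives C = α + β·ln(p/(1 − p)) on the log10 scale. The sketch below is a minimal illustration, not the Web-ICE source code: the function names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the manual only says "standard deviation".

```python
import math
import statistics


def fit_ssd(toxicity_values):
    """Return (alpha, beta) of the logistic SSD on the log10 scale.

    alpha: mean of log10 toxicity values.
    beta:  sqrt(3)/pi * sigma, with sigma taken here as the SAMPLE standard
           deviation (an assumption; requires at least two values).
    """
    logs = [math.log10(v) for v in toxicity_values]
    alpha = statistics.mean(logs)
    sigma = statistics.stdev(logs)
    beta = math.sqrt(3) / math.pi * sigma
    return alpha, beta


def ssd_cdf(c, alpha, beta):
    """Logistic SSD: fraction of species affected at log10 concentration c."""
    return 1.0 / (1.0 + math.exp((alpha - c) / beta))


def hazard_level(toxicity_values, percentile=5.0):
    """HC/HD at the given percentile (e.g. 5 for HC5), in untransformed units."""
    alpha, beta = fit_ssd(toxicity_values)
    p = percentile / 100.0
    # Solve F(C) = p for C, then back-transform from log10.
    log_hc = alpha + beta * math.log(p / (1.0 - p))
    return 10 ** log_hc
```

Under these assumptions, passing the predicted values to `hazard_level` gives the HC/HD estimate, and passing the lower (or upper) confidence limits of the predicted values gives its lower (or upper) bound.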
Similarly, the upper bound of an HC5 is calculated as the 5th percentile of the SSD developed from the estimated upper limit of each predicted toxicity value.

Figure 4. SSD taxa selection page (example shown: Multiple Surrogate SSD for aquatic species with blue crab (Callinectes sapidus), channel catfish (Ictalurus punctatus), and rainbow trout (Oncorhynchus mykiss) as surrogates).

Figure 5. SSD output page (example shown: aquatic species with blue crab, channel catfish, and rainbow trout as surrogates).

Generating an SSD:

1. Under the SSD module, select either Aquatic or Wildlife.
2. On the SSD taxa selection page, select your surrogate species from the drop down menu and click Add to add the species as a surrogate.
3. If desired, select additional surrogate species from the drop down menu and click Add. A maximum of 25 species can be selected.
4. To remove a surrogate species from the list after it is added, click Remove next to the species name.
5. Enter the known toxicity for each surrogate species and click Calculate SSD.
6. On the SSD output page, the HC/HD level may be changed from the drop down box. The hazard level is automatically recalculated if the level is changed. The default is the HC5/HD5.
7. The warning "Input toxicity is greater (less) than model maximum (minimum)" indicates that a predicted value was generated from a surrogate species toxicity value that was outside the range of toxicity values used to generate that model.
8.
The user can unmark the box to the left of a predicted species to exclude it from the SSD, which is automatically recalculated. (NOTE: See Selecting Predicted Toxicity Values for SSDs in the Guidance for Model Selection and Use section below for guidance on removing estimated toxicity values.)
9. The drop down menu in the Show Data column provides additional model information (surrogate, taxonomic distance, cross-validation success rate, degrees of freedom, R², p-value, or mean square error) for the user to view.
10. The user may sort the ICE-estimated toxicity values by each column by selecting the sort tab below the column heading.

III. Accessing Model Data

Models for all Web-ICE aquatic and wildlife modules are available as downloadable Microsoft Excel® spreadsheets under the Download Model Data option on the blue navigation bar. The data spreadsheets include model parameters (R², p-value, df, intercept, slope, standard error of the slope, Sxx, and MSE), general model information (taxonomic distance, cross-validation success rate), descriptive statistics (average, minimum, and maximum values of the surrogate species), and critical t-values used to calculate 90, 95, and 99% confidence intervals (t90, t95, t99). These spreadsheets provide all of the information needed to generate Web-ICE toxicity estimates and confidence intervals, and they facilitate the selection of the most robust models. The raw data used to develop the ICE models are not available due to proprietary rights of some information. A list of chemicals in the aquatic and wildlife databases, with the number of species present for each chemical, is available for download using the Chemicals in Aquatic and Chemicals in Wildlife links.
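As a rough illustration of how the spreadsheet fields might be combined into a toxicity estimate with confidence limits, the sketch below follows the calculation given in this section. The function name, argument names, and all numeric values are hypothetical; n is taken as df + 2, which follows from the manual's definition df = N − 2.

```python
import math


def ice_estimate(surrogate_toxicity, intercept, slope, df, mse, sxx, x_ave, t_crit):
    """Return (predicted, lower, upper) toxicity in untransformed units.

    surrogate_toxicity: untransformed surrogate value (same units as the model)
    x_ave:  average of the log10-transformed surrogate values in the model
    t_crit: critical t-value for the chosen confidence level (t90, t95, or t99)
    """
    n = df + 2  # number of data points, since df = N - 2
    log_x = math.log10(surrogate_toxicity)
    log_pred = intercept + slope * log_x
    half_width = t_crit * math.sqrt(mse * (1.0 / n + (log_x - x_ave) ** 2 / sxx))
    return (
        10 ** log_pred,
        10 ** (log_pred - half_width),
        10 ** (log_pred + half_width),
    )


# Hypothetical model parameters, NOT taken from a real Web-ICE model.
predicted, lower, upper = ice_estimate(
    surrogate_toxicity=100.0,
    intercept=0.0,
    slope=1.0,
    df=8,
    mse=0.1,
    sxx=12.0,
    x_ave=2.0,
    t_crit=2.306,  # two-sided 95% critical value for 8 degrees of freedom
)
```

Because the interval is computed on the log10 scale and then back-transformed, the lower and upper limits are not symmetric around the predicted value in untransformed units.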
Using the model data provided, users may calculate toxicity as:

$$\text{Predicted toxicity} = 10^{\,\text{intercept} + \text{slope}\cdot\log_{10}(\text{surrogate toxicity})}$$

and confidence intervals as:

$$\text{Lower bound} = 10^{\,\log_{10}(\text{predicted}) - t_{1-\alpha}\sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{(\log_{10}(x) - \bar{x})^{2}}{S_{xx}}\right)}}$$

$$\text{Upper bound} = 10^{\,\log_{10}(\text{predicted}) + t_{1-\alpha}\sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{(\log_{10}(x) - \bar{x})^{2}}{S_{xx}}\right)}}$$

where $x$ is the untransformed value of surrogate toxicity, $\bar{x}$ (x.ave) is the average value of the log-transformed surrogate toxicity values, $n$ is the number of data points in the model (df + 2), $S_{xx}$ is the sum of squared deviations of the surrogate, MSE is the mean square error, and $t_{1-\alpha}$ is the value of the t distribution corresponding to the desired level of confidence (i.e., 90%, 95%, 99%).

Guidance for Model Selection and Use

I. Statistical Definitions

Several statistics are provided with each model and may be used to evaluate the accuracy and precision of the estimated value. These statistics are shown to the left of the graph on the calculator page (Figure 3C) and are provided in the spreadsheet of model information available in the Download Model Data option. The following provides a basic interpretation of model statistics to help guide users in model selection:

Intercept - The log10 value of the predicted taxon toxicity when the log10 of the surrogate species toxicity is 0.

Slope - The regression coefficient; represents the change in the log10 value of the predicted taxon toxicity for every unit change in the log10 value of the surrogate species toxicity.

Degrees of Freedom (df, N - 2) - Reflects the number of data points used to build the model. Degrees of freedom are related to statistical power; in general, the higher the degrees of freedom, the more robust the model.

R² - The proportion of the data variability that is explained by the model. The closer the R² value is to one, the more robust the model is in describing the relationship between the predicted and surrogate taxa.
p-value - The significance level of the linear association and the probability that the linear association was a result of random data. Models with lower p-values are more robust. Model p-values of < 0.00001 are reported as 0.00000.

Average value of the surrogate - The average of toxicity values for the surrogate species used in the model. The first number is the actual value and the number in parentheses is the log-transformed value.

Minimum value of the surrogate - The lowest toxicity value for the surrogate species used in the model. The first number is the actual value and the number in parentheses is the log-transformed value.

Maximum value of the surrogate - The largest toxicity value for the surrogate species used in the model. The first number is the actual value and the number in parentheses is the log-transformed value.

Mean Square Error (MSE) - An unbiased estimator of the variance of the regression line.

Sum of Squares (Sxx) - Sum of squared deviations of the surrogate.

Cross-validation Success - The percentage of removed data points that were predicted within 5-fold of the actual value. Models with a Cross-validation Success of "na" are those that either had df = 1 or for which no significant models were developed when data points were removed.

Taxonomic Distance - Describes the taxonomic relationship between the surrogate and predicted taxa. Two taxa within the same genus have a taxonomic distance of 1; within the same family = 2; within the same order = 3; within the same class = 4; within the same phylum = 5.

II. Selecting a Model with Low Uncertainty

Rules of Thumb

Model attributes, such as taxonomic distance of the predicted and surrogate species, model parameters (listed below), and cross-validation success rate, should be used to select models with low uncertainty. For best estimates, models should be selected that possess the following:

1. high R² value (> 0.6)
2. low p-value (< 0.01)
3. high degrees of freedom (df > 8, N > 10)
4.
close taxonomic distance (≤ 3; same order or closer)
5. high cross-validation success rate (> 85%)
6. relatively low mean square error (MSE) (< 0.22)
7. narrow confidence bands on the graph

The best estimations generally occur for surrogate and predicted taxa that are within the same genus, family, or order and for models with R² > 0.6 (Raimondo et al. 2007). In general, models with more degrees of freedom (df) have greater statistical power, and choosing a model with df greater than 8 is recommended to reduce model uncertainty. A priori power analysis determined that linear models with df > 8 have enough statistical power (1 - β > 0.8) to sufficiently increase the chance of finding a significant relationship within the data. It is also recommended to choose models with p-values < 0.01 to further reduce the chance of Type I errors in toxicity estimation.

Cross-validation success rate is a conservative estimate of model uncertainty and should not be interpreted as an exact estimate of model error. Cross-validation removes data from the original model, potentially causing a large change in the model for small datasets. Due to changes in a model (i.e., reduced df, altered slope/intercept) during this validation process, cross-validation success rate should be considered only an estimate of generalization error. Particularly for models built from small datasets, actual error can be expected to be lower than cross-validation error.

Surrogate Species Selection: An Example

As an example of how to select a suitable model, Raimondo et al. (2007) outlined a selection procedure to find an appropriate surrogate species to estimate the toxicity of a chemical to red-winged blackbird. In the example, toxicity data for the chemical of interest were available for northern bobwhite, mallard, Japanese quail, fulvous whistling duck, common grackle, and house sparrow, making them all potential surrogates.
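The Rules of Thumb can be expressed as a simple filter over candidate models. The sketch below applies them to the red-winged blackbird candidates using the statistics quoted for this example in this manual; it is an illustration only. The class and function names are hypothetical, "close taxonomic distance" is interpreted as same order or closer (distance ≤ 3), p-values are left out because the example does not report them, fulvous whistling duck is left out because its R² and cross-validation success rate are not stated, and ranking the remaining models by R² is an assumption rather than the authors' procedure.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    surrogate: str
    taxonomic_distance: int
    r_squared: float
    df: int
    mse: float
    cv_success: float  # cross-validation success rate, percent


def meets_rules_of_thumb(m: ModelStats) -> bool:
    """Apply the numeric Rules of Thumb (p-value rule omitted, see lead-in)."""
    return (
        m.r_squared > 0.6
        and m.df > 8
        and m.taxonomic_distance <= 3  # interpreted as same order or closer
        and m.cv_success > 85
        and m.mse < 0.22
    )


# Candidate surrogates for red-winged blackbird, values as quoted in the example
# (MSE for house sparrow and common grackle is given only as approximately 0.13).
candidates = [
    ModelStats("house sparrow", 3, 0.84, 107, 0.13, 95),
    ModelStats("common grackle", 2, 0.65, 54, 0.13, 93),
    ModelStats("Japanese quail", 4, 0.79, 135, 0.15, 91),
    ModelStats("northern bobwhite", 4, 0.63, 45, 0.23, 85),
    ModelStats("mallard", 4, 0.48, 80, 0.34, 79),
]

# Keep models that pass every rule, then rank by R-squared (an assumed tie-breaker).
selected = sorted(
    (m for m in candidates if meets_rules_of_thumb(m)),
    key=lambda m: m.r_squared,
    reverse=True,
)
selected_names = [m.surrogate for m in selected]
```

A filter like this only screens on the numeric criteria; the width of the confidence bands on the model graph still needs to be checked by eye.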
The common grackle and house sparrow have the closest taxonomic distances (2, same family; 3, same order, respectively); the other potential surrogates in this example have a taxonomic distance of 4 (same class). The grackle and house sparrow models have similar MSE (~0.13); however, the house sparrow model has a higher R² (0.84), higher cross-validation success rate (95), and greater degrees of freedom (107), making house sparrow the best surrogate for red-winged blackbird in this example. The grackle would also provide good surrogacy, with high R² (0.65), high cross-validation success rate (93), and good degrees of freedom (54). If neither of these species were available as surrogates, Japanese quail (R² = 0.79, MSE = 0.15, df = 135, cross-validation success rate = 91) would be the next best surrogate, followed by northern bobwhite (R² = 0.63, MSE = 0.23, df = 45, cross-validation success rate = 85) and mallard (R² = 0.48, MSE = 0.34, df = 80, cross-validation success rate = 79). Although fulvous whistling duck has the highest model R², its low degrees of freedom (df = 2) and comparatively higher MSE (0.30) make it a poorer surrogate than the other species.

III. Evaluating Model Predictions

Uncertainty of model predictions may be evaluated by assessing (1) the characteristics of the model used in the predictions, and (2) the value of the input data relative to the data used to generate the model. The former was discussed in the previous section, and the Rules of Thumb should be followed to ensure high confidence in model selection. Even for robust models, however, model uncertainty increases outside the range of surrogate species toxicity values that were used to develop the model. Uncertainty may be evaluated by reviewing the confidence intervals calculated with the predicted value. Narrow confidence intervals represent higher confidence that the model fits through the range of data points for the entered surrogate species toxicity.
If the surrogate toxicity value entered into an ICE model is outside the range of surrogate toxicity data used to generate the model, the warning "This value is outside the x-axis range for this model. Continue?" will appear to alert the user. This warning alone does not indicate low confidence in the model estimate, but should be used in conjunction with the calculated confidence intervals to evaluate the model prediction. For example, if the upper and lower bounds of the confidence interval are several orders of magnitude from the predicted value, caution should be used in applying the ICE estimate in risk assessment.

IV. Selecting Predicted Toxicity Values for SSDs

The SSD modules of Web-ICE automatically predict toxicity values from all available models for the selected surrogate species simultaneously. The user has the discretion to remove predicted toxicity values from the SSD, either to customize the SSD for particular taxa (e.g. birds only, fish only) or to remove predicted toxicity values with large confidence intervals. If an estimated toxicity value was derived from an input value outside the range of surrogate species data used to generate the model from which it was predicted, a warning appears next to the value indicating the maximum (or minimum) value of the model. This warning alone does not indicate low confidence in the model estimate, but should be used in conjunction with the calculated confidence intervals to evaluate the model prediction. Users should also use the confidence intervals around the HC/HD level to guide the selection of toxicity values to exclude from the SSD. Cases in which the upper bound of the SSD is less than the HC/HD level occur when predicted toxicity values with extremely large confidence intervals are included in the SSD; removing predicted toxicity values with such confidence intervals results in HC/HD values with adequate confidence.
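The screening by confidence interval width described above can be sketched in a few lines. The example flags predicted values whose confidence interval spans more than a chosen number of orders of magnitude. The cutoff of 2 orders of magnitude, the field names, and all numbers are hypothetical; they are not Web-ICE defaults or outputs.

```python
import math


def ci_orders_of_magnitude(lower, upper):
    """Width of a confidence interval in log10 units (orders of magnitude)."""
    return math.log10(upper) - math.log10(lower)


def split_by_ci_width(predictions, max_orders=2.0):
    """Separate taxa into (kept, flagged) using the CI-width cutoff."""
    kept, flagged = [], []
    for p in predictions:
        width = ci_orders_of_magnitude(p["lower"], p["upper"])
        (kept if width <= max_orders else flagged).append(p["taxon"])
    return kept, flagged


# Hypothetical predicted toxicity values with confidence limits (same units).
predictions = [
    {"taxon": "species A", "predicted": 120.0, "lower": 45.0, "upper": 320.0},
    {"taxon": "species B", "predicted": 15.0, "lower": 0.02, "upper": 11000.0},
    {"taxon": "species C", "predicted": 800.0, "lower": 210.0, "upper": 3000.0},
]

kept, flagged = split_by_ci_width(predictions)
print("kept:", kept)
print("flagged for review:", flagged)
```

A flagged value is a candidate for removal rather than an automatic exclusion; the decision remains with the user, as the text above notes.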
Users may also refer to the model information provided by the Show data dropdown menu when selecting data to include in SSDs.

V. Applying Web-ICE in Ecological Risk Assessment (ERA)

Web-ICE was developed to support both chemical hazard assessment and ecological risk assessment (ERA) by providing a method to estimate acute toxicity to specific taxa (e.g., endangered species) or a larger number of taxa (species, genera, families) with known uncertainty. Potential applications of Web-ICE generated acute toxicity values include the problem formulation phase of an ERA, to screen for contaminants of potential concern, and the analysis phase, to characterize effects to a larger number of species. The estimation of species-specific toxicity values using Web-ICE is intended to reduce the reliance on safety factors typically applied when extrapolating toxicity or risks to taxa without chemical and species-specific toxicity data. Another potential application of the chemical and taxon-specific acute toxicity estimates generated from ICE models is as input into existing exposure and risk models (e.g. TREX; EPA, 2005). Web-ICE generated toxicity values may also be used in the analysis of uncertainty and variability in toxicity to ecological receptors in both screening level and baseline or Tier II ERAs. In the absence of taxa-specific ICE models, Web-ICE can be used to generate SSDs and estimate 1st, 5th, or 10th percentile values of the cumulative distribution of species-specific toxicity values. These percentile values, expressed as the hazard concentration (e.g. HC5) or hazardous dose (e.g. HD5), provide an estimate of toxicity at a prescribed level of species protection with known uncertainty. Hazard concentrations could be used in ERA in place of species-specific toxicity values or as a component of the uncertainty analysis.
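The percentile calculation described above can be illustrated with a short sketch. It assumes a log-logistic distribution fitted to log10 toxicity values by the method of moments; this is one common SSD formulation and is an assumption here, not a statement of how Web-ICE fits its SSDs. The function name and the toxicity values are hypothetical.

```python
import math
import statistics


def hazard_level(toxicity_values, percentile=5.0):
    """Estimate a lower percentile (e.g. HC5) of a log-logistic SSD.

    The location (alpha) is the mean of the log10 values and the scale
    (beta) is their standard deviation times sqrt(3)/pi.
    """
    logs = [math.log10(v) for v in toxicity_values]
    alpha = statistics.mean(logs)
    beta = statistics.stdev(logs) * math.sqrt(3) / math.pi
    p = percentile / 100.0
    return 10 ** (alpha + beta * math.log(p / (1.0 - p)))


# Hypothetical acute toxicity values, one per species (same units).
example_values = [12.0, 35.0, 48.0, 110.0, 260.0, 540.0, 900.0, 2100.0]

hc1, hc5, hc10 = (hazard_level(example_values, q) for q in (1, 5, 10))
print(f"HC1 = {hc1:.2f}, HC5 = {hc5:.2f}, HC10 = {hc10:.2f}")
```

The sketch returns point estimates only. The confidence intervals around the HC/HD level that Web-ICE reports are not reproduced here.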
Acknowledgements

For database development, the authors would like to thank Sonny Mayer (US EPA, retired), Thomas Steeger and Brian Montague (US EPA, Office of Pesticide Programs), Don Rodier (US EPA, Office of Pollution Prevention and Toxics), Pierre Mineau, Alain Baril and Brian Collins (National Wildlife Research Centre, Environment Canada), Chris Russom and Teresa Norberg-King (US EPA, Mid-Continent Ecology Division), and Christopher Ingersoll and Ning Wang (Columbia Environmental Research Center, U.S. Geological Survey). Special thanks to Wally Schwab and Derek Lane (Computer Sciences Corporation) for constructing the website, and to Carl Litzinger (US EPA, Gulf Ecology Division) and David Owens (Computer Sciences Corporation) for their facilitation of website development. Also, thanks to our support personnel: Marion Marchetto, Anthony DiGirolamo, Brandon Jarvis, Christel Chancy, Nathan Lemoine, Nicole Allard, Laura Dobbins, Cheryl McGill and Sarah Kell. Peer review and beta testing of the website were contributed by Larry Goodman, Michael Murrell, Raymond Wilhour, and Susan Yee (US EPA, Gulf Ecology Division), Rick Bennet (US EPA, Mid-Continent Ecology Division), Glen Thursby (US EPA, Atlantic Ecology Division), and Anne Fairbrother (US EPA, Western Ecology Division).

References

American Society for Testing and Materials (ASTM). 2002. Standard guide for conducting acute toxicity tests on test materials with fishes, macroinvertebrates, and amphibians. In Annual Book of ASTM Standards. E729-96. American Society for Testing and Materials, West Conshohocken, PA.

Asfaw, A., M. R. Ellersieck, and F. L. Mayer. 2003. Interspecies Correlation Estimations (ICE) for acute toxicity to aquatic organisms and wildlife. II. User Manual and Software. EPA/600/R-03/106. U.S. Environmental Protection Agency, National Health and Environmental Effects Research Laboratory, Gulf Ecology Division, Gulf Breeze, FL. 14 p.

Awkerman, J., S. Raimondo, and M. G. Barron. 2008.
Development of Species Sensitivity Distributions for wildlife using interspecies toxicity correlation models. Environmental Science and Technology. 42 (9): 3447-3452.

Baril, A., B. Jobin, P. Mineau, and B. T. Collins. 1994. A consideration of inter-species variability in the use of the median lethal dose (LD50) in avian risk assessment. Technical Report No. 216. Canada Wildlife Service, Headquarters.

De Zwart, D. 2002. Observed regularities in species sensitivity distributions for aquatic species. In Species Sensitivity Distributions in Ecotoxicology, L. Posthuma, G. W. Suter, T. P. Traas, Eds. Lewis Publishers, Boca Raton, FL. pp. 133-154.

Dyer, S. D., D. J. Versteeg, S. E. Belanger, J. G. Chaney, and F. L. Mayer. 2006. Interspecies correlation estimates predict protective environmental concentrations. Environ. Sci. Technol. 40: 3102-3111.

Dyer, S. D., D. J. Versteeg, S. E. Belanger, J. G. Chaney, S. Raimondo and M. G. Barron. 2008. Comparison of Species Sensitivity Distributions Derived from Interspecies Correlation Models to Distributions used to Derive Water Quality Criteria. Environ. Sci. Technol. 42: 3076-3083.

Fairbrother, A. 2008. Risk Management Safety Factor. In Encyclopedia of Ecology, vol. 4. S. E. Jorgensen and B. D. Fath (eds.). Elsevier publishing, pp. 3062-3068.

Hudson, R. H., R. K. Tucker, and M. A. Haegele. 1984. Handbook of toxicity of pesticides to wildlife. U.S. Fish and Wildlife Service, Resource Publ. 153, Washington D.C. 90 p.

Insightful. 2001. S-plus 6 Guide to Statistics. Volume 1. Insightful Corporation, Seattle, WA.

Mayer, F. L. 1987. Acute toxicity handbook of chemicals to estuarine organisms. EPA/600/X-97/332. U.S. Environmental Protection Agency, National Health and Environmental Effects Research Laboratory, Gulf Ecology Division, Gulf Breeze, FL. 274 p.

Mayer, F. L. and M. R. Ellersieck. 1986. Manual of acute toxicity: Interpretation and data base for 410 chemicals and 66 species of freshwater animals.
US Fish and Wildlife Service Resource Publication 160. Washington DC. 579 p.

Mineau, P., A. Baril, B. T. Collins, J. Duffe, G. Joerman, and R. Luttik. 2001. Pesticide acute toxicity reference values for birds. Rev. Environ. Contam. Toxicol. 170: 13-74.

Raimondo, S., P. Mineau, and M. G. Barron. 2007. Estimation of chemical toxicity in wildlife species using interspecies correlation models. Environ. Sci. Technol. 5888-5894.

Shafer, E. W. Jr. and W. A. Bowles Jr. 1985. Acute oral toxicity and repellency of 933 chemicals to house and deer mice. Arch. Environ. Contam. Toxicol. 14: 111-129.

Shafer, E. W. Jr. and W. A. Bowles Jr. 2004. Toxicity, repellency or phototoxicity of 979 chemicals to birds, mammals and plants. Research Report No. 04-01. United States Department of Agriculture, Fort Collins, CO. 118 p.

Shafer, E. W. Jr., W. A. Bowles Jr. and J. Hurlbut. 1983. The acute oral toxicity, repellency and hazard potential of 998 chemicals to one or more species of wild and domestic birds. Arch. Environ. Contam. Toxicol. 12: 355-382.

Smith, G. J. 1987. Pesticide use and toxicology in relation to wildlife: organophosphorus and carbamate compounds. Resource Publication 170. United States Department of the Interior, Washington, DC. 171 p.

US Environmental Protection Agency (EPA). 1986. Ambient water quality criteria for pentachlorophenol. EPA 440/5-86-009.

US Environmental Protection Agency (EPA). 2005. TREX: Terrestrial Residue Exposure model. Office of Pesticide Programs. U.S. Environmental Protection Agency. http://www.epa.gov/oppefed1/models/terrestrial/trex_usersguide.htm#content4

US Environmental Protection Agency (EPA). 2006. ECOTOX Ecotoxicology Database. http://cfpub.epa.gov/ecotox. Duluth MN.

Appendix

I. List of Species in Aquatic Database

The following species were used to develop Web-ICE aquatic species, genus, or family-level models.
Each entry below gives the scientific name followed by the common name in parentheses, grouped by order and family.

Invertebrates

Annelida
  Haplotaxida
    Tubificidae
      Varichaetadrilus pacificus (Oligochaete)
  Scolecida
    Capitellidae
      Capitella capitata (Polychaete)
  Aciculata
    Nereididae
      Neanthes arenaceodentata (Polychaete)
      Neanthes virens (Polychaete)

Insecta
  Diptera
    Athericidae
      Atherix variegata (Short-horned flies)
    Chironomidae
      Chironomus plumosus (Midge)
      Paratanytarsus dissimilis (Midge)
  Plecoptera
    Perlidae
      Claassenia sabulosa (Stonefly)
    Pteronarcyidae
      Pteronarcella badia (Stonefly)
      Pteronarcella californica (Stonefly)

Crustacea
  Amphipoda
    Gammaridae
      Gammarus fasciatus (Amphipod)
      Gammarus lacustris (Amphipod)
      Gammarus pseudolimnaeus (Amphipod)
    Hyalellidae
      Hyalella azteca (Amphipod)
  Isopoda
    Asellidae
      Caecidotea brevicauda (Isopod)
  Diplostraca
    Daphniidae
      Ceriodaphnia reticulata (Daphnid)
      Daphnia magna (Daphnid)
      Daphnia pulex (Daphnid)
      Simocephalus serrulatus (Daphnid)
      Simocephalus vetulus (Daphnid)
  Decapoda
    Cambaridae
      Orconectes nais (Crayfish)
    Canceridae
      Cancer magister (Dungeness crab)
    Nephropidae
      Homarus americanus (American lobster)
    Ocypodidae
      Uca pugilator (Fiddler crab)
    Palaemonidae
      Palaemonetes kadiakensis (Mississippi grass shrimp)
      Palaemonetes pugio (Daggerblade grass shrimp)
    Penaeidae
      Farfantepenaeus duorarum (Pink shrimp)
      Litopenaeus setiferus (White shrimp)
    Portunidae
      Callinectes sapidus (Blue crab)
      Carcinus maenas (Green crab)
  Mysida
    Mysidae
      Americamysis bahia (Mysid)
  Calanoida
    Acartiidae
      Acartia clausi (Copepod)
      Acartia tonsa (Copepod)
    Temoridae
      Eurytemora affinis (Copepod)

Mollusca
  Myoida
    Myidae
      Mya arenaria (Softshell clam)
  Mytiloida
    Mytilidae
      Mytilus edulis (Blue mussel)
  Ostreoida
    Ostreidae
      Crassostrea virginica (Eastern oyster)
    Pectinidae
      Argopecten irradians (Bay scallop)
  Unionoida
    Unionidae
      Actinonaias pectorosa (Pheasantshell)
      Lampsilis straminea (Southern fatmucket)
      Lampsilis teres (Yellow sandshell)
      Utterbackia imbecillis (Paper pondshell)
      Villosa iris (Rainbow mussel)
      Villosa vibex (Southern rainbow)
      Villosa villosa (Downy rainbow)
  Veneroida
    Mactridae
      Rangia cuneata (Atlantic rangia)
  Basommatophora
    Physidae
      Aplexa hypnorum (Snail)
      Physella gyrina (Tadpole physa snail)
  Neogastropoda
    Nassariidae
      Nassarius obsoletus (Eastern mud snail)

Miscellaneous
  Forcipulatida
    Asteriidae
      Asterias forbesi (Starfish)
  Plumatellida
    Lophopodidae
      Lophopodella carteri (Bryozoan)
    Pectinatellidae
      Pectinatella magnifica (Bryozoan)

Vertebrates

Pisces
  Acipenseriformes
    Acipenseridae
      Acipenser brevirostrum (Shortnose sturgeon)
  Anguilliformes
    Anguillidae
      Anguilla rostrata (American eel)
  Atheriniformes
    Atherinopsidae
      Leuresthes tenuis (California grunion)
      Menidia beryllina (Inland silverside)
      Menidia menidia (Atlantic silverside)
      Menidia peninsulae (Tidewater silverside)
  Cypriniformes
    Catostomidae
      Catostomus commersonii (White sucker)
      Xyrauchen texanus (Razorback sucker)
    Cyprinidae
      Carassius auratus (Goldfish)
      Cyprinus carpio (Common carp)
      Erimonax monachus (Spotfin chub)
      Gila elegans (Bonytail chub)
      Notropis mekistocholas (Cape Fear shiner)
      Pimephales promelas (Fathead minnow)
      Ptychocheilus lucius (Colorado pikeminnow)
      Ptychocheilus oregonensis (Northern pikeminnow)
  Cyprinodontiformes
    Cyprinodontidae
      Cyprinodon bovinus (Leon Springs pupfish)
      Cyprinodon variegatus (Sheepshead minnow)
      Jordanella floridae (Flagfish)
    Fundulidae
      Fundulus diaphanus (Banded killifish)
      Fundulus heteroclitus (Mummichog)
    Poeciliidae
      Gambusia affinis (Mosquitofish)
      Poecilia reticulata (Guppy)
      Poeciliopsis occidentalis (Gila topminnow)
  Esociformes
    Esocidae
      Esox lucius (Northern pike)
  Mugiliformes
    Mugilidae
      Mugil cephalus (Striped mullet)
  Perciformes
    Centrarchidae
      Lepomis cyanellus (Green sunfish)
      Lepomis gibbosus (Pumpkinseed sunfish)
      Lepomis macrochirus (Bluegill)
      Lepomis microlophus (Redear sunfish)
      Micropterus dolomieu (Smallmouth bass)
      Micropterus salmoides (Largemouth bass)
      Pomoxis nigromaculatus (Black crappie)
    Cichlidae
      Oreochromis mossambicus (Mozambique tilapia)
    Moronidae
      Morone americana (White perch)
      Morone saxatilis (Striped bass)
    Percidae
      Etheostoma fonticola (Fountain darter)
      Etheostoma lepidum (Greenthroat darter)
      Perca flavescens (Yellow perch)
      Sander vitreus (Walleye)
    Sciaenidae
      Leiostomus xanthurus (Spot)
  Salmoniformes
    Salmonidae
      Oncorhynchus clarkii (Cutthroat trout)
      Oncorhynchus gilae (Apache trout)
      Oncorhynchus kisutch (Coho salmon)
      Oncorhynchus mykiss (Rainbow trout)
      Oncorhynchus tshawytscha (Chinook salmon)
      Salmo salar (Atlantic salmon)
      Salmo trutta (Brown trout)
      Salvelinus fontinalis (Brook trout)
      Salvelinus namaycush (Lake trout)
  Siluriformes
    Ictaluridae
      Ameiurus melas (Black bullhead)
      Ictalurus punctatus (Channel catfish)

Amphibia
  Anura
    Bufonidae
      Bufo boreas (Western toad)
      Bufo fowleri (Fowlers toad)
    Hylidae
      Pseudacris triseriata (Western chorus frog)
    Ranidae
      Rana catesbeiana (Bullfrog)
      Rana pipiens (Northern leopard frog)
      Rana sphenocephala (Southern leopard frog)

II. List of Species in Wildlife Database

The following species were used to develop Web-ICE wildlife species or family-level models.

Aves
  Anseriformes
    Anatidae
      Anas discors (Bluewinged teal)
      Anas domestica (Peking duck)
      Anas platyrhynchos (Mallard)
      Anas superciliosa (Pacific black duck)
      Anas sp. (Pintail)
      Anas sp. (Widgeon)
      Branta canadensis (Canada goose)
      Dendrocygna bicolor (Fulvous whistling duck)
  Columbiformes
    Columbidae
      Columba livia (Rock dove)
      Columba oenas (Stock dove)
      Columbina inca (Inca dove)
      Columbina passerina (Common ground-dove)
      Geopelia cuneata (Diamond dove)
      Geopelia humeralis (Bar-shouldered dove)
      Leptotila verreauxi (White-fronted dove)
      Streptopelia risoria (Ringed turtledove)
      Streptopelia senegalensis (Laughing dove)
      Zenaida asiatica (White-winged dove)
      Zenaida auriculata (Eared dove)
      Zenaida macroura (Mourning dove)
  Falconiformes
    Accipitridae
      Aquila chrysaetos (Golden eagle)
    Falconidae
      Falco sparverius (American kestrel)
  Galliformes
    Odontophoridae
      Callipepla californica (California quail)
      Callipepla gambelii (Gambel's quail)
      Colinus virginianus (Northern bobwhite)
    Phasianidae
      Alectoris chukar (Chukar)
      Alectoris rufa (Red partridge)
      Centrocercus urophasianus (Sage grouse)
      Coturnix japonica (Japanese quail)
      Gallus gallus (Chicken)
      Meleagris gallopavo (Turkey)
      Perdix perdix (Gray partridge)
      Phasianus colchicus (Ring-necked pheasant)
      Tympanuchus phasianellus (Sharp-tailed grouse)
  Gruiformes
    Gruidae
      Grus canadensis (Sandhill crane)
  Passeriformes
    Corvidae
      Aphelocoma sp. (Scrub jay)
      Corcorax melanorhamphos (White-winged chough)
      Corvus bennetti (Little crow)
      Corvus brachyrhynchos (American crow)
      Corvus corax (Common raven)
      Corvus coronoides (Australian raven)
      Corvus frugilegus (Rook)
      Corvus mellori (Little raven)
      Cyanocorax yncas (Green jay)
      Pica hudsonia (Black-billed magpie)
      Pica nuttalli (Yellowbilled magpie)
    Emberizidae
      Junco hyemalis (Darkeyed junco)
      Spizella pallida (Clay-colored sparrow)
      Volatinia jacarina (Blue back grassquit)
      Zonotrichia atricapilla (Golden-crowned sparrow)
      Zonotrichia leucophrys (White-crowned sparrow)
    Fringillidae
      Carpodacus mexicanus (House finch)
      Serinus sp. (Canary)
    Icteridae
      Agelaius phoeniceus (Red-winged blackbird)
      Agelaius tricolor (Tricolored blackbird)
      Euphagus cyanocephalus (Brewer's blackbird)
      Molothrus aeneus (Bronzed cowbird)
      Molothrus ater (Brown-headed cowbird)
      Quiscalus major (Boat-tailed grackle)
      Quiscalus quiscula (Common grackle)
      Xanthocephalus xanthocephalus (Yellow headed blackbird)
    Passeridae
      Neochmia temporalis (Red-browed firetail)
      Passer domesticus (House sparrow)
      Passer luteus (Golden sparrow)
      Taeniopygia guttata (Zebra finch)
    Ploceidae
      Euplectes orix (Red bishop)
      Ploceus cucullatus (Village weaver)
      Ploceus taeniopterus (Northern masked weaver)
      Quelea quelea (Red billed quelea)
    Sturnidae
      Sturnus vulgaris (Starling)
    Turdidae
      Turdus migratorius (American robin)
  Psittaciformes
    Psittacidae
      Aratinga canicularis (Orange fronted conure)
      Aratinga pertinax (Brown-throated conure)
      Calyptorhynchus funereus (Yellow tailed black cockatoo)
      Melopsittacus undulatus (Budgerigar)
      Myiopsitta monachus (Monk parakeet)
      Platycercus elegans (Crimson rosella)
      Platycercus eximius (Eastern rosella)
      Psephotus haematonotus (Red-rumped parrot)
  Strigiformes
    Strigidae
      Megascops asio (Eastern screech owl)

Mammalia
  Artiodactyla
    Bovidae
      Capra hircus (Domestic goat)
      Ovis aries (Domestic sheep)
    Cervidae
      Odocoileus hemionus (Mule deer)
  Carnivora
    Canidae
      Canis familiaris (Dog)
      Canis latrans (Coyote)
  Lagomorpha
    Leporidae
      Lepus californicus (Blacktailed jackrabbit)
      Oryctolagus cuniculus (Rabbit)
  Rodentia
    Caviidae
      Cavia porcellus (Guinea pig)
    Echimyidae
      Myocastor coypus (Nutria)
    Muridae
      Gerbillus sp. (Gerbil)
      Microtus californicus (Meadow mouse)
      Microtus pinetorum (Pine mouse)
      Microtus sp. (Vole)
      Microtus pennsylvanicus (Meadow vole)
      Mus musculus (Mouse)
      Oryzomys palustris (Rice rat)
      Peromyscus maniculatus (Deer mouse)
      Rattus argentiventer (Ricefield rat)
      Rattus exulans (Polynesian rat)
      Rattus norvegicus (Norway rat)
      Rattus rattus (Roof rat)
      Sigmodon hispidus (Cotton rat)
    Sciuridae
      Cynomys ludovicianus (Blacktailed prairie dog)
      Spermophilus beecheyi (California ground squirrel)
      Spermophilus lateralis (Goldenmantled ground squirrel)
      Spermophilus richardsonii (Richardsons ground squirrel)