UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
WASHINGTON, D.C. 20460

SEP 12

OFFICE OF THE ADMINISTRATOR
SCIENCE ADVISORY BOARD

SUBJECT: Transmittal of Science Advisory Board Report

FROM: Vanessa T. Vu, Director, Science Advisory Board Staff Office (1400F)

TO: Samuel Boltik, EPA Headquarters Library Repository (3404T)

This is to advise you that the Science Advisory Board issued a report numbered EPA-SAB-07-011, "Science Advisory Board (SAB) Review of the Estimation Program Interface Suite (EPI Suite™)". Two copies of the report are attached. The report is available in electronic format on the Science Advisory Board's web site at: http://www.epa.gov/sab/fiscal07.htm. If you have any questions regarding this report, please contact me directly at 202-343-9999, or via email at vu.vanessa@epa.gov.

Attachments (2)

-------

UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
WASHINGTON, D.C. 20460

OFFICE OF THE ADMINISTRATOR
SCIENCE ADVISORY BOARD

September 7, 2007

EPA-SAB-07-11

Honorable Stephen L. Johnson
Administrator
U.S. Environmental Protection Agency
1200 Pennsylvania Avenue, NW
Washington, DC 20460

Subject: Science Advisory Board (SAB) Review of the Estimation Programs Interface Suite (EPI Suite™)

Dear Administrator Johnson:

The Office of Pollution Prevention and Toxics (OPPT) requested that the Science Advisory Board (SAB) review the Estimation Programs Interface Suite (EPI Suite™) software. The Agency uses this software to estimate properties related to a chemical's environmental transport and fate. This information is used to support regulatory decisions in the new chemicals program and in other existing chemical assessment activities. The SAB commends EPA for the strategic decision to support the development of EPI Suite™ and to make it easily and freely available.
Governmental and private organizations within the United States and elsewhere make extensive use of this software in supporting decisions regarding new and existing chemicals. The widespread use of this software for a number of different purposes stems, in part, from its successful utilization and integration of available science in combination with its ease of operation, transparency, and cost-effectiveness. Because EPI Suite™ is part of the Organization for Economic Co-operation and Development's Quantitative Structure-Activity Relationship (Q)SAR toolbox, the software will likely play a significant role in international regulatory activities. It also supports the efforts of emerging industrial economies to develop in an environmentally protective and sustainable manner.

The SAB has carefully evaluated the EPI Suite™ software. The Panel's numerous recommendations for improvements in the software's scope, accuracy, and ease of operations appear in the enclosed report, along with comments on appropriate current and potential future uses. Because of its importance in supporting Agency decisions regarding existing and new chemicals, the Panel would like to draw your attention to the following three overarching findings involving the software's underlying science, functionality and uses.

First, for chemicals similar to those for which modules to estimate chemical properties were developed, the algorithms that support the calculations are scientifically defensible and appropriate for Agency regulatory screening applications. However, for existing and/or new chemicals whose structures and/or properties are outside the domain used in module development, scientific uncertainty may limit the utility of this software. In such cases, the Agency uses other methodologies to evaluate chemical properties. The Panel also has identified a number of broad chemical categories (e.g., polymers, organo-metallics, nanoparticles, etc.)
and associated chemical properties for which the Agency is encouraged to develop modules to estimate chemical properties. Given their importance in industrial and commercial applications as well as their potential environmental and human health impact, the Panel recommends that (Q)SAR development for these chemical categories (and associated properties) be established as an Agency priority. Secondly, the Panel noted that significant improvements in software functionality and ease of operation could be achieved if the graphical user interface were upgraded from its current disk operating system (DOS) appearance to a more familiar format, such as Windows™. By providing a more recognizable user interface, particularly to novice users, the Agency will help to facilitate broader and more extensive application of this software in environmental decision-making. Finally, the resources for updating and improving the software have not been commensurate with its importance in supporting Agency decisions, nor with the rapidity with which new and even novel chemicals are being developed for commercial use. In light of the widespread and multiple uses for this software, the Agency should increase its investments to expand the range of chemical categories over which the software can generate valid predictions, and the number of chemical properties that can be modeled as new scientific information becomes available. Thank you for the opportunity to provide advice on this important suite of modeling software and to interact with the very dedicated and able OPPT staff. Please feel free to contact us if you have any questions concerning this review. Sincerely, Dr. M. Granger Morgan, Chair EPA Science Advisory Board Dr. Michael J. 
McFarland, Chair EPI Suite Review Panel EPA Science Advisory Board

-------

NOTICE

This report has been written as part of the activities of the EPA Science Advisory Board, a public advisory group providing extramural scientific information and advice to the Administrator and other officials of the Environmental Protection Agency. The Board is structured to provide balanced, expert assessment of scientific matters related to the problems facing the Agency. This report has not been reviewed for approval by the Agency and, hence, the contents of this report do not necessarily represent the views and policies of the Environmental Protection Agency, nor of other agencies in the Executive Branch of the Federal government, nor does mention of trade names or commercial products constitute a recommendation for use. Reports of the EPA Science Advisory Board are posted on the EPA website at http://www.epa.gov/sab.

-------

U.S. Environmental Protection Agency
Science Advisory Board
EPI Suite Review Panel

CHAIR
Dr. Michael J. McFarland, Utah State University, Logan, UT

OTHER SAB MEMBERS
Dr. David A. Dzombak, Carnegie Mellon University, Pittsburgh, PA

CONSULTANTS
Dr. Deborah Hall Bennett, University of California - Davis, Davis, CA
Dr. Robert L. Chinery, New York State Department of Law, Albany, NY
Dr. Christina E. Cowan-Ellsberry, The Procter & Gamble Company, Cincinnati, OH
Dr. Miriam L. Diamond, University of Toronto, Toronto, Ontario, Canada
Dr. William J. Doucette, Utah State University, Logan, UT
Dr. Anton J. Hopfinger, University of New Mexico
Dr. Michael W. Murray, National Wildlife Federation, Ann Arbor, MI
Dr. Thomas F. Parkerton, ExxonMobil Biomedical Sciences, Annandale, NJ
Dr. Kevin H. Reinert, AMEC Earth and Environmental, Plymouth Meeting, PA
Dr. Daniel T. Salvito, Research Institute for Fragrance Materials, Inc., Woodcliff Lake, NJ
Dr. Hans Sanderson, Danish National Environmental Research Institute, Roskilde, Denmark
Dr. Louis J. Thibodeaux, Louisiana State University, Baton Rouge, LA

SCIENCE ADVISORY BOARD STAFF
Ms. Kathleen White, Washington, DC

-------

U.S. Environmental Protection Agency
Science Advisory Board

CHAIR
Dr. M. Granger Morgan, Carnegie Mellon University, Pittsburgh, PA

SAB MEMBERS
Dr. Gregory Biddinger, ExxonMobil Biomedical Sciences, Inc., Houston, TX
Dr. James Bus, The Dow Chemical Company, Midland, MI
Dr. Deborah Cory-Slechta, University of Rochester, Rochester, NY
Dr. Maureen L. Cropper, University of Maryland, College Park, MD
Dr. Virginia Dale, Oak Ridge National Laboratory, Oak Ridge, TN
Dr. Kenneth Dickson, University of North Texas, Denton, TX
Dr. Baruch Fischhoff, Carnegie Mellon University, Pittsburgh, PA
Dr. James Galloway, University of Virginia, Charlottesville, VA
Dr. Lawrence Goulder, Stanford University, Stanford, CA
Dr. James K. Hammitt, Harvard University, Boston, MA
    Also Member: COUNCIL
Dr. Rogene Henderson, Lovelace Respiratory Research Institute, Albuquerque, NM
    Also Member: CASAC
Dr. James H. Johnson, Howard University, Washington, DC
Dr. Agnes Kane, Brown University, Providence, RI
Dr. Meryl Karol, University of Pittsburgh, Pittsburgh, PA
Dr. Catherine Kling, Iowa State University, Ames, IA
Dr. George Lambert, Robert Wood Johnson Medical School-UMDNJ, Belle Mead, NJ
Dr. Jill Lipoti, New Jersey Department of Environmental Protection, Trenton, NJ
Dr. Michael J. McFarland, Utah State University, Logan, UT
Dr. Judith L. Meyer, University of Georgia, Athens, GA
Dr. Jana Milford, University of Colorado, Boulder, CO
Dr. Rebecca Parkin, The George Washington University Medical Center, Washington, DC
Mr. David Rejeski, Woodrow Wilson International Center for Scholars, Washington, DC
Dr. Stephen M. Roberts, University of Florida, Gainesville, FL
Dr. Joan B. Rose, Michigan State University, East Lansing, MI
Dr. Jerald Schnoor, University of Iowa, Iowa City, IA
Dr. Kathleen Segerson, University of Connecticut, Storrs, CT
Dr. Kristin Shrader-Frechette, University of Notre Dame, Notre Dame, IN
Dr. Philip Singer, University of North Carolina, Chapel Hill, NC
Dr. Robert Stavins, Harvard University, Cambridge, MA
Dr. Deborah Swackhamer, University of Minnesota, St. Paul, MN
Dr. Thomas L. Theis, University of Illinois at Chicago, Chicago, IL
Dr. Valerie Thomas, Georgia Institute of Technology, Atlanta, GA
Dr. Barton H. (Buzz) Thompson, Jr., Stanford University, Stanford, CA
Dr. Robert Twiss, University of California-Berkeley, Ross, CA
Dr. Terry F. Young, Environmental Defense, Oakland, CA
Dr. Lauren Zeise, California Environmental Protection Agency, Oakland, CA

SCIENCE ADVISORY BOARD STAFF
Mr. Thomas Miller, Washington, DC

-------

TABLE OF CONTENTS

EXECUTIVE SUMMARY
GENERAL COMMENTS
SPECIFIC COMMENTS
1. Supporting Science
   A. Comprehensiveness
   B. Method accuracy and validation
   C. Estimation Methods and Alternates
2. Functionality
   A. How convenient is the software and does it have all the necessary features?
   B. Are there places where EPI Suite™'s user guide (and other program documentation) does not clearly explain EPI's design and use? How can these be improved?
   C. Are there aspects of the user interface (i.e., the initial, structure/data entry screen; and the results screens) that need to be corrected, redesigned, or otherwise improved? Do the results screens display all the desired information?
   D. Currently one enters EPI Suite™ using SMILES and CAS; are there other ways to describe the structure (e.g., ability to input a structure by drawing it), that should be added?
   E. EPI Suite™ has many convenience features, such as the ability to accept batch mode entry of chemical structures, and automatic display of measured values for some (but not all) properties. Are there other features that could enhance convenience and overall utility for users?
   F. Are property estimates expressed in correct/appropriate units?
   G. Is adequate information on accuracy/validation conveyed to the user by the program documentation and/or the program itself?
3. Appropriate Use
   A. Currently Identified Uses
   B. Potential Additional Uses
REFERENCES
GLOSSARY
APPENDIX 1: Summary Assessment of EPI Suite™ Core Models
APPENDIX for 1Ai
APPENDIX for 1Bi
APPENDIX for 1Ci

-------

EXECUTIVE SUMMARY

EPA's Office of Pollution Prevention and Toxic Substances (OPPTS) regulates pesticides and chemicals to ensure protection of public health and the environment as well as promote innovative programs to prevent pollution. The Office of Pollution Prevention and Toxics (OPPT) within OPPTS is responsible for assuring the public that industrial chemicals for sale and use in the United States do not pose unacceptable risks to human health or the environment. To accomplish this, OPPT promotes pollution prevention, use of safer chemicals, risk reduction, risk management and public awareness. OPPT programs include the pre-manufacture notification (PMN) review of new industrial chemicals; testing, assessment, and risk reduction of existing industrial chemicals; management of "national chemicals" (e.g., PCBs); international chemical issues; pollution prevention advocacy; and partnership programs, such as the High Production Volume Chemicals (HPV) Challenge, Green Suppliers Network, Design for the Environment and Green Chemistry.

Accurate and reliable predictions of the behavior of chemicals in a biological or environmental system require a full and comprehensive understanding of their thermodynamic, kinetic and transport properties both within and across multimedia compartments. To support Agency decisions regarding the toxicity, environmental fate and transport of new chemicals, OPPT (with Syracuse Research Corporation (SRC)) developed the Estimation Programs Interface (EPI Suite™), which OPPT makes freely available from its website.
The software combines the available science with user-friendliness, transparency, and cost-effectiveness. EPI Suite™ is utilized by various Agency program offices as well as other US federal agencies, state regulatory agencies, foreign countries and the private sector.

The EPI Suite™ software consists of physical-chemical property estimation routines (PERs) and mass balance based environmental fate models (EFMs). Where measured data are lacking and EPI Suite™ is appropriate, the Agency uses the results of the PERs together with the EFMs, to understand a chemical's environmental fate and transport. This understanding is fundamental to assessing chemical exposure, hazard, and risk.

OPPT requested that the Science Advisory Board (SAB) evaluate the science, functionality and uses of the Agency's EPI Suite™ software. The EPI Suite Review Panel was formed for this purpose and reviewed the software in the context of OPPT's needs.

Science. In summary, the Panel commends the Agency for using sound science to develop and refine EPI Suite™ and encourages the further development and use of this software in supporting Agency decisions. The Panel applauds the Agency for furnishing chemical fate and transport modeling software that is science-based and is used globally to support environmental policy decisions.

The Panel encourages the Agency to consider evaluating the chemical fate and transport modules using the latest statistical approaches to determine their predictive accuracy and to evaluate new estimation approaches as they gain acceptance in the scientific community. The Panel endorses a systematic approach for updating and refining the chemical fate and transport modules as high quality and peer-reviewed measurement data become available - both to increase the applicability of the software to a wider array of chemical classes and to support the inclusion of additional physical-chemical properties.
The Panel has provided a number of recommendations focused on expanding the current set of chemical properties and associated functionality including EFMs for future upgrades to EPI Suite™. However, in light of the widespread application of EPI Suite™, the Panel recommends that before the Agency decides to add a module, it assess, to the extent practical, whether there is consensus in the scientific community that the module has been appropriately parameterized and has been sufficiently verified to be applicable in screening assessment. Also, because the accuracy of EPI Suite™ output will vary depending on the chemical and the environmental compartment in which it is found, the Panel recommends communicating the uncertainty associated with estimates provided by EPI Suite™.

The PERs currently within EPI Suite™ have received extensive scientific scrutiny with the results published in the peer-reviewed literature. Because EPI Suite™ was historically developed to model the fate and transport behavior of nonpolar organic chemicals, the physical-chemical property estimates for this class of chemicals are typically well within an order of magnitude of measured values. The Panel considered these results adequate to support Agency screening level decision-making. Moreover, these PERs satisfy the Organization for Economic Cooperation and Development (OECD) principles established for quantitative structure-activity relationship ((Q)SAR) validation, a finding which further supports the use of EPI Suite™ PERs in screening level regulatory decision-making.

The ability of EPI Suite™ to accurately model physical-chemical properties depends on the chemical's class, the quality of the property module chemical data training set and whether the chemical's properties fall within the range of the chemical training data set. Many of the chemical training data sets are outdated and some are incomplete.
Periodic review and refinement of the training sets would support the continuous improvement of module output accuracy and expand the range over which EPI Suite™ results are valid. These refinements could be accelerated if the Agency leveraged its resources to collect additional measured property data. Criteria that the Agency should consider in prioritizing the updates of chemical property data sets are identified in 1-A-ii below.

Chemical domain mapping has the potential to significantly improve the predictive capabilities of mechanistically and statistically based PERs, but no Panel consensus emerged as to the most effective approach to achieve this goal. The Panel encourages the Agency to consider establishing a scientific forum at which the various methodologies for enhancing the accuracy of the PER module output may be evaluated.

The Panel agreed on two broad recommendations aimed at improving EFM module predictions. First, the Panel supports a more explicit description and justification for the Agency's selection of EFM parameter default values. Secondly, the Panel encourages the Agency to provide the EPI Suite™ user with a clear and unambiguous display of quantitative uncertainty estimates associated with the fate model (i.e., EFM) output.

Functionality. The Panel, which included experienced as well as novice users of EPI Suite™, considered the functionality and usability of EPI Suite™ software. While there are many positive features associated with the EPI Suite™ user interface including its documentation and HELP file availability, there are also opportunities for functional improvement. For example, although EPI Suite™ operates within a Windows™ platform, a new user of EPI Suite™ is immediately struck by the disk operating system (DOS) appearance of the graphical user interface (GUI). The Panel encourages the Agency to secure the necessary funding to upgrade the GUI to reflect a typical Windows™ appearance and functionality.

Uses.
All of the modules in EPI Suite™ are generally accepted by the regulatory and regulated community for use in risk-based priority setting, screening level risk assessment and prioritization for chemical testing for the chemical classes to which the modules apply. Given the mandated 90-day reporting period for which new chemicals in the PMN program must be evaluated and the large number of chemicals that the Agency must screen annually, reliance on (Q)SAR module output is justified. The modules are expected to provide an order of magnitude estimate of a chemical's physical properties, an accuracy level that is generally acceptable by the Agency for screening level assessments. However, application of (Q)SAR-based modules to chemicals outside the module training set domain increases the uncertainty of the module prediction. Because the chemical domains that are used in developing current EPI Suite (Q)SARs do not provide adequate coverage of nanoparticles, inorganic compounds, organo-metallic and certain other classes of chemicals, application of EPI Suite for these classes of compounds within the PMN and pollution prevention (P2) programs is inappropriate. The Panel recommends that the Agency collect more peer-reviewed measurement data on the physical and chemical properties for these chemical classes with the intent of either expanding the domain of the existing (Q)SARs or for creating new (Q)SARs specifically for these classes of chemicals. Owing to its success in supporting Agency decision-making and its accessibility, use of EPI Suite™ is prolific outside of the Agency, including in international regulatory agencies. Given its broad acceptance and use by regulators, industry and the academic community, the Panel strongly encourages the Agency to explore opportunities to develop foreign language versions of EPI Suite™. 
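The two screening ideas in the summary above, order-of-magnitude agreement between estimated and measured values and the extra uncertainty for chemicals outside a training set domain, can be illustrated with a minimal Python sketch. This is not part of EPI Suite™ and does not reproduce any of its modules; the function names, the one-log-unit tolerance and the range-based domain test are assumptions chosen only for illustration.

```python
import math


def log10_error(predicted: float, measured: float) -> float:
    """Signed prediction error in log10 units (hypothetical helper).

    Both values must be positive and expressed in the same units.
    """
    if predicted <= 0 or measured <= 0:
        raise ValueError("predicted and measured values must be positive")
    return math.log10(predicted / measured)


def within_order_of_magnitude(predicted: float, measured: float,
                              tolerance_log_units: float = 1.0) -> bool:
    """True when the estimate is within a factor of 10**tolerance of the measurement."""
    return abs(log10_error(predicted, measured)) <= tolerance_log_units


def in_training_range(value: float, training_values: list[float]) -> bool:
    """Crude domain check (assumption): the descriptor lies inside the range
    spanned by the training set. Real domain mapping is more involved."""
    return min(training_values) <= value <= max(training_values)


if __name__ == "__main__":
    # Illustrative numbers only; they are not measured data.
    print(within_order_of_magnitude(50.0, 10.0))     # factor of 5 apart
    print(within_order_of_magnitude(2000.0, 10.0))   # factor of 200 apart
    print(in_training_range(7.5, [0.5, 2.1, 6.0]))   # descriptor outside range
```

A range test on a single descriptor is only the simplest possible stand-in for a domain check; it is shown here because it makes the "outside the training set" caveat concrete.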
-------

GENERAL COMMENTS

The EPI Suite™ software basically consists of two module categories: physical-chemical property estimation routines (PERs) and environmental fate models (EFMs). The PERs are used to predict important physical-chemical (e.g., water solubility, vapor pressure, octanol-water partition coefficients) and reactivity (e.g., biodegradation, atmospheric oxidation) properties and, together with the EFMs, project a chemical's environmental fate and transport which is considered during the Agency's screening level evaluation.

Accurate and reliable predictions of the behavior of chemicals in a biological or environmental system require a full and comprehensive understanding of their thermodynamic, kinetic and transport properties both within and across multimedia compartments. To support Agency decisions regarding the toxicity, environmental fate and transport of new chemicals, the EPI Suite™ software employs twelve individual modules that may be logically placed into one of these two functional categories.

Category - 1: The nine regression-based estimation modules in the PER category were developed for estimating physical-chemical properties for chemicals that lack the minimum data set needed to support Agency decisions. These modules, including the Octanol-Water Partitioning Coefficient Estimation Program (KOWWIN), the Henry's Law Constant Estimation Program (HENRYWIN), the Soil or Sediment Organic Carbon Partitioning Coefficient Estimation Program (PCKOCWIN), the Water Solubility Estimation Program (WSKOWWIN), the Bioconcentration Factor Estimation Program (BCFWIN) and the Melting Point-Boiling Point (and Vapor Pressure) Chemical Estimation Program (MPBPWIN, MPBPPVWIN), are used for estimating the equilibrium distribution or partitioning of a chemical between two media such as fish tissue-water and organic matter-water (which are functions of the octanol-water partition coefficient), air-water, etc.
The three other modules found in the PER category include: the Atmospheric Oxidation Program (AOPWIN), the Biodegradation Estimation Program (BIOWIN) and the Hydrolysis Estimation Program (HYDROWIN). These modules employ regression-based approximation methods to estimate the value of kinetic parameters for atmospheric gas-phase reaction with the hydroxyl radical, aerobic biodegradation and hydrolysis reactions, respectively.

Category - 2: EPI Suite™ EFM modules that enable the user to estimate the environmental fate and transport of specific chemicals include: the Volatilization Rate from Water Estimation Program (WVOLWIN), the Sewage Treatment Plant Chemical Fate Estimation Program (STPWIN) and the multi-media fugacity model (LEV3EPI). These modules, which utilize a chemical species mass balance approach, have been designed to estimate the chemical concentration, phase mass fractions and residence times of chemicals when placed in well-defined environmental systems. The mass balance approach allows the user to estimate the change in chemical concentration over time from which removal rates can be estimated. Moreover, the EFM modules employ, as inputs, the partitioning and reaction kinetic results generated from the PER modules. The EPI Suite™ user, however, has the ability to override these default inputs and enter their own values.

The environmental compartments defined within the three EPI Suite™ EFM modules require the user to input the volume and mass fractions of the various media under consideration. In the absence of user defined values, EPI Suite™ assigns default values, which are idealized representations of the real world. Requirements of the EFM modules also include user (or EPI Suite™ - i.e., default) defined chemical coefficients that quantitatively describe the rate of chemical transport between the various media compartments.
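The mass balance idea behind the EFMs can be illustrated with a minimal Python sketch of a single well-mixed compartment receiving a constant emission and losing chemical by one overall first-order process, so that dC/dt = E/V - kC. This is a textbook simplification under stated assumptions, not the algorithm used in WVOLWIN, STPWIN or LEV3EPI; all function and parameter names are hypothetical, and the numbers are illustrative only.

```python
import math


def steady_state_concentration(emission_rate: float, volume: float,
                               loss_rate_constant: float) -> float:
    """Steady-state concentration C_ss = E / (k * V) for dC/dt = E/V - k*C.

    emission_rate: mass per time (e.g., g/h), assumed constant
    volume: compartment volume (e.g., m3), assumed well mixed
    loss_rate_constant: overall first-order loss rate k (1/h)
    """
    return emission_rate / (loss_rate_constant * volume)


def concentration_at(t: float, c0: float, emission_rate: float,
                     volume: float, loss_rate_constant: float) -> float:
    """Analytic solution C(t) = C_ss + (C0 - C_ss) * exp(-k * t)."""
    c_ss = steady_state_concentration(emission_rate, volume, loss_rate_constant)
    return c_ss + (c0 - c_ss) * math.exp(-loss_rate_constant * t)


def residence_time(loss_rate_constant: float) -> float:
    """Mean residence time for a single first-order loss process, 1 / k."""
    return 1.0 / loss_rate_constant


if __name__ == "__main__":
    # Illustrative inputs only: 10 g/h into 1000 m3 with k = 0.05 per hour.
    E, V, k = 10.0, 1000.0, 0.05
    print(steady_state_concentration(E, V, k))   # g/m3 at steady state
    print(concentration_at(24.0, 0.0, E, V, k))  # g/m3 after 24 h from a clean start
    print(residence_time(k))                     # hours
```

In a multimedia model, the single rate constant k would be replaced by reaction terms supplied by the PERs and by transport terms between compartments, which is where default or user-supplied coefficients enter.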
An important limitation of the present version of EPI Suite™ is the inability of users to input their own mass transfer coefficient (MTC) data. Moreover, the absence of high quality peer-reviewed MTC data to serve as input to EPI Suite™ exacerbates this problem. Although filling this critical data gap is vital for broadening the range of applicability of EPI Suite™, collecting useful MTC data is inherently expensive, a fact which presents the Agency with a considerable resource challenge. Because of the importance of obtaining and incorporating accurate and reliable MTC information into EPI Suite™, the Panel encourages the Agency to develop a systematic and longer-term program, possibly through leveraging resources with other federal agencies, to address this critical data need. However, in the interim, the Panel endorses establishing a modest effort that can, at a minimum, result in the formulation of a guideline MTC module based on available peer reviewed theoretical models and supporting data. A workshop consisting of an expert panel sponsored by the Agency is suggested as a means of producing a draft of the guideline version of the MTC module.

A summary assessment of core EPI Suite™ modules can be found in APPENDIX 1.

-------

SPECIFIC COMMENTS

1. Supporting Science

A. Comprehensiveness

i. Are there additional properties that should be included in upgrades to EPI Suite™ for its various specified uses (PMN, P2)?

All of the physical-chemical properties that are currently modeled by EPI Suite™ are critical in characterizing the behavior of a chemical released into the environment. Therefore, none should be dropped. Under most circumstances, the PERs predict the measured property value within an order of magnitude, a standard of accuracy that is generally acceptable for screening level Agency decision-making.
It would be inappropriate to use PERs to predict physical-chemical properties of chemicals whose characteristics are significantly different than those found in the module training set because the difference between predicted and measured values may be greater. This potential inaccuracy is an important issue unto itself and also for error propagation when these estimates are incorporated into the fate models. Given the broad range of chemicals for which the Agency must prepare environmental assessments together with the need to ensure an equitable and transparent evaluation of all chemical data submissions, the Panel encourages the Agency to furnish stakeholders with a description of the process by which regulatory decisions are made for chemicals when application of EPI Suite™ has been determined to be inappropriate.

With respect to expanding the current set of chemical properties (and associated functionality) for future upgrades to EPI Suite™, the Panel recommends that the Agency consider incorporating the following:
• pKa, the negative log of a chemical's dissociation constant
• Influence of pKa on other physical-chemical properties
• Temperature dependency of all physical-chemical properties
• KAW, the air-water partition coefficient
• KOA, the octanol-air partition coefficient
• Bioaccumulation factors for root plants, leaf plants, and aquatic wildlife
• Diffusion coefficients in various environmental media
• Metabolism and production of stable chemical intermediates
• Neutral hydrolysis
• Activity coefficients
• Sub-cooled liquid vapor pressure and aqueous solubility
• Surface tension
• Anaerobic biodegradation potential
• Ozone depletion potential, greenhouse gas potential, and maximum incremental reactivity (MIR) used to evaluate ozone formation potential.

Some of these endpoints and improved features (e.g., temperature-dependence of physical-chemical properties) can already be predicted by another Agency supported model (SPARC).
The Panel, therefore, encourages the Agency to consolidate and build upon existing work for future EPI Suite™ improvements.

The current EPI Suite™ has only limited utility in predicting parameters for the important and large class of compounds known as polymers. Several Panel members offered the following list of additional chemical properties specifically related to the toxicity and fate of polymers that the Agency may consider in future upgrades to EPI Suite™:
• Glass transition temperature
• Crystal melt transition temperature
• Elastic mechanical properties like bulk modulus
• Viscosity measures
• Heat capacity
• Cohesive energy
• Charge
• Water solubility
• Dispersibility
• Flammability
• Parameters (e.g., degradation rates) influencing environmental persistence

Several commercial software packages estimate many of the environmentally important physical-chemical properties of polymers. The Panel encourages the Agency to evaluate the scientific underpinnings of these software packages to determine if similar functionality could be incorporated into EPI Suite™.

For some classes of chemicals, the physical-chemical properties estimated by EPI Suite™ are not sufficient to predict a chemical's behavior. The Panel encourages the Agency to consider development of a systematic and longer-term plan to develop and integrate additional EPI Suite™ functionality to adequately model additional physical-chemical properties as well as the fate and transport characteristics of these compounds. Similarly, the Panel strongly recommends that the Agency establish and support technical transfer symposia and associated activities (e.g., science workshops) that will help facilitate Agency exposure to the latest scientific approaches to chemical property modeling.
Given the Agency's resource limitations, the Panel strongly recommends that the Agency establish a set of objective and transparent criteria for identifying and prioritizing the most important physical-chemical properties required for defensible regulatory decision-making. Examples of possible ranking criteria, which are not listed in any order of priority, include the following:
• The property's potential use in future fate and transport modeling enhancements
• The accuracy and reliability of the property's currently available experimental data set
• The extent of the chemical domain covered by the modeled property
• The opportunity for increasing the scope and applicability of EPI Suite™ to a broader range of chemical classes and properties
• Determination of whether the new property could be easily modeled using the existing model chemical data set
• Relative importance of property value as input to other EPI Suite™ modules and/or Agency chemical assessments
• The relative magnitude between "model error" and "measurement error"
• Cost or other resource requirements associated with modeling the new property

Greater use of MTCs can improve some applications in EPI Suite™. A recent study comparing the outputs of five multimedia models demonstrated that model homogenization was possible only when the values of the dozen or so MTCs were numerically equal (Cowan et al., 1995). Where MTCs varied significantly, the computed concentration levels, mass fractions in the media compartments and the chemical residence time estimates differed, in many cases, by several orders of magnitude. The peer-reviewed literature contains a significant quantity of data with which to develop MTCs.
Therefore, the Panel encourages the Agency to support the development of additional MTCs and, where possible, establish a systematic process for evaluating and incorporating high quality MTC data within EPI Suite™.

The highest priority fate models are those which are judged to be used most often and/or to have the most impact on decision-making processes. The Panel has identified these models to be:

• Fugacity Unit World
• STP
• BCF/BAF
• Long-range transport

While there was consensus among the panelists that BAF is an important fate parameter to model and the Panel encourages the EPA to develop this module, several panelists strongly cautioned that BCF/BAF models still have an incomplete treatment of certain factors important in predicting uptake and metabolism. For example, while the Arnot and Gobas (2004) model includes a metabolism term, it is not clear, given experimental difficulties, how accurately this term can be parameterized for different compounds in different biota. For metabolizable chemicals (e.g., aliphatic alcohols or acids that have predicted log Kow values greater than 5 but are readily metabolized), the predictions of BCF and BAF from a model based solely on log Kow can be significantly greater (e.g., one order of magnitude or more) than experimentally determined BCF values. Although this type of phenomenon has been recognized by researchers involved in development of BCFWIN (Meylan et al., 1999), and the module in EPI Suite™ does contain correction factors to attempt to account for metabolism, further work is needed to improve its predictive capability. Some panelists identified related concerns with the development of this module, including:

• Conducting experimental studies for BAF to validate the model is difficult and expensive and such studies have been conducted only for a limited number of substances which are either slowly or not metabolized.
• Within the literature there are wide ranges reported in field measured BAFs (and even BCFs in laboratory studies) that have been obtained for a given chemical.
• Concern was expressed regarding the difficulty in appropriately parameterizing a BAF model for non-recalcitrant chemicals. A correction factor approach alone (as is used in BCFWIN) may still lead to significant errors in prediction for certain substances (or potential errors where measurement data are not available), and novice users may not appreciate the limitations in these predictions.
• There is no widely accepted method for estimating whole body metabolism rates in fish either from first principles (i.e., structure or other properties) or otherwise, although there is considerable research on-going to develop and validate such methods. These efforts include the International Life Sciences Institute/Health and Environmental Sciences Institute (ILSI/HESI) project and recently initiated work by ECVAM. Therefore, even if the user were given the option to enter a metabolism rate, these estimates are not currently available.
• There is the potential for inconsistencies between the outputs of BCFWIN and the potential new BAF modules (e.g., Arnot and Gobas model) that may lead to confusion in the interpretation of the fate of some chemicals, in part because these two models are based on very different approaches. BCFWIN relies on a fitted equation to measured BCF data. The Arnot and Gobas model is based on first principles and, as such, includes hydrophobic partitioning, growth dilution, and metabolism. When differences between the model predictions represent the variability in BCF and BAF data this is acceptable. However, in many cases, the differences will be due to problems in adequately parameterizing the BAF model (e.g., to account for metabolism) and it would be difficult to know that this is the cause of the discrepancy a priori.

The training set used to calibrate the existing model, BCFWIN,
includes studies based on analysis of parent test substance as well as studies based on analysis of total radioactivity. The total radioactivity based BCF cannot distinguish between parent substance bioaccumulation and incorporation of metabolites into the organism as a result of normal catabolic processes (although the Panel recognizes that some metabolites can be of toxicological concern). As a result, the model is trained on data that lack a consistent basis for (Q)SAR development and subsequent decision-making. The BCFWIN database also fails to indicate whether the basis for the BCF is parent substance or total radionuclide analysis. Given the increasing focus on the assessment of persistent, bioaccumulative and toxic (PBT) chemicals in regulatory contexts, the current BCFWIN data set should be critically reviewed, any data that do not meet acceptance standards (e.g., total radioactivity based BCF for metabolized substances) deleted, and new literature data added to provide a consistent basis for an improved "next generation" (Q)SAR.

The existing Japanese "MITI" BCF database provides perhaps the best single source of aqueous fish BCF data that could be included in this effort. The data in the MITI database are based on the OECD 305 bioaccumulation test procedure, which is currently considered by many to be the "gold standard" for these types of tests (http://www.safe.nite.go.jp/english/kizon/KIZON start_hazkizon.html). Compilation of such data also would support the development of (Q)SARs for estimating fish biotransformation potential that could be used as input to BAF models or multimedia exposure models that predict human intake fraction.
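The sensitivity of bioconcentration to metabolism described above can be illustrated with a minimal sketch. It assumes a one-compartment fish model at steady state, in which BCF is the ratio of the uptake rate constant to the sum of the loss rate constants. This is a textbook simplification, not the Arnot and Gobas (2004) parameterization or the BCFWIN regression, and every rate constant shown is a hypothetical value chosen only for illustration.

```python
def steady_state_bcf(k1, k2, k_m=0.0, k_g=0.0):
    """Steady-state BCF (L/kg) for a simplified one-compartment fish model.

    k1  : uptake rate constant from water (L/kg/day)
    k2  : elimination rate constant to water (1/day)
    k_m : metabolic transformation rate constant (1/day)
    k_g : growth dilution rate constant (1/day)
    Other loss routes (e.g., fecal egestion) are ignored in this sketch.
    """
    total_loss = k2 + k_m + k_g
    if total_loss <= 0:
        raise ValueError("the sum of loss rate constants must be positive")
    return k1 / total_loss


# Hypothetical rate constants for a hydrophobic chemical (illustration only).
K1, K2, K_G = 500.0, 0.005, 0.002

for k_m in (0.0, 0.05, 0.5):
    bcf = steady_state_bcf(K1, K2, k_m=k_m, k_g=K_G)
    print(f"k_m = {k_m:5.2f} 1/day -> BCF = {bcf:10.1f} L/kg")
```

With these assumed values, a metabolism rate that is large relative to the other loss terms lowers the computed BCF by more than an order of magnitude, which is the kind of gap between partitioning-only predictions and measured values that the panelists describe.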
The panelists encourage the Agency to participate in and follow the on-going scientific developments in BAF determinations including:

• Additional efforts at experimentally determining bioaccumulation (including better understanding metabolism)
• Improved databases for developing and verifying BAF models
• ILSI/HESI (International Life Sciences Institute/Health and Environmental Sciences Institute) Work Group on Bioaccumulation
• Ongoing modeling research published in the literature

In light of the widespread application of EPI Suite™, before the decision is made to add a new module, such as the BAF module, the Agency should assess, to the extent practical, whether there is consensus in the scientific community that the model has been or can be appropriately parameterized and has been sufficiently verified to be applicable in screening assessments.

More detailed information can be found in the APPENDIX for I-A-i, and related issues are discussed in section 1-C-ii below.

ii. Are there additional sets of existing measured data which should be included in upgrades to EPI Suite™? Are there specific measurements with the potential to improve EPI Suite™ estimates so much that an effort should be made to collect them?

Existing peer-reviewed measurement data sets are available for the following parameters: octanol-water partition coefficients (Kow), Henry's law constants (He), air-octanol partition coefficients (KAO), biodegradation rates, organic carbon partition coefficient (Koc), aqueous solubility, and rates of aquatic hydrolysis. Several panelists noted that updating the chemical training data set used in estimating KOC should be a priority because of the limited amount of data that is currently used to estimate the value of this parameter within EPI Suite™.
The Panel encourages the Agency to expand the functionality of the KOC module to capture the range of organic carbon types that could affect a chemical's fate and transport including: natural vegetation-based, soot, black carbons, non-aqueous phase liquids (NAPL), etc. Appendix 1-B-i identifies additional data sets the Agency might consider.

Because of the Agency's limited resources, the Panel supports a strategic approach to identifying those data sets that require refinement. Criteria that the Agency should consider in prioritizing the updates of chemical property data sets include the following:

• The duration of time since the chemical property data set was last updated
• Level of uncertainty associated with the chemical property estimates
• The domain and quality of the chemical property training set
• Accuracy of chemical property prediction

Several panelists identified scientific proceedings associated with certain highly reputable international conferences and journals such as the J. Phys. Chem. Ref. Data (http://ipcrd.aip.org) as excellent sources of peer reviewed chemical data sets that should be considered for inclusion in upgrades to EPI Suite™.

There are additional sets of measured data that the Agency could consider for inclusion in upgrades to EPI Suite™ pending the Agency's satisfaction with the quality of peer-review received. Some of these are:

• Additional sewage treatment plant (STP) chemical partitioning and fate data. Appropriate sources for this type of data would include, but are not limited to: a) the National Association of Clean Water Agencies (formerly the Association of Metropolitan Sewerage Agencies), b) Water Environment Research Foundation (WERF), c) Water Environment Federation (WEF), and d) Journal of Environmental Engineering and related journals.
• The existing Japanese "MITI" data: While most databases aggregate data from a number of different studies using different methods, the MITI database uses a standard procedure to test a large number of chemicals, including direct measurement of the properties of interest for parent compounds. Some panelists familiar with the database say it provides an excellent source of aqueous fish BCF data.
• Additional sources of Polychlorinated Biphenyls (PCB) congener data sets that are available in the peer-reviewed literature (e.g., Frame et al., 1996a, 1996b).
• Reliable un-published data reported as part of the High Production Volume (HPV) challenge program (http://www.epa.gov/HPV/) or other international regulatory initiatives such as the OECD Screening Information Data Set (SIDS) program (http://www.epa.gov/opptintr/chemtest/pubs/oecdsids.htm).

The Panel agreed that the EPI Suite™ fate and transport modules are limited by the paucity of chemical degradation (e.g., biodegradation and biotransformation processes) data available. Like mass transfer coefficients, chemical degradation information is so important to understanding the fate and transport of chemicals in the environment that, if necessary, the Agency should consider redirecting resources from current programs to address this critical data need. Moreover, there have been a number of recent scientific advances in understanding chemical degradation that merit Agency consideration. For example, an innovative methodology termed the environmental "reagents" approach has been developed for defining the reactive power of environmental compartments. Understanding this reactivity has important implications for the fate of chemicals and should be considered in future upgrades to the EPI Suite™ chemical degradation modules (Green and Bergman 2005).

iii. Are there other capabilities that should be included in upgrades to EPI Suite™?
The Agency is especially interested in the SAB's views on uncertainty analysis and if/how information on how good the estimates are can be conveyed to users.

Uncertainty in Parameter Estimation, Routines, and Predictions

When a PER is used to predict properties for chemicals lying outside the domain of compounds used in the training set for that PER, confidence in the prediction will generally be lower than if the chemical were within the existing domain. The Panel recommends that results in such cases be flagged to highlight for the user the potential uncertainties in the estimated value.

Although the Panel explored a range of views concerning how uncertainty should be conveyed to the EPI Suite™ user, two approaches emerged as the preferred options. Both approaches involve the development of appropriate statistical confidence intervals surrounding a mean value of an estimated chemical property. In the first case, the majority of the Panel recommended that the quantitative uncertainty information be displayed only in HELP files while, in the other, several panel members preferred having the data presented with the module output for each endpoint/test chemical. Advantages and disadvantages of both approaches are summarized in the following:

• Provide information on the confidence range in HELP files:

Advantage: This approach does not require that the Agency defend quantitative estimates, particularly for test chemicals that are outside of the model domain. Moreover, by limiting the availability of the uncertainty discussion to the help file, the Agency reduces the potential for misinterpretation or misapplication of the uncertainty results.

Disadvantage: If not presented more explicitly, the novice user may overlook this information, increasing the potential for misinterpretation or misapplication of the model results.
• Provide the confidence interval in the module output:

Advantage: The Agency and the scientific community are moving toward more explicit acknowledgement and quantification of uncertainty. This approach is consistent with such goals. Moreover, by including quantitative uncertainty estimates with module output, the EPI Suite™ user is compelled to recognize the potential of making decision errors.

Disadvantage: While the complex nature of data uncertainties and modeling uncertainties needs to be communicated, more informative, but potentially more complex, quantitative uncertainty assessment methods present novice users and decision makers with new challenges. Effective incorporation of uncertainty in decisions will not be accomplished with quantitative uncertainty analysis alone.

The Panel encourages the Agency to explicitly acknowledge to the EPI Suite™ user the fact that the quantitative uncertainty estimate for each endpoint/test chemical includes only the statistical error associated with the model prediction and neglects the error in reported experimental measurement values that were used to calibrate the model. To the extent practical, the Agency should provide guidance to the user on the expected data error component for each modeled property.

Uncertainty in Environmental Fate Model Predictions

The Panel recommends that the uncertainty associated with the EPI Suite™ fate model (i.e., EFM) be better conveyed to the user. The Panel identified the following sources of EFM uncertainty:

• Model structure
• Model parameters (e.g., chemical properties, mass transfer coefficients, etc.)
• Media compartment(s) including type, size and distribution

Panel deliberations included consideration of various approaches to effectively convey uncertainty to the EPI Suite™ user. The following list summarizes the range of approaches discussed by the Panel together with their potential advantages and disadvantages.
• Model output details could remain in their current form, while the documentation could more fully describe the input parameter range and limitations of the evaluative fate models.

The EFM modules in EPI Suite™ are designed to produce "evaluative" predictions. The media compartments reflect generic environmental scenarios such as the "unit world". The term evaluative is used to describe an output that is interpreted to be of relative significance and/or order-of-magnitude rather than a precise numerical result. The major (i.e., 1st order) sources of output uncertainty are associated with the ascribed media of chemical entry. For example, significantly different media concentration predictions will result if the chemical is "emitted" into the air compartment rather than the water compartment. Clear data/information in the PMN as to the choice of media for chemical entry is needed. In addition, cautions/alerts as to the high level of output variability resulting from media entry choice need to be placed in the documentation, as understanding this variability is key to controlling this source of EFM output uncertainty. Experience with such models indicates that input variations in chemical properties and MTCs result in 2nd order levels of EFM output uncertainty (Webster et al., 1998).

Advantages: Simplicity and consistency in interpretation of fate model output.

Disadvantages: Only presenting uncertainty information in the help section assumes that the user will read this section. Even if this section were read, there is no guarantee that the scientific or regulatory implications of uncertainty will be fully understood.

• Give qualitative information regarding the uncertainty associated with model results based on the range of the chemical property values.

An example of such an approach is illustrated by describing a chemical's distribution using a KOA versus KOW diagram.
Construction of such a plot will depict the distribution of the chemical with respect to the various environmental phases, e.g., air, water or soil/sediment. EPI Suite™ should provide explanatory text that clearly informs the user that the relative media compartment sizes, inter-compartmental chemical mass transfer rates and the media compartment into which the chemical is released will affect the model predictions of the chemical's allocation between media compartments. Moreover, if a chemical were associated exclusively with a single medium, uncertainty in the partition coefficients would have a minimal impact on the chemical's allocation between compartments (as compared to those chemicals that are distributed between phases).

Advantages: The user will receive qualitative information regarding the potential sensitivity of model output to physical-chemical properties as it relates to environmental fate. This approach provides yet another level of screening whereby a chemical that does not clearly lie exclusively within a specific environmental compartment may merit further investigation (based on environmental partitioning concerns alone).

Disadvantages: Development of a robust method for determining and presenting this information represents a considerable technical challenge.

• Calculate error propagated from estimates of physical-chemical properties and fate models, i.e., input 95% confidence limits or qualitative confidence factors from each estimated physical-chemical property to obtain a range of fate results (MacLeod et al., 2002).

MacLeod et al. (2002) present a simple, semi-quantitative method for calculating error propagated through environmental fate models. Several panel members supported this approach over the computationally demanding Monte Carlo simulation where the required number of model iterations can be significant (e.g., > 2000 iterations).
The semi-quantitative approach provides a simple view of the range of values that could be expected based on user-defined uncertainties associated with input parameters, where uncertainty is expressed as a multiplicative factor.

Advantages: With this method, the user generates an estimate of the distribution of the model output for each chemical in the various media compartments. Use of this approach assumes that the user will have an estimation of the uncertainty associated with the model inputs.

Disadvantages: The uncertainty associated with other factors (e.g., mass transfer coefficients and media of chemical emission) may be of more importance in interpreting modeling results, particularly given that the intent of these models is often to be evaluative (screening use) in nature.

Finally, the Panel supported a more explicit description and justification for the Agency's selection of EFM parameter default values. This information, which should be easily accessible to the EPI Suite™ user, must provide sufficient detail of the environmental media that the default values purport to represent (e.g., temperate or arid terrestrial system).

iv. Are there other estimation methods that should be considered in upgrading EPI Suite™?

The Panel was able to identify several innovative methodologies that have the potential to enhance both the accuracy and scope of the EPI Suite™ modules. These methodologies include the: a) least squares adjustment of chemical properties approach (Schenker et al., 2006), b) polyparameter linear free energy relationship approach (Goss et al., 2003; Nguyen et al., 2005), and c) the use of molecular polarizability to predict vapor pressure and KOA (Staikova et al., 2004). In addition, the Panel encourages the Agency to partner with other stakeholders to establish a forum (e.g., technical workshop, interagency workgroup, etc.)
to evaluate the various methodologies available for mapping chemical domains in support of future (Q)SAR development and innovations in fate modeling.

B. Method accuracy and validation

i. Is the accuracy of the modules in the EPI Suite™ sufficient for its various specified uses?

EPI Suite™ is a screening tool that supports Agency risk-based decisions regarding new and existing chemicals. EPI Suite™ outputs are generally found to be within an order of magnitude of measured values, an accuracy standard that has been deemed sufficient by the Agency for defensible decision-making at the screening level. Since many users may not recognize the range of accuracy associated with EPI Suite™ output, the Panel encourages the Agency to electronically post a detailed disclaimer that clearly identifies the recommended uses of the current version of the EPI Suite™ software.

Although the accuracy of EPI Suite™ varies depending on endpoint, the Agency staff described EPI Suite™'s design as intended to provide "best estimates," and in the view of some panel members, the screening level models used for assessing exposure are generally designed to be conservative. The reason for this is that, for a screening level assessment, the Agency generally develops estimates that are conservative (protective). Such conservatism minimizes the probability of users making decision errors based on module output. While minimizing false positive decision errors improves the effectiveness with which the Agency uses its scarce resources, minimizing false negative decision errors also establishes greater confidence that Agency decisions based on EPI Suite™ output will be sufficiently protective of the environment.

Concerning application of EPI Suite™ output, greater transparency in describing the process by which decision errors are considered in regulatory decision-making would more effectively communicate environmental assessment decisions.
By explicitly defining the acceptable level of false negative and false positive decision error rates within each regulatory program that uses EPI Suite™ module output, the Agency would make the basis for its decisions more easily understood.

In describing EPI Suite™'s level of quality assurance, the Agency confirmed that EPI Suite™ was in full compliance with the EPA's Information Quality Guidelines (USEPA 2002)¹. The Agency has stated that extensive software security precautions have been fully integrated into EPI Suite™ to prevent the possibility of unauthorized algorithm modification. Moreover, the use of scientifically defensible (Q)SARs within the individual modules ensures that the software output is presented in a complete and unbiased manner.

The three basic steps employed by the Agency in developing the EPI Suite™ software include the following:

• Model Development: This step includes: a) defining the Agency programmatic needs, b) scientific evaluation of the peer-reviewed literature, c) developing and testing the theoretical concept that supports the model, and d) developing and documenting the (Q)SAR(s).
• Model Evaluation: This step includes: a) evaluating the (Q)SAR(s) and their intermediate output, b) evaluating the model results against peer-reviewed measurement data, c) providing basic quality assurance/quality control checks, d) alpha testing the model to ensure that it performs as designed, e) beta testing the model by independent users, and f) facilitating peer review of the (Q)SAR by the scientific community.
• Model Application: This step includes evaluating and documenting the data quality and model performance limitations to ensure that users will apply the model appropriately.

At the present time, there are relatively few systematic evaluations of the training data sets for EPI Suite™ modules.
The Panel strongly recommends that the Agency establish a data quality oversight program that monitors, critically evaluates and incorporates new peer-reviewed measurement data as well as new modeling approaches.

Several innovative methodologies offer potential opportunities to improve the accuracy and broaden the scope of EPI Suite™ software. These include the:

• least squares adjustment of chemical properties approach (Schenker et al., 2006),
• polyparameter linear free energy relationship approach (Goss et al., 2003; Nguyen et al., 2005), and
• use of molecular polarizability as a predictor of physical-chemical properties (Staikova et al., 2004).

¹ As described in the Council for Regulatory Environmental Models Guidelines (USEPA 2003), EPA's Information Quality Guidelines (USEPA 2002) define quality as a broad term that includes the concepts of integrity, utility, and objectivity. The Guidelines state that "integrity refers to the protection of information from unauthorized access or revision to ensure that it is not compromised through corruption or falsification. In the context of environmental models, often integrity is most relevant to protection of code from unauthorized or inappropriate manipulation. Utility refers to the usefulness of the information to the intended users. Objectivity involves two distinct elements, presentation and substance. Objectivity includes whether disseminated information is being presented in an accurate, clear, complete and unbiased manner. In addition, objectivity involves a focus on ascertaining accurate, reliable and unbiased information."

EPI Suite™'s data quality should be evaluated at regular intervals (e.g., at least annually). Updates to individual modules should be documented for technical comment and use by the user community. Currently, the Agency has other software packages (e.g., SPARC) at its disposal whose output may be compared to selected output from EPI Suite™.
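A cross-check of this kind can be sketched in a few lines. The example assumes that each tool's estimates have already been exported as log10 values keyed by a chemical identifier. It does not call EPI Suite™ or SPARC. The function name, identifiers and numbers are hypothetical, and the one-log-unit tolerance simply mirrors the order-of-magnitude screening standard mentioned earlier in this report.

```python
def flag_discrepancies(log_values_a, log_values_b, tolerance=1.0):
    """Return {chemical: difference} where two sets of log10 estimates
    differ by more than `tolerance` log units.

    Only chemicals present in both inputs are compared.
    """
    flagged = {}
    for chemical in sorted(set(log_values_a) & set(log_values_b)):
        difference = log_values_a[chemical] - log_values_b[chemical]
        if abs(difference) > tolerance:
            flagged[chemical] = round(difference, 2)
    return flagged


# Hypothetical log Kow estimates from two tools (illustration only).
tool_a = {"chem_A": 3.2, "chem_B": 5.9, "chem_C": 1.1}
tool_b = {"chem_A": 3.0, "chem_B": 4.4, "chem_D": 2.0}

print(flag_discrepancies(tool_a, tool_b))
```

A flagged chemical does not indicate which estimate is wrong. It only marks a case where the two tools disagree enough to merit a look at the underlying training data.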
For EPI Suite™ users, the following quality assurance information would be helpful in evaluating and characterizing individual module output:

• Provide a detailed description of the module chemical training set domain.
• Flag output when the chemical and associated physical-chemical properties are outside the training set domain.
• Furnish the range of experimental data used in the module chemical training set in addition to the selected value used in calculations.
• Provide statistical comparison of results using estimated and experimental data.
• Identify any chemical fragments that are not captured by the Simplified Molecular Input Line Entry System (SMILES) algorithm within the module output.
• Identify those chemicals or class of chemicals that have been placed on the 'potential problem' list under Toxic Substances Control Act (TSCA).
• Within the help files, module accuracy or method error should be fully discussed.
• A description of how default parameters or data were selected should be provided.

The Panel recognizes the importance of the availability of high quality, peer-reviewed measurement data as the basis for EPI Suite™ modules. Therefore, the Panel encourages the Agency to upgrade the current set of EPI Suite™ modules to include as much peer-reviewed measurement data of a credible and known quality as possible and remove, where justified, data of lower or unknown quality. Moreover, the Agency should develop a programmatic framework that would facilitate the systematic evaluation of data quality obtained from both intra-Agency and inter-Agency sources. The goal of these activities is to develop improved chemical data training sets of known quality for each of the properties estimated by EPI Suite™.

More detailed information can be found in APPENDIX I-B-i.

ii. Have the modules been adequately validated, and have they been published in the peer-reviewed technical literature or elsewhere?
While no module is ever completely validated, the Panel agreed that the EPI Suite™ modules have, for the most part, been satisfactorily evaluated. The scientific underpinnings of each of the compartment modules have been appropriately vetted in the peer-reviewed scientific literature and the physical-chemical property (Q)SARs have been found to satisfy the OECD principles for (Q)SAR validation. The five OECD principles established for (Q)SAR validation (OECD 2004) are summarized as follows:

• Principle 1: Defined endpoint
• Principle 2: Unambiguous algorithm
• Principle 3: Defined domain of applicability
• Principle 4: Appropriate measures of goodness of fit (e.g., coefficient of determination, R²)
• Principle 5: Mechanistic interpretation

OECD Principle 1 requires that (Q)SARs should have a defined endpoint. Most EPI Suite™ modules conform to this requirement. The endpoint for the biodegradation module (BIOWIN) is less clear because certain aspects of the module (e.g., primary degradation) could range from a minor change in chemical structure (e.g., loss of one halogen, change from one unsaturated to saturated bond in a complex structure) to full mineralization of the chemical. The user should fully recognize that, because of the inherent complexity of the degradation process, ascribing a consistent primary degradation endpoint under all possible environmental conditions may not be feasible. Some panelists commented on the inconsistency in the underlying training data used for calibration of the BCFWIN module (e.g., inclusion of studies involving both parent substance as well as non-parent specific radiotracer studies).

OECD Principle 2 has been consistently achieved by the EPI Suite™ (Q)SARs. Most EPI Suite™ modules are relatively transparent in their design and construction. An overview of their structure and development is provided in the user guide and in the published peer-reviewed literature.
The one notable exception to this finding is the biodegradation module (BIOWIN), whose structure and parameterization are less transparent. The Panel strongly recommends that the Agency better define the design, structure and data quality implications of the BIOWIN module. Definition of the environmental medium to which the BIOWIN module output results apply would be a valuable first step. Furthermore, the scientific justification for the scaling rules used to extrapolate results from BIOWIN estimates associated with aqueous environments to soil and sediment should be fully described in the Help files. Finally, the Agency should fully describe the sensitivity of module output when chemical removal through various abiotic processes is prevalent (e.g., sorption, hydrolysis, chemical oxidation, etc.).

EPI Suite™ modules are generally consistent with OECD Principle 3. However, the Panel noted that module predictions are less reliable for chemicals that are outside of the chemical training set domain. Moreover, for modules that have multidimensional interpolation domains (i.e., models that use atom/fragment components, e.g., KOWWIN), determining the actual interpolation domain is not trivial. A recent peer-reviewed publication evaluated the domain of the chemical training data set utilized by KOWWIN. This work proposes a novel approach for defining the multi-dimensional space that describes the chemical data training set (Nikolova-Jeliazkova et al., 2005). The Panel encourages the Agency to explore this and other scientific approaches suitable for defining the chemical training set domains for EPI Suite™ modules. The ultimate goal, of course, is to develop a scientifically defensible process by which chemicals are selected for inclusion in the chemical training set domain.
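To make the idea of a training set domain concrete, the following is a deliberately simple sketch: a check of descriptor ranges plus a list of fragments seen in training. It is not the method of Nikolova-Jeliazkova et al. (2005) and not something EPI Suite™ is known to implement. The class, descriptor names, fragment labels and numeric ranges are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrainingDomain:
    """Hypothetical summary of a module's training set."""
    descriptor_ranges: dict  # descriptor name -> (minimum, maximum)
    known_fragments: set     # fragment labels present in the training set

    def check(self, descriptors, fragments):
        """Return a list of warnings; an empty list means no flag was raised."""
        warnings = []
        for name, value in descriptors.items():
            if name not in self.descriptor_ranges:
                warnings.append(f"{name}: no training range recorded")
                continue
            low, high = self.descriptor_ranges[name]
            if not low <= value <= high:
                warnings.append(
                    f"{name}={value} outside training range [{low}, {high}]"
                )
        unknown = sorted(set(fragments) - self.known_fragments)
        if unknown:
            warnings.append("fragments not in training set: " + ", ".join(unknown))
        return warnings


# Hypothetical domain and query chemical (illustration only).
domain = TrainingDomain(
    descriptor_ranges={"mol_weight": (18.0, 720.0)},
    known_fragments={"CH3", "CH2", "OH", "aromatic C"},
)
print(domain.check({"mol_weight": 950.0}, ["CH3", "Si-O"]))
```

A range check of this kind treats each descriptor independently, so a chemical can pass it and still sit in a sparsely populated region of the multidimensional space. That limitation is part of why the Panel describes the interpolation domain as non-trivial to determine.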
Moreover, based on the insight developed through this approach, priorities can be established to target new data collection that efficiently expands the model domain for substances of regulatory importance.

In general, the EPI Suite™ modules are consistent with OECD Principle 4. External evaluation of an EPI Suite™ module using query chemicals with known properties is the standard procedure for assessing (Q)SAR reliability. External evaluation has produced adjusted R² values of approximately 0.75, a value that is considered satisfactory for regulatory screening level chemical evaluation. A few of the EPI Suite™ modules (e.g., BCFWIN, HYDROWIN, etc.) do not appear to have had external evaluation. The Panel strongly encourages the Agency to scan the peer-reviewed literature to determine whether external evaluation of these modules has occurred and, if so, whether the data quality is suitable for supporting upgrades to EPI Suite™.

Those EPI Suite™ modules which are not regression-based routines do not conform to OECD's Principle 5. However, the EFM modules are mechanistically based and are adequately described in the Help files.

iii. Are some modules more accurate/better validated than others, and if so, which need more work?

Most of the EPI Suite™ modules have been evaluated sufficiently to support regulatory decision-making. However, all modules would benefit from improved domain mapping, which would allow, amongst other things, the user to determine a priori the suitability of a particular module to reliably estimate a given physical-chemical property for a specific chemical.

Of the EPI Suite™ modules that require additional validation/evaluation beyond that already discussed in the response to the preceding question, the organic carbon partition coefficient model, PCKOCWIN, is a priority because this module was developed twenty years ago (1986) and has yet to be revised. Presently,
KOC estimation routines use molecular connectivity indices (MCIs) and correction factors based on structural features of the chemical. MCIs are generally not widely used or accepted by (Q)SAR developers because MCI mechanistic information is difficult to interpret. Finally, the database of KOC values used to develop the present version of the PCKOCWIN module is not as large and inclusive as for other EPI Suite™ modules.

iv. To the extent that modules work together to generate estimates, do they do so correctly?

EPI Suite™ modules work together to generate scientifically defensible estimates of the physical-chemical properties of chemicals. However, the transfer of data between modules requires further refinement. The Panel encourages the Agency to explicitly describe the protocol (and hierarchy) that govern the passing of physical-chemical property module output to the chemical fate and transport modules. For example, the user may want to know whether a measured physical-chemical property value is used preferentially over a chemical property module prediction in fate and transport modules and the implications of either choice (e.g., advantages of using presumably more accurate measured data over the advantage of using an internally consistent set of physical-chemical properties when estimating chemical fate (e.g., Beyer et al. 2002)). To improve transparency in describing module interaction, module inputs as well as outputs should be provided as part of the EPI Suite™ results. Moreover, the Panel strongly supports separating the physical-chemical property estimation modules from the fate modules, such that the fate modules can be executed independently. With respect to module default values for certain parameters (e.g., mass transfer coefficients, media compartment volumes, deposition parameters), the Panel endorses greater user-customization capabilities including the option for batch mode processing with user-defined inputs.
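The kind of explicit, documented hierarchy the Panel asks for can be sketched in a few lines of Python. This is an illustration only: the priority order (user-entered, then measured, then estimated), the function names, and the record layout are assumptions made for the example and do not describe how EPI Suite™ actually passes values between modules. The two log Kow values are taken from the n-alkane table in this section.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical priority order. The protocol EPI Suite really follows is
# exactly what the Panel asks the Agency to document.
PRIORITY = ("user", "measured", "estimated")


@dataclass(frozen=True)
class ResolvedProperty:
    name: str
    value: float
    source: str  # one of PRIORITY


def resolve_property(name: str,
                     user: Optional[float] = None,
                     measured: Optional[float] = None,
                     estimated: Optional[float] = None) -> ResolvedProperty:
    """Return the highest-priority available value, tagged with its source."""
    candidates = {"user": user, "measured": measured, "estimated": estimated}
    for source in PRIORITY:
        value = candidates[source]
        if value is not None:
            return ResolvedProperty(name, value, source)
    raise ValueError(f"no value available for {name}")


def report(resolved):
    """Format the inputs handed to a fate module so the user can audit them."""
    return [f"{r.name} = {r.value} ({r.source})" for r in resolved]


if __name__ == "__main__":
    inputs = [
        resolve_property("log Kow, n-octane", measured=5.18, estimated=4.27),
        resolve_property("log Kow, n-nonane", estimated=4.76),
    ]
    for line in report(inputs):
        print(line)
```

Tagging every value with its source is what would let the results list module inputs alongside outputs, as recommended above. Swapping the order of `PRIORITY` is enough to explore the alternative of preferring an internally consistent set of estimated properties.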
The Panel found that for some modules, inconsistent results can be obtained for a homologous series of compounds where predictions rely on values for other PER parameters in EPI Suite™. For example, the estimated BCF values for five compounds in the n-alkane series, based on either experimental or predicted log Kow values, are given below.

Table X: Octanol-Water Partition Coefficients and Estimated Bioconcentration Factors for Several n-Alkanes Derived from EPI Suite

Compound      Log Kow*                     BCF
              Experimental   Predicted
n-octane      5.18           4.27          1944
n-nonane      NA             4.76          93
n-decane      5.01           5.25          144
n-undecane    NA             5.74          528
n-dodecane    6.10           6.23          314

*Bolded values used by EPI Suite to predict BCF

As seen above, the predicted log Kow values show a predictable pattern of increasing hydrophobicity with increasing chain length. However, BCF values do not show this pattern: the shortest chain compound (n-octane) with the lowest predicted log Kow (and an experimental value intermediate between two other experimental values for higher molecular weight alkanes) produces the highest predicted BCF. This pattern is not undone by manually entering an experimental value. For example, entering a log Kow of 5.18 for n-nonane gives a predicted BCF (based on that value) of 194, still an order of magnitude lower than the predicted BCF for n-octane, with an identical experimental log Kow. It appears further work may be needed in development and use of correction factors employed to estimate BCF in EPI Suite.

The common option that allows the user to enter the CAS number of a chemical to obtain the corresponding SMILES string is a convenient feature of all EPI Suite™ modules. However, it appears that a number of commercial substances that are not unique structures (i.e., Unknown, Variable Composition and Biologicals; UVCB) are included in the database as single representative structures. There are two principal concerns with this approach.
First, it is unclear from the user guide how representative structures have been selected. Second, it is uncertain if predictions derived from unique structures can be reliably extrapolated to characterize the actual complex substance. To illustrate this concern, the representative structure for CAS number 68526-86-3 (Alcohols, C11-14-iso-, C13-rich) is shown below.

[Structure diagram not reproduced]

This isomeric alcohol mixture is reacted with phthalic anhydride to produce CAS number 68515-47-9 (1,2-Benzenedicarboxylic acid, di-C11-14-branched alkyl esters, C13-rich).

[Structure diagram not reproduced]

The representative structures selected for these two chemicals are inconsistent since they reflect different alkyl chain branching. Moreover, such arbitrary differences in selection of representative structures can yield misleading predictions for some key endpoints (e.g., biodegradation). To address this concern, the user could first be alerted by EPI Suite™ to the fact that the chemical under consideration is complex and may not have a unique structure and that physical-chemical property predictions may be less certain than for a unique chemical.

C. Estimation Methods and Alternates

i. Are the estimation methods in the EPI Suite™ up-to-date and generally accepted by the scientific community for its various uses?

The Panel concluded that the current estimation methods used in the EPI Suite™ modules are generally accepted by the scientific community. However, the methods are at risk of becoming outdated as data and practice advance, particularly with regard to the data included in the module training sets. For this reason, the Panel encourages the Agency to evaluate whether the incorporation of newer statistical approaches (e.g., logistical modeling) would increase the accuracy of module prediction. A detailed summary of the relevance and general acceptability of EPI Suite™ estimation methods is provided in the following bullets.
• Up-to-date: The underlying data and statistical models are generally not up to date. The Agency should consider incorporation of new data sets and newer statistical analysis tools to optimize the accuracy of the modules. Linear regression may not always be the optimal statistical model for physical-chemical property estimation.

• Acceptance by the scientific community: Those in the scientific community who understand the role and accuracy limitations of screening models used in regulatory decision-making generally accept the EPI Suite™ module results for many classes of organic chemicals. The EPI Suite™ modules are also generally accepted among regulators. EPI Suite™ modules have been accepted by the OECD and are being tested for implementation in relation to high production volume (HPV) chemicals and the Globally Harmonized System (GHS) for classification and labeling of chemicals by OECD. At the request of the United Nations Sub-Committee of Experts on the GHS, the OECD is developing proposals for classification criteria and labeling of chemicals according to the health and environmental hazards they may present. A Task Force on Harmonization of Classification and Labeling has been established to coordinate the technical work carried out by the experts. OECD typically assigns a reliability code of 2 (valid with restrictions) to EPI Suite™ estimates. Moreover, the extensive peer-reviewed documentation that supports the use of EPI Suite™ (Q)SARs, as well as the large number of evaluation (validation) studies published, demonstrates that EPI Suite™ complies with EPA information quality guidelines (USEPA 2002).

• Use in assessments: Within the wider scientific community there is some confusion about whether EPI Suite™ module output is appropriate for full risk assessment or hazard assessment.
However, in general, those experts who understand that the EPI Suite™ modules are evaluative by design, serving as hypothesis generators and first-tier predictions of a chemical's fate when the alternative is no data at all, support the predictive functionality that the modules provide.

More detailed information can be found in the APPENDIX for 1Ci.

ii. Are there other estimation methods that should be considered in upgrading EPI Suite™?

Owing to the breadth of this charge question, the Panel's response was two-fold. The first part of the Panel's response is focused on estimation methods that are applicable primarily to new physical-chemical properties (i.e., those that are not currently available within EPI Suite™). The second part of the Panel's response describes the development of methods/approaches that could be used to more effectively estimate properties that are currently available in EPI Suite™.

With respect to new additional physical-chemical properties, the Panel identified the following as important for expanding the accuracy and scope of EPI Suite™ for organic compounds:

• pKa
• Influence of pKa on other physical-chemical properties
• Temperature dependency of all physical-chemical properties
• KOA
• Bioaccumulation factors for root plants, leaf plants, fish and terrestrial organisms (e.g., meat and milk transfer factors)
• Diffusion coefficients in various environmental media
• Metabolism and production of stable chemical intermediates
• Neutral hydrolysis
• Activity coefficients
• Sub-cooled liquid vapor pressure and aqueous solubility
• Surface tension
• Anaerobic biodegradation potential
• Ozone depletion potential, greenhouse gas potential, and maximum incremental reactivity (MIR) for assessing ozone formation potential.

With respect to EFMs for wastewater treatment, EPI Suite™ currently includes predictions for only a default conventional activated sludge system.
Future enhancements should provide options for user-defined treatment systems (e.g., tank dimensions, fine versus coarse bubble diffusers) as well as alternate treatment designs (e.g., aerobic lagoons).

Several panel members offered the following list of additional chemical properties specifically related to the toxicity and fate of polymers that the Agency may consider adding to EPI Suite™:

• Glass transition temperature
• Crystal melt transition temperature
• Elastic mechanical properties like bulk modulus
• Viscosity measures
• Heat capacity
• Cohesive energy
• Flammability
• Parameters (e.g., degradation rates) influencing environmental persistence

With regard to improving the accuracy of predictions of those physical-chemical properties currently available within EPI Suite™, the Panel identified the following new approaches:

• The Agency should consider the use of poly-parameter linear free energy relationships (poly-parameter LFERs) and neural networks in module optimization as well as partial least squares and support vector machine methodologies in data fitting.
• In those cases where multiple modules exist that are capable of predicting the value of the same physical-chemical property, consensus modeling should be conducted. If all modules for estimating a given property for a particular chemical agree, there is a high level of confidence associated with the property estimation. Conversely, if the module results vary widely, the reliability of the property prediction is uncertain.
• To the extent that the Agency can document data quality, the Agency should consider moving from two dimensional to three dimensional chemical structure based methods.

Additional comments relating to this topic can be found in section 1-A-i above.

2. Functionality

A. How convenient is the software and does it have all the necessary features?
Although the software is convenient to use, significant improvements should be made to enhance the appearance, navigability and quality of technical support provided by the EPI Suite™ software. The following bullets summarize the technical recommendations.

• Currently, the individual property estimation and fate modules cannot be launched from the EPI Suite™ interface. The Panel supports greater program flexibility that would allow software users to launch individual modules directly from the user interface, with appropriate indication of options for entering data or utilizing values generated by EPI Suite™ to run the modules.
• To ensure that software users are cognizant of the quality assurance limitations associated with module output, individual modules should alert the user when a chemical's physical-chemical properties are outside the chemical training set domain.
• Although EPI Suite™ operates on a Windows™ platform, the graphical user interface (GUI) has an archaic DOS appearance. The Panel encourages the Agency to upgrade EPI Suite™'s GUI to reflect a more typical Windows™ operating system environment.
• To minimize the loss of data when new versions of EPI Suite™ are released, the Panel recommends that the new version installation program not delete chemical data input by the user but, rather, only overwrite older versions of the EPI Suite™ software itself.
• To address the myriad of data reporting requirements, the Panel recommends that users have the option of saving output files in various formats (e.g., Word™, WordPerfect™, Excel™, etc.).
• Providing greater flexibility for inputting data files in batch mode (e.g., provision of a screen that allows EPI Suite™ users to simply "cut and paste" Chemical Abstracts Service (CAS) Registry Numbers or SMILES notations) would increase efficiency.
• EPI Suite™ EFM module users would benefit from having access to a simple flow chart that clearly describes the data processing steps that result in generating environmental fate model output.
• To enable users to access various data sets simultaneously, the EPI Suite™ program should allow minimization of all screens.
• To reduce confusion when saving a chemical name run (via Save User), it would be helpful if the program used as a default the full chemical name (or a truncated version), rather than the most recently saved name.
• To improve program navigability, all parameters should be located in a single location rather than having some parameters placed in the "Functions - Other" category.
• The default option for displaying module results should be the full output results category rather than simply furnishing the summary output results.
• Use of color-coded text to distinguish experimental values from predicted values or to alert users of chemicals whose properties were outside those contained in the module's chemical data training set would help to minimize misinterpretation of results.
• When inputting a chemical based on SMILES notation alone, the chemical name should be displayed in both the data entry screen and in the output file.
• In the AOPWIN module, EPI Suite™ should specify the environmental conditions that are associated with the default concentrations of hydroxyl radical and ozone and allow user input of alternative hydroxyl radical and ozone concentrations.
• Clarify the units used in the EPI Suite™ module PCKOCWIN.
• BIOWIN Help information should clearly state the conditions which pertain to this program's estimates (e.g., aqueous slurry) as well as decision rules for extension of BIOWIN results to other media (e.g., sediment, soil).
• More details regarding the structure, function and parameterization of the WVOLWIN module should be provided in the Help files.
For example, it is unclear what default values are being used for air and water temperature, water advective flow, depth of water, etc.
• For the sewage treatment plant module (STPWIN), the Help files fail to provide the default plant operating conditions. Temperature of water, whether the plant has only secondary treatment or includes tertiary treatment as well, solid retention time for the activated sludge systems, etc. should be provided in the Help files.
• Since the output of AOPWIN and the Level 3 fugacity module is sensitive to mass transfer rates as well as degradation/transformation rates, the default values (and their associated temperature dependency) should be provided in the Help files or in an appendix in the user guide.
• Experimental data that may be available for a specific structure are not provided for some endpoints (e.g., BIOWIN, BCFWIN).
• Entering air advection times in hours is not intuitive. Users should have the option of entering wind speed instead.
• On the KOC tab, it is impossible to determine whether the module uses the KOW method, as KOW is not a calculated property in the results.
• In EPI Suite™ module results, it would be preferable to list experimental values in the same order as predicted values are given (i.e., boiling point, melting point, and vapor pressure). For the example of lindane, there seems to be a problem with experimental results for melting point and boiling point (i.e., values in wrong order).
• In the half-life selection module (LEVEL3NT) the user is not allowed to specify a model estimate or a selected value for air, which is an option for the other environmental media.
• For those physical-chemical properties for which two or more methods are currently available, EPI Suite™ should allow the user to select which module they would prefer to use (e.g., water solubility).
• The reference feature should be enhanced by allowing the user to easily access individual references (including a brief abstract) through addition of a simple pop-up window.
• For key references, EPI Suite™ should provide links to web pages where pdf versions of the documents can be accessed, if available.
• The Help files should contain the list of all references used in developing the predictive models.
• When modeled property estimates are passed on to other modules (e.g., fate and transport modules), the EPI Suite™ program should identify to the user the values that are passed as well as provide clear documentation in the user guide of the protocol used to establish data transmission priority. This is especially important when there is more than one method available for estimating a particular property, e.g., Henry's law constant.
• Where the EPI Suite™ Help files explicitly indicate that certain chemicals have been excluded in the database (e.g., CAS Number database), supporting explanation should be provided.
• Help files and other documentation should be regularly checked by the Agency for typographical errors.
• The Agency should consider adding a "comments" facility to the EPI Suite™ to enable receipt and incorporation of feedback from users such as identification of errors and recommendations.

B. Are there places where EPI Suite™'s user guide (and other program documentation) does not clearly explain EPI's design and use? How can these be improved?

The user guide should more clearly identify the modules which can be executed independently and the features available for a particular module when executed alone. The stand-alone modules could be identified in a separate highlighted section. Some features are unavailable when executed as part of EPI Suite yet can be accessed in stand-alone operations. The "Experimental Value Adjusted" option in KOWWIN and HENRYWIN is an example.
In addition, separate sections in the user guide should incorporate increased discussion of training set domains and uncertainty in predictions, as noted elsewhere in this report.

The Panel agreed that the EPI Suite™ user guide provided a clear and succinct description of the design and use of the software. However, the Panel noted that the documentation quality was uneven, with many sections supported by detailed references while others were noticeably devoid of such support. Moreover, the Panel was unanimous in its recommendation that the EPI Suite™ software should allow users to easily download and print a copy of the user manual as a stand-alone document. With respect to general improvements for the user guide, the Panel recommends that the Agency develop a separate detailed guide for activities or functions common among the various modules (e.g., how to import chemicals through the SMILES notation, function keys and buttons, use of results and structure windows, etc.) as well as a quick start guide for experienced users. Finally, the guide should clearly describe those modules that predict chemical properties based on the output from other modules (e.g., use of Kow output to predict bioconcentration factors through BCFWIN).

C. Are there aspects of the user interface (i.e., the initial, structure/data entry screen; and the results screens) that need to be corrected, redesigned, or otherwise improved? Do the results screens display all the desired information?

The Panel applauds the multi-faceted functionality of the EPI Suite™ user interface. However, the Panel is of the unanimous opinion that EPI Suite™ does not take full advantage of the opportunities provided by a Windows™ environment. Moreover, while there are many positive features associated with the EPI Suite™ user interface, including its documentation and HELP file availability, there are also opportunities for substantial improvement.
Recommendations for improving the overall functionality of the user interface could include the following:

• The format for module output should be user defined and include the following Windows™-based display options: Excel™, WordPerfect™ and/or Word™ file.
• When multiple measured values are available within a module, the user should have the option to select which measured value is applied in the calculations.
• Under the fugacity tab, the input screen should identify the source of module input(s) as well as what algorithms are being executed.
• Because a user can enter data through either a SMILES string or the chemical name, the screen could more clearly indicate that both options are possible.
• The "previous" button has limited functionality and does not seem to work in all scenarios. The "previous" option should allow the user to return to a chemical when evaluating multiple chemicals and ideally recall more than simply the most recent chemical evaluated.

D. Currently one enters EPI Suite™ using SMILES and CAS; are there other ways to describe the structure (e.g., ability to input a structure by drawing it), that should be added?

The SMILES structure and Chemical Abstracts Service (CAS) registry number input options are adequate to describe and query chemical structures. However, the addition of an input drawing program would extend the utility of the EPI Suite™ to users who are unfamiliar with the SMILES notation. Alternatively, the user could be directed towards commercial packages to assist in the derivation of SMILES structures. When CAS registry numbers are unavailable, users typically prefer to draw their structures rather than use a string language input. Moreover, the use of SMILES may limit the users of EPI Suite™ to those with basic knowledge of organic chemistry. Inclusion of a two-dimensional [2D] structure drawing program in addition to SMILES will be valuable to users with limited knowledge of organic chemistry.
It is also useful to highlight to current users that a structure drawn in commercial software packages (e.g., CambridgeSoft's ChemDraw™) can be copied and pasted directly into EPI Suite™.

The Panel does not recommend that the Agency attempt to develop its own structure drawing program but, rather, purchase/license one of the many commercially available software packages. There are several computer-based chemical drawing packages that generate SMILES or other 2D [and 3D] structure tables. The Panel noted the following observations that support utilizing commercially available software packages:

• Most commercially available software packages are generally accepted by the scientific user community.
• Programs like those offered by Elsevier's MDL and ChemDraw™ have options to execute batch mode operations as well as read and write structure-file interfaces to other commercial software.
• The MDL software package has a module that effectively models chemical properties of linear polymers.
• Both the MDL (1) and ChemDraw™ (2) software packages can effectively draw isomeric structures and have the ability to interface to three dimensional [3-D] chemical structure generating programs.

E. EPI Suite™ has many convenience features, such as the ability to accept batch mode entry of chemical structures, and automatic display of measured values for some (but not all) properties. Are there other features that could enhance convenience and overall utility for users?

While there are a number of features in EPI Suite™ that increase the convenience of the program (including multiple modes of identifying a chemical of interest in the input, and allowing for user-specified input parameters), the Panel has recognized that the program interface should balance convenience with transparency, including characterization of the uncertainty associated with model output.
In other words, while the Panel is cognizant of the importance of usability in executing the EPI Suite™ programs, convenience should not come at the expense of providing users with a better sense of how estimated values are derived. The following bullets summarize the specific Panel recommendations with respect to additional features to enhance software convenience.

• To encourage examination of the sensitivity of user input on module output, the Panel recommends that the user have the ability to execute the fate modules separately from the physical-chemical parameter prediction models (with the caveat noted previously in Section 2A).
• The CAS number database should be validated in the current version (in particular, discrepancies between CAS numbers and SMILES notation), and regularly updated with new information. There is a discrepancy between the number of chemicals for which SMILES notation exists and the number of chemicals in the TSCA inventory. It would be valuable to document within the SMILES HELP files the reason for the difference in chemical coverage. The documentation on SMILES refers to approximately 20,000 discrete organic chemicals in the original TSCA inventory that are in the SRC database, while the June 2005 U.S. Government Accountability Office (GAO) report references 62,000 organic chemicals in the original inventory (GAO, 2005).
• For the batch mode entry feature, the system should allow user-specified inputs of physical-chemical properties.
• Rather than having the module output written to the directory containing the program for the batch mode entry feature, it would be preferable to give the user the option of naming the output file and identifying the location where it will be saved.
• The output data in batch mode should include CAS numbers for each chemical as well as the names and SMILES notation.
• The Name Lookup feature should be added to each individual module.
• For chemicals that have isomers, the module output should explicitly state that fact. The output display should include identification of the other isomers that exist, by name and CAS number.
• For results displayed in summary format, measured values should be given for several isomers of the chemical assessed, if available.
• For both the summary and full output options, repeating the listing of experimental values in the results screen for a given parameter is confusing and should be avoided (e.g., experimental aqueous solubility in both fragment approach and log Kow approach).

F. Are property estimates expressed in correct/appropriate units?

In general, the Panel found no specific concerns regarding the units used to express the property estimates. However, the Panel has made the following recommendations that should improve the overall utility of module output.

• Output data should be presented in International System of Units (SI) units.
• There should be consistency with the use of significant figures.
• For BCFWIN the units are L/kg (wet weight) for fish and should be included in the output.
• Units should be specified for log Koc.
• The Agency should provide a unit conversion program to allow the user the option to convert from one set of output units to another.

G. Is adequate information on accuracy/validation conveyed to the user by the program documentation and/or the program itself?

In general, the Panel found that the information on module accuracy and validation was conveyed adequately to the user, but not in a consistent and transparent manner. For the sake of clarity, the Panel has addressed accuracy/validation issues pertaining to (Q)SARs (i.e., algorithms) and the actual property estimation outputs separately.

i) Is adequate information on accuracy/validation conveyed with respect to the QSAR predictive model itself?
The Panel found that, while regression statistics are provided in assessing model performance and residual error for most modules, this information is not generally transparent to the user. Moreover, when such information is available, it is uncertain as to which version of the module the reported analysis applies. It should be straightforward to determine a common set of statistical significance measures valid across all modules that would provide common and comparative measures of accuracy and validity. A suggested set of module performance metrics for consideration includes:

V.x.y = model version
N(T) = number of compounds in the training set
N(O) = number of outliers removed in developing the module
R² = standard coefficient of determination
Q² = leave-one-out cross-validation coefficient
SD = standard deviation of fit
R(l) = lower value of the range of the property in the training set
R(u) = upper value of the range of the property in the training set
MRE = mean residual error
SRE = standard deviation of residual error
R(X) = average correlation coefficient for models built from X random values of the dependent variables contained in the training set
R(t)X² = correlation coefficient of an external validation set of X compounds

In addition, for each of the endpoints predicted by QSAR, a brief discussion on the measured error associated with current test protocols would be valuable. Any insights regarding trends in measurement error (e.g., measurement error of log Kow and water solubility tends to increase with increasing log Kow or decreasing water solubility, respectively) should be summarized.

ii) Is adequate information on accuracy/validation conveyed with respect to making the property estimation of a particular test chemical?

The Panel concluded that a major shortcoming of EPI Suite™ is that the user is given no indication as to whether the domain of module applicability is appropriate for the test chemical.
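As a rough illustration of what such an indication could look like, the Python sketch below flags a query chemical whose descriptors fall outside the range spanned by a training set. It is a minimal bounding-box check under stated assumptions: the descriptor names, the three training records, and the query values are all hypothetical, and a per-descriptor range test is only one crude notion of domain, not a method endorsed in this report or implemented in EPI Suite™.

```python
def training_ranges(training_set):
    """Return {descriptor: (min, max)} over the training-set chemicals."""
    names = training_set[0].keys()
    return {
        name: (min(chem[name] for chem in training_set),
               max(chem[name] for chem in training_set))
        for name in names
    }


def domain_warnings(query, ranges):
    """List the descriptors of the query chemical outside the training ranges."""
    warnings = []
    for name, (low, high) in ranges.items():
        value = query[name]
        if not low <= value <= high:
            warnings.append(
                f"{name} = {value} outside training range [{low}, {high}]"
            )
    return warnings


if __name__ == "__main__":
    # Hypothetical training-set descriptors, for illustration only.
    training = [
        {"mol_weight": 100.2, "log_kow": 1.5},
        {"mol_weight": 228.3, "log_kow": 4.1},
        {"mol_weight": 350.0, "log_kow": 6.5},
    ]
    ranges = training_ranges(training)
    inside = {"mol_weight": 170.3, "log_kow": 6.1}
    outside = {"mol_weight": 959.2, "log_kow": 8.0}
    print(domain_warnings(inside, ranges))
    print(domain_warnings(outside, ranges))
```

An empty list means every descriptor lies within the training range; it does not by itself establish that the estimate is reliable, since a chemical can sit inside each range and still be structurally unlike anything in the training set.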
Currently, the decision to use a module for a specific chemical appears to be based on past experience and/or professional judgment. This approach is not transparent and could lead to inconsistency and error in assessments among chemicals. How the Agency uses (or does not use) or interprets EPI Suite™ results in making decisions is an important consideration in determining if the software provides the degree of accuracy that supports its intended use.

Upgrades to EPI Suite™ provide the Agency with an opportunity to better understand the accuracy of the software's estimates. For example, the Agency could compare earlier estimates to any new measured values that have since been published for the chemical of interest. After an EPI Suite™ PER or EFM has been upgraded, the new estimate can be compared with the earlier estimate for the same chemical. Assuming the more recent upgrade is producing more accurate estimates in general, results from the new predictions can indicate the degree of over- or under-prediction for the parameter of concern in the original assessment. Comparing either new experimental or estimated data to decision-making criteria can be used to assess the performance of EPI Suite™ in supporting regulatory decision-making.

The Panel also endorses an independent "model domain" analysis to improve accuracy and reliability estimates of chemical property values. In this analysis, the degree of molecular similarity of the test chemical to that of the chemicals used in the module training set establishes the reliability of a property estimate. The Panel did not have the necessary expertise to provide specific advice on preferred domain analysis methods but encourages the Agency to seek experts for technical guidance so that this functionality can be included in future EPI Suite™ upgrades.

3. Appropriate Use

A. Currently Identified Uses

i. Is the science incorporated into EPI Suite™ adequate for each of these current uses?
All of the modules in EPI Suite™ are generally accepted for use in risk-based priority setting, screening level risk assessment and prioritization for chemical testing, for the chemical classes to which the modules apply. Given the large number of chemicals that the Agency must screen in a short period of time, reliance on (Q)SAR module output is justified. The modules are expected to provide order of magnitude estimates, an accuracy standard that is generally accepted by the Agency for screening level assessments. This level of accuracy should be clearly conveyed to users outside the Agency. The Agency should continue to validate, update, and investigate the uncertainty associated with the modules in various regulatory programs. A more extensive analysis and explanation of the limitations of the PERs and EFMs would help clarify appropriate use.

ii. If not, what improvements are needed to make EPI Suite™ adequate and what alternative approach could be used in the interim?

There are specific uses of EPI Suite™ that are not entirely appropriate for supporting the PMN and pollution prevention (P2) programs. At present, the chemical domains that are used by (Q)SARs do not provide adequate coverage of nanoparticles, inorganic compounds, organo-metallic and some polymeric chemicals (as well as other classes of chemicals). Application of (Q)SARs to chemicals outside the domain of the training set is likely to result in unreliable estimates. The Panel recommends that the Agency collect more peer-reviewed measurement data on the physical and chemical properties of these chemicals with the intent of either expanding the domain of the existing (Q)SARs or creating new (Q)SARs specifically for these classes of chemicals.

B. Potential Additional Uses

Given the Agency's global leadership in the field of chemical screening, it should come as no surprise that emerging industrial economies in Asia, South America, Eastern Europe, and Africa are adopting EPI Suite™ in their regulatory programs as well. EPI Suite™, if translated into major foreign languages (e.g., Arabic, Spanish, Portuguese, French, Russian, Standard Chinese and Mandarin, Bahasa Indonesia, and Hindi), is a practical and scientifically-credible risk management technology transfer that will allow countries with emerging industries to establish sustainable chemicals management systems. The United Nations (UN) Strategic Approach to International Chemicals Management (SAICM) project represents an ideal forum in which the benefits of EPI Suite™ application can be shared with the international regulatory community.

In addition to the direct uses of EPI Suite™ by the Agency, the following additional potential uses have been identified.

• EPA and other Federal Agencies: EPI Suite™ is clearly seen as an important tool in any regulatory program that evaluates chemicals for public health and environmental safety. Agency programs that benefit from EPI Suite™ include: a) EPA Office of Pesticide Programs (OPP), b) EPA Office of Water (OW), c) EPA Office of Solid Waste and Emergency Response (OSWER), and d) the US Food and Drug Administration.
• Private Industry: Industrial applications where EPI Suite™ software can be valuable include the development of more environmentally friendly products or "green" engineering processes. EPI Suite™ can be used to support the issuance of chemical exposure-based waivers that reduce the use of animal testing under programs such as TSCA and HPV Challenges world-wide. EPI Suite™ output can inform and guide environmental exposure monitoring programs.
• International Regulatory and other Programs: EPI Suite™ output can be used to support hazard classification when experimental data are not available. EPI Suite™ output can be used as part of the process to conduct Persistent Bioaccumulative and Toxic (PBT) identification/categorization. EPI Suite™ output can be used to support chemical assessment and management programs, especially for High Production Volume (HPV) chemicals. EPI Suite™ output can be used to support global initiatives such as the Stockholm Convention to control the long-range transport of Persistent Organic Pollutants (POPs), other assessments of the potential for long-range transport of chemicals, and greenhouse gas assessments. EPI Suite™ may play a significant role in the OECD (Q)SAR ToolBox.

REFERENCES FOR BOTH REPORT AND APPENDICES

Altschuh, J., R. Bruggemann, H. Santl, G. Eichinger and O.G. Piringer. Henry's law constants for a diverse set of organic chemicals: experimental determination and comparison of estimation methods. Chemosphere, Volume 39, Number 11, pp. 1871-1887, November 1999.

Arnot, Jon A. and Frank A.P.C. Gobas. A Food Web Bioaccumulation Model for Organic Chemicals in Aquatic Ecosystems. Environmental Toxicology and Chemistry, Vol. 23, No. 10, pp. 2343-2355, 2004.

Beyer A, Wania F, Gouin T, Mackay D, Matthies M. 2002. Selecting internally consistent physicochemical properties of organic compounds. Environ. Toxicol. Chem. 21:941-953.

Cambridge Soft, Inc. 100 Cambridge Park Drive, Cambridge, MA 02140 USA; info@cambridgesoft.com; TEL 1 (617) 588-9100; FAX 1 (617) 588-9190.

Cowan, C E, D Mackay, TCJ Feijtel, D van de Meent, A Di Guardo, J Davies, and N Mackay. 1995. "The multi-media fate model: a vital tool for predicting the fate of chemicals". SETAC Press, Pensacola, FL, USA.

DiToro, DM. 2005. "Sediment flux modeling". John Wiley, NY, USA.
Elsevier MDL, 2440 Camino Ramon, Suite 300; San Ramon, CA 94583; http://www.mdl.com/

Frame, G.M., Cochran, J.W., and Boewadt, S.S. 1996a. "Complete PCB congener distributions for 17 Aroclor mixtures determined by 3 HRGC systems optimized for comprehensive, quantitative, congener-specific analysis," HRC-J. High Resolut. Chromatogr., 19(12):657-668.

Frame, G., Wagner, R., Carnahan, J., Brown, J., May, R., Smullen, L., and Bedard, D. 1996b. "Comprehensive, quantitative, congener-specific analyses of eight Aroclors and complete PCB congener assignments on DB-1 capillary GC columns," Chemosphere, 33(4):603-623.

Goss, K.-U.; Buschmann, J.; Schwarzenbach, R. P. Determination of the Surface Sorption Properties of Talc, Different Salts, and Clay Minerals at Various Relative Humidities Using Adsorption Data of a Diverse Set of Organic Vapors. Environ. Toxicol. Chem. 2003, 22, 2667-2672.

Government Accountability Office (GAO), 2005, Chemical Regulation: Options Exist to Improve EPA's Ability to Assess Health Risks and Manage Its Chemical Review Program, GAO-05-458, June 2005.

Green, Nicholas and Ake Bergman. Chemical Reactivity as a Tool for Estimating Persistence: A proposed experimental approach for measuring this key environmental factor. Environmental Toxicology and Chemistry, Vol. 39, Iss. 23, pp. 480A-486A, 2005.

Hilal, et al. QSAR Comb. Sci. 2003, 22, pp. 565-573.

Hilal, et al. QSAR Comb. Sci. 2003, 23, pp. 709-720.

Hilal, personal communication, 2005; Long Chained Aliphatic Alcohols SIAR, 2006.

Leo, A.J. 1992. 30 years of calculating Log Poct. QSAR Meeting, Duluth, MN, July 23, 1992.

MacLeod M, Fraser AJ, Mackay D. 2002. Evaluating and expressing the propagation of uncertainty in chemical fate and bioaccumulation models. Environ. Toxicol. Chem. 21:700-709.

Meylan, WM; Howard, PH. Bond Contribution Method for Estimating Henry's Law Constants. Environmental Toxicology and Chemistry ETOCDK, Vol. 10, No. 10, pp. 1283-1293, October 1991.
Meylan, W.M., Howard, P.H., Boethling, R.S., Aronson, D., Printup, H., Gouchie, S., 1999, Improved Method for Estimating Bioconcentration/Bioaccumulation Factor from Octanol-Water Partition Coefficient, Environ. Toxicol. Chem., 18(4):664-672.

Nguyen, T. H.; Goss, K.-U.; Ball, W. P. Polyparameter linear free energy relationships for estimating the equilibrium partition of organic compounds between water and the natural organic matter in soils and sediments. Environ. Sci. Technol. 2005, 39, 913-924.

Nikolova-Jeliazkova, N. and Jaworska, J. (2005). An Approach to Determining Applicability Domains for QSAR Group Contribution Models: An Analysis of SRC KOWWIN. ATLA 33, 461-470.

OECD, Principles for the Validation, for Regulatory Purposes, of (Quantitative) Structure-Activity Relationship Models, November 2004. http://www.oecd.org/document/23/0.2340.en_2649 34365 33957015 I 1_1 I.OO.html

Peijnenburg, Pure & Appl. Chem. 1994, Vol. 66, No. 9, 1931-1941.

Schenker U, MacLeod M, Scheringer M, Hungerbuhler K. 2005. Improving Data Quality for Environmental Fate Models: A Least-Squares Adjustment Procedure for Harmonizing Physicochemical Properties of Organic Compounds. Env. Sci. Technol. 39:8434-8441.

Staikova M, Wania F, Donaldson DJ. 2004. Molecular polarizability as a single-parameter predictor of vapour pressures and octanol-air partition coefficients of non-polar compounds: a priori approach and results. Atmos. Environ. 38:213-225.

Thibodeaux, LJ. 1996. "Environmental Chemodynamics". John Wiley, NY, USA.

Trapp, S and M Matthies. 1998. "Chemodynamics and environmental modeling-an introduction". Springer-Verlag, Berlin, Germany.

USEPA. 2002. Information Quality Guidelines. Office of Environmental Information. (EPA/260R-02-008) Washington DC.

USEPA. 2003. Draft Guidance on the Development, Evaluation, and Application of Regulatory Environmental Models. The Council for Regulatory Environmental Modeling. November 2003.
GLOSSARY

AOPWIN | Atmospheric Oxidation Estimation Program
BAF | Bioaccumulation factor
BCF | Bioconcentration factor
BCFWIN | Bioconcentration factor estimation program
BIOWIN | Biodegradation factor estimation program
CAS Number | Chemical Abstracts Service Registry Number
Chemdraw™ | Chemical Drawing Program - CambridgeSoft Corporation
DERMWIN | Dermal Permeability Coefficient Program
DOS | Disk operating system
DSL | Domestic Substances List - Environment Canada
ECOSAR | Ecological Structure Activity Relationship Program
EFM | Environmental Fate Models
EPI | Estimation Program Interface
GAO | Government Accountability Office
GHS | Globally Harmonized System for Classification of Chemicals
GUI | Graphic User Interface
Hc | Henry's Law Constant
HENRYWIN | Henry's Law Constant Estimation Program
HPV | High Production Volume Chemicals
HYDROWIN | Hydrolysis Factor Estimation Program
IUPAC | International Union of Pure and Applied Chemistry
KAW | Air-water Partitioning Coefficient
KOA | Octanol-Air Partitioning Coefficient
KOC | Organic Carbon Partitioning Coefficient
KOW | Octanol-Water Partitioning Coefficient
KOWWIN | Octanol-Water Partitioning Coefficient Estimation Program
LEVEL3NT | Level 3 Fugacity Estimation Program
MCI | Molecular Connectivity Indices
MDL | Elsevier Molecular Design Limited (MDL) Information Systems
MTC | Mass Transfer Coefficient
MPBPWIN | Melting Point-Boiling Point Chemical Estimation Program
NAPL | Non-aqueous Phase Liquid
OECD | Organization for Economic Co-operation and Development
OPP | Office of Pesticide Programs
OPPT | Office of Pollution Prevention and Toxics
OPPTS | Office of Prevention, Pesticides and Toxic Substances
OSWER | Office of Solid Waste and Emergency Response
PCKOCWIN | Organic Carbon Partitioning Coefficient Estimation Program
PER | Property Estimation Routine
pKa | Negative Log of a Chemical's Dissociation Constant
PMN | Premanufacture Notice
POP | Persistent Organic Pollutants
PP-LFER | Polyparameter Linear Free Energy Relationships
QSAR | Quantitative Structure Activity Relationship
QSPR | Quantitative Structure Property Relationship
REACH | European Union's Registration, Evaluation and Authorisation of Chemicals Policy
SAICM | United Nations (UN) Strategic Approach to International Chemicals Management
SPARC | SPARC Performs Automated Reasoning in Chemistry - http://ibmlc2.chem.uga.edu/sparc/
SMILES | Simplified Molecular Input Line Entry System
SPC | Structure Property Correlation
STPWIN | Sewage Treatment Plant Chemical Fate Estimation Program
TSCA | Toxic Substances Control Act
UVCB | Unknown or Variable Composition, Complex Reaction Products and Biological Materials [per HS]
WATERNT | Organic Compound Water Solubility Program
WSKOWIN | Water Solubility Estimation Program
WVOLWIN | Volatilization Rate from Water Estimation Program
VOC | Volatile Organic Compound

Summary Assessment of EPI Suite™ Core Models

Model | Assessment
(Models assessed, in order: AOPWIN, BCFWIN, BIOWIN, HYDROWIN, KOWWIN, MPBPVP, HENRYWIN, PCKOCWIN, WATERNT, WSKOWIN, WVOLWIN, LEVEL3NT)

AOPWIN | Atmospheric oxidation/ozone reaction rates are predicted by AOPWIN using the Atkinson fragment and functional approach method. It is the generally accepted approach for estimating these properties. It has been validated on a relatively small dataset of 77-79 chemicals. EPA should consider more validations for this method. R2 = 0.93.

BCFWIN | BCFWIN is generally accepted as the best fit to existing bioconcentration data. BCFWIN does not appear to have been externally validated, or the information is not available in the user guides. If these models have been externally validated in the literature by various investigators, EPA should include this data in the user's manuals. No R2.
BIOWIN | The (Q)SPR estimation of biodegradation has inherent problems, one of which is the lack of reproducibility of measured biodegradation data. The BIOWIN model is reasonably well accepted and generally performs as well as or better than the available models. EPA should summarize all available validation data for BIOWIN in the user's manual so that this information is readily available. Also, EPA should consider giving more advice on which of the 3 BIOWIN model approaches is most appropriate in a given situation. R2 = 0.5-0.97.

HYDROWIN | Hydrolysis rates for a specific set of functional groups are predicted by HYDROWIN and are a generally accepted approach. HYDROWIN does not appear to have been externally validated, or the information is not available in the user guides. If these models have been externally validated in the literature by various investigators, EPA should include this data in the user's manuals. No R2.

KOWWIN | The KOWWIN model is well accepted, uses an accepted fragment-based technique and is an important (Q)SPR for regulatory use. It generally performs better than most existing (Q)SPR Kow prediction methods. The external validation data for this method is good and the summary information is available to the user. R2 = 0.94.

MPBPVP | The MPBPVP (Q)SPR is accepted as a good estimator of BP, MP and VP.
The melting point (Q)SPR is the weakest of this group because the external validation coefficient of determination was reported as 0.66. The standard deviation of 63 K is also indicative of some prediction error. It is not likely that a significantly more accurate melting point determination is necessary for EPA regulatory programs and this method should be satisfactory for most regulatory uses. R2 = 0.92-0.95.

HENRYWIN | Uses two different methods and produces two different estimates (bond and group contribution) for the air-to-water partition coefficient. The models are generally accepted, with R2 = 0.94-0.96.

PCKOCWIN | The model is accepted as a good estimation tool for soil sorption coefficients (Koc), based on the first order molecular connectivity index (MCI). It is satisfactory for most regulatory uses. R2 = 0.86-0.96.

WATERNT | WATERNT uses the atom fragment contribution (AFC) method to predict water solubility, building upon the KOWWIN methods. Water solubility of organic compounds at 25°C is predicted. R2 = 0.87-0.98.

WSKOWIN | WSKOWIN is a good model for prediction of water solubility. It has been validated with a large dataset with a high coefficient of determination, R2 = 0.9.

WVOLWIN | Estimates volatilization half-lives from a model river and lake. The program's default parameters for a model river will yield a half-life that is indicative of the fastest volatilization that may be expected in environmental waters (a shallow, rapidly moving river with strong surface wind). The default parameters for the lake yield a much slower rate. The EPI interface program executes the WVOLNT (Volatilization Rate from Water) program by transferring the Molecular Weight, the Henry's Law Constant, and various volatilization parameters to WVOLNT. No R2.

LEVEL3NT | Half-lives are required for air, soil, sediment and water; the fugacity model cannot run without them. If the half-lives in air, water, soil and sediment are known, the "Use Half-Lives Entered Below" option should be selected and the known values should be entered in the appropriate fields.
Often, however, these data are not available and require estimation. The BIOWIN and AOPWIN programs are used to make these estimates. The AOPWIN air estimate is based upon estimated hydroxyl radical and ozone rate constants. AOPWIN does have an experimental database containing more than 700 compounds. If an entered structure has a database match, the database value is used instead of the program estimate. The half-life for degradation of a chemical in water, soil, and sediment is determined using the ultimate biodegradation expert survey model of the BIOWIN estimation program. This estimation program provides an indication of a chemical's environmental biodegradation rate in relative terms such as hours, hours to days, days, days to weeks, and so on; the terms represent the approximate amount of time needed for degradation to be "complete". This output cannot be used directly by the Level III multimedia mass balance model. The mean value within the estimated time range returned by Biowin3 is converted to a half-life using a set of conversion factors. These conversion factors consider that 6 half-lives constitute "complete" degradation of a chemical substance, assuming first-order kinetics. The resulting conversion factors for water are provided below.

The Fugacity Model cannot run without a vapor pressure. If the vapor pressure is not user-entered, the model uses the vapor pressure estimate by the MPBPWIN Program. If the MPBPWIN Program estimates a vapor pressure of zero (which can occur if an estimate is less than 1.00e-40 mm Hg), the fugacity model uses an assumed value of 1.00e-15 mm Hg (this value is low enough to have no sensitivity effect in the fugacity estimates).
The model also requires a log Kow value. If the log Kow is not user-entered, the model uses the value from the KOWWIN Program (an experimental database value is used if available instead of the estimate). The Fugacity model in EPIWIN has limited user-access to many parameters in the Mackay Level III Model. For example, parameters such as rain rate, aerosol deposition, soil water runoff, and diffusion mass transfer coefficients cannot be changed by the EPIWIN user. For these parameters, EPIWIN relies solely upon the default values as determined by Mackay and co-workers. This greatly simplifies application of a Level III model for most users. No R2.

STPWIN | The STPWIN program is a version of the Toronto Model originally developed by Donald Mackay and colleagues at the University of Toronto. Includes outputs on: Bio P: the biodegradation half-life (in units of hours) in the primary clarifier of a sewage treatment plant (STP); Bio A: the biodegradation half-life (in units of hours) in the aeration vessel of an STP; Bio S: the biodegradation half-life (in units of hours) in the final settling tank of an STP. All STP parameters are now accessed from the main menu bar by selecting "STP". The STP program uses only default operating conditions of a model sewage treatment plant operating at 25 degrees C. No R2.

APPENDIX for 1Ai

The upgrades to EPI Suite™ could include a module containing algorithms for estimating the mass-transfer coefficients (MTCs) used in the EFM Category as well as allowing for user-entered values. A recent study comparing the outputs of five multimedia fate models demonstrated that model homogenization was possible only when the numerical values of the dozen or so MTCs were numerically equal (Cowan, et al., 1995). Otherwise, the computed concentration levels, mass fractions in the compartments and the chemical residence time estimates were dramatically different, many by orders of magnitude.
Typically the numerical values of these MTCs vary by a factor of ten at a particular environmental interface and sometimes much more (Thibodeaux, 1996). The LEV3EPI module, for example, contains twelve default MTC values; these were likely chosen by the model developers and are embedded within the code. In addition to having chemical species and physical property dependence, the MTCs are also functions of parameters that characterize the sizes, fluid dynamics, etc., of the environmental compartments. In the future, as EFMs develop in sophistication, the users will need the option of having algorithms for estimating MTCs, including those that are most representative of the environmental compartments into which the chemicals are entering.

It is possible and appropriate for EPA, with only a modest expenditure of resources, to develop estimating algorithms for these MTCs. A sizable quantity of data and accompanying theoretical models exist in diverse types of published literature. In general the tasks required in the algorithm development efforts will include the collection and evaluation of the existing data followed by producing the appropriate theory-directed statistical correlations needed for their estimation. These final algorithms should be similar to those in the PER Category of the EPI Suite™. Some limited compilations of these MTC algorithms are available in textbooks and other documents (Thibodeaux, 1996; DiToro, 2005; Trapp and Matthies, 1998). Many are embedded within existing Agency software, EXAMS for example. However, there is no single location for accessing such parameters for direct use by the Agency or others. By having such an EPI Suite™ module (e.g., MTCWIN), a major input parameter for the LEV3EPI could be definitively selected by the user, thereby eliminating one level of uncertainty that presently exists by relying on unknown embedded default values.
APPENDIX for 1Bi

The following descriptions are edited versions of the accuracy statements in the EPI Suite™ HELP Files.

Estimation Accuracy of WATERNT: The statistical accuracy of the current 1000 compound training set is excellent; the correlation coefficient (R2) is 0.975, the standard deviation is 0.336 and the absolute mean error is 0.28. However, to be effective, an estimation method must be capable of making accurate predictions for chemicals not included in the training set. Currently, WATERNT has been tested on a validation dataset of 3,923 compounds. The validation set includes a diverse selection of chemical structures that rigorously test the predictive accuracy of any model. It contains many chemicals that are similar in structure to chemicals in the training set, but also many chemicals that are different from and structurally more complex than chemicals in the training set. Statistical performance for estimated vs. experimental log WatSol (moles/L) are: n = 3923; R2 = 0.86; sd = 0.869; me = 0.70.

Accuracy of AOPWIN: The accuracy of the estimation methods used by the Atmospheric Oxidation Program can be examined by comparing a list of more than 640 experimentally determined hydroxyl radical rate constants to the program's estimated rate constants. Over 90 percent of the estimated rate constants for the 647 different chemicals are within a factor of two of the experimental value. Over 95 percent of the estimates are within a factor of three of experimental. This can be compared to the PCFAP program (Fate of Atmospheric Pollutants) of the USEPA GEMS software, which estimates the same rate constants as AOPWIN. For 617 compounds (PCFAP cannot estimate or produces program errors for the remaining experimental values), PCFAP is within a factor of two for about 49 percent of the experimental values and within a factor of 3 for about 65 percent. PCFAP is particularly inaccurate for many compounds containing nitrogen, sulfur or phosphorus.
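The "within a factor of two" and "within a factor of three" figures quoted above can be reproduced for any paired list of estimated and experimental rate constants. The sketch below shows one way to compute such a fraction; the function name and the five data pairs are hypothetical illustrations and are not taken from the AOPWIN compilation.

```python
def fraction_within_factor(estimated, experimental, factor):
    """Fraction of pairs with estimate/experiment inside [1/factor, factor].

    Assumes all values are positive, as rate constants are.
    """
    if len(estimated) != len(experimental) or not estimated:
        raise ValueError("need two non-empty sequences of equal length")
    hits = 0
    for est, exp in zip(estimated, experimental):
        ratio = est / exp
        if 1.0 / factor <= ratio <= factor:
            hits += 1
    return hits / len(estimated)


# Hypothetical rate constants (x 10^-12 cm3/molecule-sec), illustration only.
experimental = [1.0, 5.0, 10.0, 20.0, 50.0]
estimated = [1.5, 4.0, 25.0, 22.0, 160.0]

for k in (2.0, 3.0):
    share = fraction_within_factor(estimated, experimental, k)
    print(f"within a factor of {k:g}: {share:.0%}")
```

Because the test is applied to the ratio, over- and under-estimates are treated symmetrically, which matches the usual reading of "within a factor of two".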
The document "Estimation Accuracy of the Atmospheric Oxidation Program" contains a compilation of the experimental rate constants used to determine the accuracy of AOPWIN and PCFAP. Each chemical in the compilation includes the experimental rate constant, the AOPWIN estimate, the PCFAP estimate, and the SMILES notation for that chemical.

For Aromatic Compounds, one of the advantages of the SMILES interpreter used by AOPWIN is the ability to identify individual aromatic rings and ring structures. This allows the overall rate constant estimation of many aromatic compounds to begin with an experimentally measured value for the basic ring structure. For example, if 1-methylnaphthalene is entered into AOPWIN, AOPWIN finds the naphthalene ring and assigns it the experimentally measured value for naphthalene (21.6 x 10^-12 cm3/molecule-sec). It then adjusts the experimental naphthalene value for one methyl group attachment to an aromatic ring to yield an overall estimate of 56.9 x 10^-12 cm3/molecule-sec (the experimental value for 1-methylnaphthalene is 53.0 x 10^-12). AOPWIN identifies and uses the aromatic rings (15) that have experimental values (x 10^-12 cm3/molecule-sec) and 7 rings are assigned a value based primarily upon experimentally measured ionization potentials (x 10^-12 cm3/molecule-sec):

Accuracy of BIOWIN: BIOWIN produces two separate MITI probability estimates for each chemical. The first estimate is based upon the fragments derived through linear regression. The second estimate is based upon the fragments derived through non-linear regression. Prediction accuracy of the training and validation sets are listed below. The validation set is completely independent of the training set. Chemicals in the validation set were not used to derive any fragment values.
The numbers correspond to correct predictions (either "readily degradable" or "not readily degradable"):

Training Set: Critically Evaluated as "Readily Degradable"
Linear Model: 201/254 (79.1%)
Non-Linear Model: 204/254 (80.3%)

Training Set: Critically Evaluated as "Not Readily Degradable"
Linear Model: 284/335 (84.8%)
Non-Linear Model: 284/335 (84.8%)

Training Set: TOTAL
Linear Model: 485/589 (82.3%)
Non-Linear Model: 488/589 (82.9%)

Validation Set: Critically Evaluated as "Readily Degradable"
Linear Model: 105/131 (80.2%)
Non-Linear Model: 103/131 (78.6%)

Validation Set: Critically Evaluated as "Not Readily Degradable"
Linear Model: 135/164 (82.3%)
Non-Linear Model: 135/164 (82.3%)

Validation Set: TOTAL
Linear Model: 240/295 (81.3%)
Non-Linear Model: 238/295 (80.7%)

Accuracy of HENRYWIN: The accuracy of the bond contribution method is discussed in detail in Meylan and Howard (1991). Briefly, a correlation coefficient (R2) of 0.97, a standard deviation (sd) of 0.34 and a mean error (me) of 0.21 were found for a 345 compound training set (all statistics apply to LWAPC values). A 74 compound validation dataset had respective R2, sd and me statistics of 0.96, 0.46 and 0.31. SRC's current experimental database contains 1650 compounds. Since publication of the Meylan and Howard (1991) article, the methodology was updated (HENRYWIN version 2) by adding new bond contribution values and new correction factors, especially for various classes of pesticides.

At times, the bond estimate and the group estimate made by HENRYWIN may vary significantly. Experience with HENRYWIN has shown that the difference between bond and group methods can vary by as much as 2 orders of magnitude for some compounds with many functional groups. The estimation from the group method is sometimes preferred unless the bond method uses a correction factor from Table D-3 (Appendix D) or Appendix F.
A recent independent evaluation (Altschuh et al., 1999) for a diverse set of organic chemicals found the bond method more accurate than the group method. The group method generates inaccurate estimates for certain types of structures, such as hexachlorocyclohexanes (Altschuh et al., 1999). At times, averaging two widely divergent values is appropriate.

For some compounds, both methods can yield a Henry's Law constant of 1.0 x 10^-12 atm-m3/mole or smaller. Numbers smaller than this value may be unrealistically low. However, any organic compound with a Henry's Law constant less than 3.0 x 10^-7 is considered essentially non-volatile from water (Thomas, 1990). The Exposure Evaluation Branch of the U.S. EPA (OPPT) uses a cut-off of 1.0 x 10^-8 atm-m3/mole for HLC estimates; any estimate less than the cut-off is considered 1.0 x 10^-18 atm-m3/mole.

Estimation Accuracy of KOWWIN: The figures in this Help file (not shown) illustrate KOWWIN's ability to estimate accurate log P values. The listing compares the accuracy of KOWWIN to the ClogP Program (Daylight, 1995; BioByte, 1995) statistics using SRC's Experimental Log P Database (n = number of compounds; R2 = correlation coefficient; sd = standard deviation; me = absolute mean error):

KOWWIN v1.63
Total: n=12805; R2=0.95; sd=0.435; me=0.316
Training: n=2474; R2=0.981; sd=0.22; me=0.16
Validation: n=10331; R2=0.94; sd=0.47; me=0.35

CLOGP for Windows (v1.0)
Total: n=11735 (a); R2=0.91; sd=0.59; me=0.384

CLOGP (UNIX version as reported by Leo, 1992)
Total: n=7250; R2=0.96; sd=0.3 (using equation: Log P = 0.914 CLOGP + 0.184) (b)

(a) Taken from the current database; the difference between the entire database (12686) and the number used (11616) is primarily due to "missing fragments" in the CLOGP program. BioByte's Internet website reports the following statistics for its starlist: n=8942, R2=0.917, sd=0.482 using the equation: Log P = 0.876 CLOGP + 0.307.
* These statistics were determined after removing large systemic deviant compounds and other large deviant structures where the underlying difficulty is conformational (Leo, A.J. 1992. 30 years of calculating Log Poct. QSAR Meeting. July 23, 1992).

APPENDIX for 1Ci

The primary regulatory obligation in a tiered approach to risk assessment is conservatism of the prediction at the lowest tier (model based screening), not accuracy; tolerance towards false negatives differs between countries. The OECD HPV group is currently running an assessment of member countries' appreciation and application of (Q)SAR estimates. The results of this effort would be useful to the EPA in reviewing the EPI Suite™. The EPA should consider all the listed criteria in the appendix when upgrades to the models are made.

Case study I: Water Solubility. Estimation of long chained aliphatic alcohols water solubility, comparative analysis between EPI Suite™ (WSKOWWIN), SPARC, and measured values

In relation to an HPV submission, a comparison of water solubility estimations for aliphatic alcohols found that, for shorter-chain alcohols (C6-C10), the modeled and measured values were comparable. For mid-chain (C10-C14) alcohols, the EPI Suite™ model moderately overestimated the water solubility. For the longer-chain alcohols (C14-C18), EPI Suite™ overestimated water solubility by approximately one log unit, which could have an impact on the need for further toxicity assessment. This case illustrates that empirical regression driven models are more susceptible to error when screening complex compounds with few empirical data, or data of questionable quality close to the limit of solubility, than thermal and quantum energy driven models such as SPARC, which are less dependent on measured values (Hilal, et al., 2003 a and b); see Figure in Appendix.
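An over- or underestimate expressed in "log units", as in the case study above, is simply the base-10 logarithm of the ratio of the estimated to the measured value. The sketch below illustrates that arithmetic; the solubility values and chain labels are hypothetical and are not data from the HPV submission.

```python
import math


def log_unit_error(estimated, measured):
    """Signed error in log10 units; positive means the estimate is too high."""
    if estimated <= 0 or measured <= 0:
        raise ValueError("solubilities must be positive")
    return math.log10(estimated / measured)


# Hypothetical (estimated, measured) water solubilities in mg/L.
pairs = {
    "short chain": (500.0, 500.0),
    "mid chain": (8.0, 4.0),
    "long chain": (1.0, 0.1),
}

for label, (est, meas) in pairs.items():
    print(f"{label}: {log_unit_error(est, meas):+.2f} log units")
```

An error of +1 log unit corresponds to a tenfold overestimate, which is why a shift of that size can change whether a screening threshold is crossed.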
[Figure: Long Alcohols Water Solubility. Water solubility (logarithmic axis) versus chain length (C6 to C18) for EPIWIN, SPARC, and experimental (mean +/- SD) values.]

Case study II: Hydrolysis. Comparative analysis between EPI Suite™ (HYDROWIN), SPARC, and measured values

For hydrolysis, SPARC, at this time, calculates only carboxylic acid ester hydrolysis rate constants, in any single or mixed solvent at any temperature. EPI Suite™ calculates hydrolysis rate constants for esters, carbamates, epoxides, halomethanes, and alkyl chlorides, and only in water. The SPARC residual mean squares deviation error of the calculated versus observed values in water is better than 0.37, with R2 equal to 0.98, while the EPI Suite™ R2 for 124 ester compounds is 0.965 (see appendix for list of compounds). Below are graphs comparing the SPARC and EPI Suite™ calculations for carboxylic acid ester hydrolysis rates. The end result for the 61 compounds is that SPARC does slightly better in the mean unsigned error and also has a lower frequency of potentially significant outliers than EPI Suite™ does; see graphs (Hilal, personal communication, 2005; Long Chained Aliphatic Alcohols SIAR, 2006).

[Figure: Hydrolysis. Log K versus compound number.]

[Figure: Error (experimental +/- predicted) versus compound number.]

Biodegradation (BIOWIN) and other fate models are more complicated than most of the other algorithm-driven (Q)SARs under EPI Suite™. Modeling of biodegradation has been challenged by IUPAC (Peijnenburg, 1994) because of the lack of data and the variability in soils (e.g., microorganism communities).

For the Kow (octanol/water) partition coefficient, EPI Suite™ estimations are slightly better than SPARC, especially when log Kow < 7 (which is borderline for any model and experimental test). At higher Kow, SPARC calculations are better than EPI Suite™.
At higher Kow values, EPI Suite™ is bound to measurements that were later shown (through the slow-stir method) to be inaccurate. SPARC models Kow as a ratio of activity coefficient calculations. Originally, comparisons of SPARC with EPI Suite™ calculations indicated that SPARC values were too high. After slow-stir measurements became available, the SPARC-calculated Kow values were unchanged, but the experimental values had changed, so the SPARC values were now in better agreement with experiment.

For boiling point, solubility and Henry's constant, SPARC performed well, and out-performed EPI Suite™ (R2 = 0.999) (Hilal et al., 2003a and b). Total mean across all EPI Suite™ R2 values: best case (using the highest R2) = 0.95 ± 0.03 SD, and worst case (lowest R2) = 0.86 ± 0.15 SD.

Table I1. Compounds in graphs, in numerical order: SMILES (1-62)

O=C(OCC=C)C
c(ccc1COC(=O)C)cc1
O=C(OC(C)C=C)C
O=C(OC(C)C#C)C
O=C(OC=C)C
O=C(OC(C)(CC)C=C)C
O=C(OCC)CSC
O=C(OCC)CS(=O)C
O=C(OCC)CS(=O)(=O)C
O=C(OC)C
O=C(OC)C=CC
O=C(OCC)
O=C(OCC)CCC
O=C(OCC)C=CC
O=C(OC)C(Cl)Cl
O=C(OC)C=C(C)C
O=C(OC)C(C)=C
O=C(OC)C
O=C(OCC)C=CC
O=C(OC)
O=C(OC)C=CC
O=C(OCC)C(Cl)
O=C(OCC)C(Cl)Cl
O=C(OCC)C(F)F
O=C(OCC)C=C
O=C(OCC)C#CC
O=C(OCC)C#C
c12c(C(=O)OCC)cccc1cccc2
c12cc(C(=O)OCC)ccc1cccc2
O=C(OCC)c(cccc1)c1
O=C(OCC)C=CC(=O)OCC
o=c(occ)c=cc(=o)occ
O=C(OC(C)C(C))C=CC
O=C(OC(C)C)C=CC
c12c(C(=O)OC(C)C)cccc1cccc2
c12cc(C(=O)OC(C)C)ccc1cccc2
O=C(OC(C)C)C
O=C(OC(C)C)
O=C(OC(C)C)c(cccc1)c1
O=C(Oc(cc(N(=O)=O)c1)cc1)C=C
O=C(OC)C=C
c12c(C(=O)OC)cccc1cccc2
c12cc(C(=O)OC)ccc1cccc2
O=C(OCCCC)C=CC
O=C(OCCCC)C=C
O=C(OCCCC)
O=C(OCCC)C=CC
O=C(OCCC)C
O=C(OCCC)
O=C(Oc(ccc(Cl)c1)c1)C=C
O=C(Oc(ccc(C(=O)C)c1)c1)C=C
O=C(Oc(ccc(N(=O)=O)c1)c1)C(Cl)
O=C(Oc(ccc(N(=O)=O)c1)c1)C=C
c12c(C(=O)(Oc(ccc(N(=O)=O)c3)c3))cccc1cccc2
c12c(C(=O)(Oc(ccc(OC)c3)c3))cccc1cccc2
O=C(Oc(cccc1)c1)C(Cl)
O=C(Oc(cccc1)c1)C=C
O=C(Oc(cccc1)c1)C
O=C(OC(C)CC)C=CC
O=C(OC(C)CC)C
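The model comparison in Case study II rests on two summary quantities: the mean unsigned error and the frequency of large outliers. As a minimal sketch of how such a comparison could be computed for a compound set like the one listed above, the Python below uses invented placeholder values for the observed and predicted log rate constants. The function names, the data, and the outlier threshold of 1.0 log unit are all assumptions made for illustration; none of them comes from SPARC, EPI Suite™, or the report.

```python
from statistics import mean


def unsigned_errors(predicted, observed):
    """Absolute differences between paired predicted and observed values."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one compound is required")
    return [abs(p - o) for p, o in zip(predicted, observed)]


def summarize(predicted, observed, outlier_threshold=1.0):
    """Return n, mean unsigned error, and the fraction of large outliers.

    outlier_threshold is a hypothetical cut-off in log units.
    """
    errors = unsigned_errors(predicted, observed)
    n_outliers = sum(1 for e in errors if e > outlier_threshold)
    return {
        "n": len(errors),
        "mean_unsigned_error": mean(errors),
        "outlier_fraction": n_outliers / len(errors),
    }


# Hypothetical log rate constants, for illustration only.
observed = [-1.0, 0.5, 2.0, 3.5]
model_a = [-0.8, 0.9, 1.7, 3.4]   # moderate errors, no large outlier
model_b = [-1.1, 0.6, 2.1, 5.0]   # small errors, one large outlier

for name, predicted in (("model A", model_a), ("model B", model_b)):
    print(name, summarize(predicted, observed))
```

Reporting the outlier fraction alongside the mean unsigned error matters because a model can be close on most compounds and still carry a few large deviations, which is the distinction the comparison of the two error graphs draws.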