                    UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                                    WASHINGTON, D.C. 20460


                                     SEP  12
                                                                      OFFICE OF THE ADMINISTRATOR
                                                                       SCIENCE ADVISORY BOARD
SUBJECT:  Transmittal of Science Advisory Board Report
FROM:     Vanessa T. Vu
            Director, Science Advisory Board Staff Office (1400F)

TO:         Samuel Boltik
            EPA Headquarters Library Repository (3404T)
       This is to advise you that the Science Advisory Board issued a report numbered
EPA-SAB-07-011, "Science Advisory Board (SAB) Review of the Estimation Programs
Interface Suite (EPI Suite™)".

       Two copies of the report are attached. The report is available in electronic format on
the Science Advisory Board's web site at: http://www.epa.gov/sab/fiscal07.htm.

       If you have any questions regarding this report, please contact me directly at
202-343-9999, or via email at vu.vanessa@epa.gov.
Attachments (2)

-------
                    UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                               WASHINGTON D.C. 20460

                                                     OFFICE OF THE ADMINISTRATOR
                                                       SCIENCE ADVISORY BOARD
                               September 7, 2007
EPA-SAB-07-011

Honorable Stephen L. Johnson
Administrator
U.S. Environmental Protection Agency
1200 Pennsylvania Avenue, NW
Washington, DC 20460

       Subject:      Science Advisory Board (SAB) Review of the Estimation
                    Programs Interface Suite (EPI Suite™)

Dear Administrator Johnson:

       The Office of Pollution Prevention and Toxics (OPPT) requested that the Science
Advisory Board (SAB) review the Estimation Programs Interface Suite (EPI Suite™)
software. The Agency uses this software to estimate properties related to a chemical's
environmental transport and fate. This information is used to support regulatory
decisions in the new chemicals program and in other existing chemical assessment
activities.

       The SAB commends EPA for the strategic decision to support the development of
EPI Suite™ and to make it easily and freely available. Governmental and private
organizations within the United States and elsewhere make extensive use of this software
in supporting decisions regarding new and existing chemicals. The widespread use of
this software for a number of different purposes stems, in part, from its successful
utilization and integration of available science in combination with its ease of operation,
transparency, and cost-effectiveness. Because EPI Suite™ is part of the Organization for
Economic Co-operation and Development's Quantitative Structure-Activity Relationship
(Q)SAR toolbox, the software will likely play a significant role in international
regulatory activities. It also supports the efforts of emerging industrial economies to
develop in an environmentally protective and sustainable manner.

       The SAB has carefully evaluated the EPI  Suite™ software. The Panel's
numerous recommendations for improvements in the software's scope, accuracy, and
ease of operations appear in the enclosed report, along with comments on appropriate
current and potential future uses.  Because of its importance in supporting Agency
decisions regarding existing and new chemicals, the Panel would like to draw your

-------
attention to the following three overarching findings involving the software's underlying
science, functionality and uses.

       First, for chemicals similar to those for which modules to estimate chemical
properties were developed, the algorithms that support the calculations are scientifically
defensible and appropriate for Agency regulatory screening applications.  However, for
existing and/or new chemicals whose structures and/or properties are outside the domain
used in module development, scientific uncertainty may limit the  utility of this software.
In such cases, the Agency uses other methodologies to evaluate chemical properties.

       The Panel also has identified a number of broad chemical  categories (e.g.,
polymers, organo-metallics, nanoparticles, etc.) and associated chemical properties for
which the Agency is encouraged to develop modules to estimate chemical properties.
Given their importance in industrial and commercial applications  as well as  their
potential environmental and human health impact, the Panel recommends that (Q)SAR
development for these chemical categories (and associated properties) be established as
an Agency priority.

       Secondly, the Panel noted that significant improvements in software functionality
and ease of operation could be achieved if the graphical user interface were  upgraded
from its current disk operating system (DOS) appearance to a more familiar format, such
as Windows™. By providing a more recognizable user interface, particularly to novice
users, the Agency will help to facilitate broader and more extensive  application of this
software in environmental decision-making.

       Finally, the resources for updating and  improving the software have not been
commensurate with its importance in supporting Agency decisions, nor with the rapidity
with which new and even novel chemicals are being developed for commercial use. In
light of the widespread and multiple uses for this software, the Agency should increase its
investments to expand the range of chemical categories over which the software can
generate valid predictions, and  the number of chemical properties that can be modeled as
new scientific information becomes available.

       Thank you for the opportunity to provide advice on this important suite of
modeling software and to interact with the very dedicated and able OPPT staff. Please
feel free to contact us if you have any questions concerning this review.
                                  Sincerely,
Dr. M. Granger Morgan, Chair
EPA Science Advisory Board
Dr. Michael J. McFarland, Chair
EPI Suite Review Panel
EPA Science Advisory Board

-------
                                   NOTICE
       This report has been written as part of the activities of the EPA Science Advisory
Board, a public advisory group providing extramural scientific information and advice to
the Administrator and other officials of the Environmental Protection Agency. The
Board is structured to provide balanced, expert assessment of scientific matters related to
the problems facing the Agency. This report has not been reviewed for approval by the
Agency and, hence, the contents of this report do not necessarily represent the views and
policies of the  Environmental Protection Agency, nor of other agencies in the Executive
Branch of the Federal government, nor does mention of trade names or commercial
products constitute a recommendation for use. Reports of the EPA Science Advisory
Board are posted on the EPA website at http://www.epa.gov/sab.

-------
                  U.S. Environmental Protection Agency
                         Science Advisory Board
                         EPI Suite Review Panel
CHAIR
Dr. Michael J. McFarland, Utah State University, Logan, UT
OTHER SAB MEMBERS
Dr. David A. Dzombak, Carnegie Mellon University, Pittsburgh, PA
CONSULTANTS
Dr. Deborah Hall Bennett, University of California - Davis, Davis, CA

Dr. Robert L. Chinery, New York State Department of Law, Albany, NY

Dr. Christina E. Cowan-Ellsberry, The Procter & Gamble Company, Cincinnati, OH

Dr. Miriam L. Diamond, University of Toronto, Toronto, Ontario, Canada

Dr. William J. Doucette, Utah State University, Logan, UT

Dr. Anton J. Hopfinger, University of New Mexico

Dr. Michael W. Murray, National Wildlife Federation, Ann Arbor, MI

Dr. Thomas F. Parkerton, ExxonMobil Biomedical Sciences, Annandale, NJ

Dr. Kevin H. Reinert, AMEC Earth and Environmental, Plymouth Meeting, PA

Dr. Daniel T. Salvito, Research Institute for Fragrance Materials, Inc., Woodcliff Lake,
NJ

Dr. Hans Sanderson, Danish National Environmental Research Institute, Roskilde,
Denmark

Dr. Louis J. Thibodeaux, Louisiana State University, Baton Rouge, LA
SCIENCE ADVISORY BOARD STAFF
Ms. Kathleen White, Washington, DC

-------
                 U.S. Environmental Protection Agency
                         Science Advisory Board
CHAIR
Dr. M. Granger Morgan, Carnegie Mellon University, Pittsburgh, PA


SAB MEMBERS
Dr. Gregory Biddinger, ExxonMobil Biomedical Sciences, Inc, Houston, TX

Dr. James Bus, The Dow Chemical Company, Midland, MI

Dr. Deborah Cory-Slechta, University of Rochester, Rochester, NY

Dr. Maureen L. Cropper, University of Maryland, College Park, MD

Dr. Virginia Dale, Oak Ridge National Laboratory, Oak Ridge, TN

Dr. Kenneth Dickson, University of North Texas, Denton, TX

Dr. Baruch Fischhoff, Carnegie Mellon University, Pittsburgh, PA

Dr. James Galloway, University of Virginia, Charlottesville, VA

Dr. Lawrence Goulder, Stanford University, Stanford, CA

Dr. James K. Hammitt, Harvard University, Boston, MA
      Also Member: COUNCIL

Dr. Rogene Henderson, Lovelace Respiratory Research Institute, Albuquerque, NM
      Also Member: CASAC

Dr. James H. Johnson, Howard University, Washington, DC

Dr. Agnes Kane, Brown University, Providence, RI

Dr. Meryl Karol, University of Pittsburgh, Pittsburgh, PA

Dr. Catherine Kling, Iowa State University, Ames, IA

Dr. George Lambert, Robert Wood Johnson Medical School-UMDNJ, Belle Mead, NJ

Dr. Jill Lipoti, New Jersey Department of Environmental Protection, Trenton, NJ

Dr. Michael J. McFarland, Utah State University, Logan, UT

-------
Dr. Judith L. Meyer, University of Georgia, Athens, GA

Dr. Jana Milford, University of Colorado, Boulder, CO

Dr. Rebecca Parkin, The George Washington University Medical Center, Washington,
DC

Mr. David Rejeski, Woodrow Wilson International Center for Scholars, Washington,
DC

Dr. Stephen M. Roberts, University of Florida, Gainesville, FL

Dr. Joan B. Rose, Michigan State University, East Lansing, MI

Dr. Jerald Schnoor, University of Iowa, Iowa City, IA

Dr. Kathleen Segerson, University of Connecticut, Storrs, CT

Dr. Kristin Shrader-Frechette, University of Notre Dame, Notre Dame, IN

Dr. Philip Singer, University of North Carolina, Chapel Hill, NC

Dr. Robert Stavins, Harvard University, Cambridge, MA

Dr. Deborah Swackhamer, University of Minnesota, St. Paul, MN

Dr. Thomas L. Theis, University of Illinois at Chicago, Chicago, IL

Dr. Valerie Thomas, Georgia Institute of Technology, Atlanta, GA

Dr. Barton H. (Buzz) Thompson, Jr., Stanford University, Stanford, CA

Dr. Robert Twiss, University of California-Berkeley, Ross, CA

Dr. Terry F. Young, Environmental Defense, Oakland, CA

Dr. Lauren Zeise, California Environmental Protection Agency, Oakland, CA
SCIENCE ADVISORY BOARD STAFF
Mr. Thomas Miller, Washington, DC

-------
                         TABLE OF CONTENTS
EXECUTIVE SUMMARY
GENERAL COMMENTS
SPECIFIC COMMENTS
  1. Supporting Science
    A.  Comprehensiveness
    B.  Method accuracy and validation
    C.  Estimation Methods and Alternates
  2. Functionality
    A.  How convenient is the software and does it have all the necessary features?
    B.  Are there places where EPI Suite™'s user guide (and other program
    documentation) does not clearly explain EPI's design and use?  How can these be
    improved?
    C.  Are there aspects of the user interface (i.e., the initial, structure/data entry
    screen; and the results screens) that need to be corrected, redesigned, or otherwise
    improved? Do the results screens display all the desired information?
    D.  Currently one enters EPI Suite™ using SMILES and CAS; are there other ways
    to describe the structure (e.g., ability to input a structure by drawing it), that should
    be added?
    E.  EPI Suite™ has many convenience features, such as the ability to accept batch
    mode entry of chemical structures, and automatic display of measured values for
    some (but not all) properties. Are there other features that could enhance
    convenience and overall utility for users?
    F.  Are property estimates expressed in correct/appropriate units?
    G.  Is adequate information on accuracy/validation conveyed to the user by the
    program documentation and/or the program itself?
  3. Appropriate Use
    A.  Currently Identified Uses
    B.  Potential Additional Uses
REFERENCES
GLOSSARY
APPENDIX 1: Summary Assessment of EPI Suite™ Core Models
APPENDIX for 1Ai
APPENDIX for 1Bi
APPENDIX for 1Ci

-------
                         EXECUTIVE SUMMARY

       EPA's Office of Pollution Prevention and Toxic Substances (OPPTS) regulates
pesticides and chemicals to ensure protection of public health and the environment as
well as promote innovative programs to prevent pollution.  The Office of Pollution
Prevention and Toxics (OPPT) within OPPTS is responsible for assuring the public that
industrial chemicals for sale and use in the United States do not pose unacceptable risks
to human health or the environment.  To accomplish this, OPPT promotes pollution
prevention, use of safer chemicals, risk reduction, risk management and public
awareness.  OPPT programs include the pre-manufacture notification (PMN) review of
new industrial chemicals; testing, assessment, and risk reduction of existing industrial
chemicals; management of "national chemicals" (e.g., PCBs); international chemical
issues; pollution prevention advocacy; and partnership programs, such as the High
Production Volume Chemicals (HPV) Challenge, Green Suppliers Network, Design for
the Environment and Green Chemistry.

      Accurate and reliable predictions of the behavior of chemicals in a biological or
environmental system require a full and comprehensive understanding of their
thermodynamic, kinetic and transport properties both within and across multimedia
compartments. To support Agency decisions regarding the toxicity, environmental fate
and transport of new chemicals, OPPT (with Syracuse Research Corporation (SRC))
developed the Estimation Programs Interface (EPI Suite™), which OPPT makes freely
available from its website. The software combines the available science with user-
friendliness, transparency, and cost-effectiveness. EPI Suite™  is utilized by various
Agency program offices as well as other US federal agencies, state regulatory agencies,
foreign countries and the private sector.

       The EPI Suite™ software consists of physical-chemical property estimation
routines (PERs) and mass balance based environmental fate models (EFMs). Where
measured data are lacking and EPI Suite™ is appropriate, the Agency uses the results of
the PERs together with the EFMs, to understand a chemical's environmental fate and
transport.  This understanding is fundamental  to assessing chemical exposure, hazard,
and risk.

       OPPT requested that the Science Advisory Board (SAB) evaluate the science,
functionality and uses of the Agency's EPI Suite™ software. The EPI Suite Review
Panel was formed  for this purpose and reviewed the software in the context of OPPT's
needs.

       Science. In summary, the Panel commends the Agency for using sound science
to develop and refine EPI Suite™ and encourages the further development and use of this
software in  supporting Agency decisions. The Panel  applauds the Agency for furnishing
chemical fate and transport modeling software that is science-based and is used globally
to support environmental policy decisions.

-------
       The Panel encourages the Agency to consider evaluating the chemical fate and
transport modules using the latest statistical approaches to determine their predictive
accuracy and to evaluate new estimation approaches as they gain acceptance in the
scientific community.  The Panel endorses a systematic approach for updating and
refining the chemical fate and transport modules as high quality and peer-reviewed
measurement data become  available - both to increase the applicability of the software to
a wider array of chemical classes and to support the inclusion of additional physical-
chemical properties.  The Panel has provided a number of recommendations focused on
expanding the current set of chemical properties and associated functionality including
EFMs for future upgrades to EPI Suite™.  However, in light of the widespread
application of EPI Suite™, the Panel recommends that before the Agency decides to add
a module, it assess, to the extent practical, whether there is consensus in the scientific
community that the module has been appropriately parameterized and has been
sufficiently verified to be applicable in screening assessment.  Also, because the
accuracy of EPI Suite™ output will vary depending on the chemical and the
environmental compartment in which it is found,  the Panel recommends communicating
the uncertainty associated with estimates provided by EPI Suite™.

       The PERs currently within EPI  Suite™ have received extensive scientific scrutiny
with the results  published in the peer-reviewed literature.  Because EPI Suite™ was
historically developed to model the fate and transport behavior of nonpolar organic
chemicals, the physical-chemical property estimates for this class of chemicals are
typically well within an order of magnitude of measured values.  The Panel considered
these results adequate to support Agency screening level decision-making. Moreover,
these PERs satisfy the Organization for Economic Cooperation and Development
(OECD) principles established for quantitative structure-activity relationship ((Q)SAR)
validation, a finding which further supports the use of EPI Suite™ PERs in screening
level regulatory decision-making.

       The ability of EPI Suite™ to accurately model physical-chemical properties
depends on the chemical's  class, the quality of the  property module chemical data
training set and  whether the chemical's properties fall within the range of the chemical
training data set.  Many of the chemical training data sets are outdated and some are
incomplete.  Periodic review and refinement of the training sets would support the
continuous improvement of module output accuracy and expand the range over which
EPI Suite™ results are valid. These refinements could be accelerated if the Agency
leveraged its resources to collect additional measured property data. Criteria that the
Agency should consider in prioritizing the updates of chemical property data sets are
identified in 1-A-ii below.
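
       The point about whether a chemical falls within the range of a module's
training data can be illustrated with a simple range check. The sketch below is
not how EPI Suite™ evaluates its domain; it is a minimal illustration that
assumes each chemical is described by a few numeric descriptors. The descriptor
names ("mw", "log_kow") and all values are hypothetical.

```python
from typing import Dict, List, Tuple

Descriptor = Dict[str, float]


def descriptor_ranges(training_set: List[Descriptor]) -> Dict[str, Tuple[float, float]]:
    """Return the (min, max) of each descriptor across the training set."""
    names = training_set[0].keys()
    return {
        name: (
            min(chem[name] for chem in training_set),
            max(chem[name] for chem in training_set),
        )
        for name in names
    }


def outside_domain(chemical: Descriptor,
                   ranges: Dict[str, Tuple[float, float]]) -> List[str]:
    """List the descriptors for which the chemical lies outside the training range."""
    return [
        name for name, (low, high) in ranges.items()
        if not low <= chemical[name] <= high
    ]


# Hypothetical training data (illustrative values only).
training = [
    {"mw": 78.1, "log_kow": 2.1},
    {"mw": 250.0, "log_kow": 5.0},
]
ranges = descriptor_ranges(training)
print(outside_domain({"mw": 900.0, "log_kow": 3.0}, ranges))
```

An empty list means that every descriptor lies inside the training range. A
non-empty list flags the chemical for extra scrutiny before a module estimate
is relied upon.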

       Chemical domain mapping has the potential to significantly improve the
predictive capabilities of mechanistically and statistically-based PERs, but no Panel
consensus emerged as to the most effective approach to achieve this goal. The Panel
encourages the Agency to consider establishing a scientific forum at which the various
methodologies for enhancing the accuracy of the  PER module output may be evaluated.

-------
       The Panel agreed on two broad recommendations aimed at improving EFM
module predictions.  First, the Panel supports a more explicit description and justification
for the Agency's selection of EFM parameter default values. Secondly, the Panel
encourages the Agency to provide the EPI Suite™ user with a clear and unambiguous
display of quantitative uncertainty estimates associated with the fate model (i.e., EFM)
output.
       Functionality.  The Panel, which included experienced as well as novice users of
EPI Suite™, considered the functionality and usability of EPI Suite™ software.  While
there are many positive features associated with the EPI Suite™ user interface including
its documentation and HELP file availability, there are also opportunities for functional
improvement. For example, although EPI Suite™ operates within a Windows™
platform, a new user to EPI Suite™ is immediately struck by the disk operating system
(DOS) appearance of the graphical user interface (GUI). The Panel encourages the
Agency to secure the necessary funding to upgrade the GUI to reflect a typical
Windows™ appearance and functionality.
       Uses. All of the modules in EPI Suite™ are generally accepted by the regulatory
and regulated community for use in risk-based priority setting, screening level risk
assessment and prioritization for chemical testing for the chemical classes to which the
modules apply. Given the mandated 90-day reporting period within which new chemicals in
the PMN program must be evaluated and the large number of chemicals that the Agency
must screen annually, reliance on (Q)SAR module output is justified. The modules are
expected to provide an order of magnitude estimate of a chemical's physical properties,
an accuracy level that is generally acceptable to the Agency for screening level
assessments. However, application of (Q)SAR-based modules to chemicals outside the
module training set domain increases the uncertainty of the module prediction.  Because
the chemical domains that are used in developing current EPI Suite™ (Q)SARs do not
provide adequate coverage of nanoparticles, inorganic compounds, organo-metallic and
certain other classes of chemicals, application of EPI Suite™ for these classes of
compounds within the PMN and pollution prevention (P2) programs is inappropriate.
The Panel recommends that the Agency collect more peer-reviewed measurement data on
the physical and chemical properties for these chemical classes with the intent of either
expanding the domain of the existing (Q)SARs or for creating new (Q)SARs specifically
for these classes of chemicals.

       Owing to its success in supporting Agency decision-making and its accessibility,
use of EPI Suite™ is prolific outside of the Agency, including in international regulatory
agencies.  Given its broad acceptance and use by regulators,  industry and the academic
community, the Panel strongly encourages the  Agency to explore opportunities  to
develop foreign language versions of EPI Suite™.

-------
                         GENERAL COMMENTS

       The EPI Suite™ software basically consists of two module categories: physical-
chemical property estimation routines (PERs) and environmental fate models (EFMs).
The PERs are used to predict important physical-chemical (e.g., water solubility, vapor
pressure, octanol-water partition coefficients) and reactivity (e.g., biodegradation,
atmospheric oxidation) properties and, together with the EFMs, project a chemical's
environmental fate and transport which is considered during the Agency's screening level
evaluation.

       Accurate and reliable predictions of the behavior of chemicals in a biological or
environmental system require a full and comprehensive understanding of their
thermodynamic, kinetic and transport properties both within and across multimedia
compartments.  To support Agency decisions regarding the toxicity, environmental fate
and transport of new chemicals, the EPI Suite™ software employs twelve  individual
modules that may be logically placed into one of these two functional categories.

       Category - 1:  The nine regression based estimation modules in the PER category
were developed for estimating physical-chemical properties for chemicals that lack the
minimum data set needed to support Agency decisions. These modules, including the
Octanol-Water Partitioning Coefficient Estimation Program (KOWWIN), the Henry's
Law Constant Estimation Program (HENRYWIN), the Soil or Sediment Organic Carbon
Partitioning Coefficient Estimation Program (PCKOCWIN), the Water Solubility
Estimation Program (WSKOWWIN), the Bioconcentration Factor Estimation Program
(BCFWIN) and the Melting Point-Boiling Point (and Vapor Pressure) Chemical
Estimation Program (MPBPWIN, MPBPPVWIN), are used for estimating the
equilibrium distribution or partitioning of a chemical between two media such as fish
tissue-water and organic matter-water (which are functions of the octanol-water partition
coefficient), air-water, etc. The three other modules in the PER category are
the Atmospheric Oxidation Program (AOPWIN), the Biodegradation Estimation
Program (BIOWIN) and the Hydrolysis Estimation Program (HYDROWIN).  These
modules employ regression-based approximation methods to estimate the value of
kinetic parameters for atmospheric gas-phase reaction with the hydroxyl radical,
aerobic biodegradation and hydrolysis reactions, respectively.

       Category - 2: EPI Suite™ EFM modules that enable the user to estimate the
environmental fate and transport of specific chemicals include: the Volatilization Rate
from Water Estimation Program (WVOLWIN), the Sewage Treatment Plant Chemical
Fate Estimation Program (STPWIN) and the multimedia fugacity model (LEV3EPI).  These
modules, which utilize a chemical species mass balance approach, have been designed to
estimate the chemical concentration, phase mass fractions and residence times of
chemicals when placed in well-defined environmental systems. The mass balance
approach allows the user to estimate the change in chemical concentration over time from
which removal rates can be estimated.  Moreover, the EFM modules employ, as inputs,
the partitioning and reaction kinetic results generated from the PER modules. The EPI

-------
Suite™ user, however, has the ability to override these default inputs and enter their own
values.
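
       The mass balance idea described above, together with the ability to
override default inputs, can be sketched for the simplest possible case: one
well-mixed compartment with no inflow and a single overall first-order loss
rate. This is an illustration only, not the formulation used in WVOLWIN,
STPWIN or LEV3EPI. The class name, field names and default values are
hypothetical.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CompartmentInputs:
    # Hypothetical defaults, standing in for values a PER module might supply.
    volume_m3: float = 1.0e6
    rate_constant_per_h: float = 0.01  # overall first-order loss rate


def concentration(c0: float, inputs: CompartmentInputs, t_h: float) -> float:
    """Solve V*dC/dt = -k*V*C for a well-mixed compartment with no inflow."""
    return c0 * math.exp(-inputs.rate_constant_per_h * t_h)


def half_life_h(inputs: CompartmentInputs) -> float:
    """Time for the concentration to fall to half of its initial value."""
    return math.log(2.0) / inputs.rate_constant_per_h


def residence_time_h(inputs: CompartmentInputs) -> float:
    """Mean residence time for first-order loss, 1/k."""
    return 1.0 / inputs.rate_constant_per_h


defaults = CompartmentInputs()
# A user overrides one default with a (hypothetical) measured rate constant.
user_inputs = replace(defaults, rate_constant_per_h=0.05)

print(residence_time_h(defaults), residence_time_h(user_inputs))
```

Keeping the defaults in one immutable record and producing an overridden copy
makes it easy to report which inputs were estimated and which were supplied by
the user.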

       The environmental compartments defined within the three EPI Suite™ EFM
modules require the user to input the volume and mass fractions of the various media
under consideration. In the absence of user defined values, EPI Suite™ assigns default
values, which are idealized representations of the real world. Requirements of the EFM
modules also include user (or EPI Suite™ - i.e., default) defined chemical coefficients
that quantitatively describe the rate of chemical transport between the various media
compartments.

       An important limitation of the present version of EPI Suite™ is the inability for
users to input their own mass transfer coefficient (MTC) data. Moreover, the absence of
high quality peer-reviewed MTC data to serve as input to EPI Suite™ exacerbates this
problem.  Although filling this critical data gap is vital for broadening the range of
applicability of EPI Suite™, collecting useful MTC data is inherently expensive, a fact
which presents the Agency with a considerable resource challenge. Because of the
importance of obtaining and  incorporating accurate and reliable MTC information into
EPI Suite™, the Panel encourages the Agency to develop a systematic and longer-term
program, possibly through leveraging resources with other federal agencies, to address
this critical data need.  However, in the interim, the Panel endorses establishing a modest
effort that can, at a minimum, result in the formulation of a guideline MTC module based
on available peer reviewed theoretical models and supporting data. A workshop
consisting of an expert panel sponsored by the Agency is suggested as a means of
producing a draft  of the guideline version of the MTC module. A summary assessment
of core EPI Suite™ modules can be found in APPENDIX  1.

-------
                          SPECIFIC COMMENTS
1. Supporting Science

       A. Comprehensiveness

          i.   Are there additional properties that should be included in upgrades to
          EPI Suite™ for its various specified uses (PMN, P2)?

       All of the physical-chemical properties that are currently modeled by EPI Suite™
are critical in characterizing the behavior of a chemical released into the environment.
Therefore, none should be dropped.

        Under most circumstances, the PERs predict the measured property value within
an order of magnitude, a standard of accuracy that is generally acceptable for screening
level Agency decision-making. It would be inappropriate to use PERs to predict
physical-chemical properties of chemicals whose characteristics are significantly
different than those found in the module training set because the difference between
predicted and measured values may be greater. This potential inaccuracy is an important
issue unto itself and also for error propagation when these estimates are incorporated into
the fate models.
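
       The "within an order of magnitude" standard can be stated as a simple
test: the predicted and measured values differ by no more than a factor of
ten, that is, by at most 1 in log10 units. The sketch below shows that test
only. The function names are hypothetical and the example values are invented
for illustration.

```python
import math


def log10_error(predicted: float, measured: float) -> float:
    """Signed prediction error in log10 units; both values must be positive."""
    if predicted <= 0 or measured <= 0:
        raise ValueError("property values must be positive")
    return math.log10(predicted / measured)


def within_order_of_magnitude(predicted: float, measured: float) -> bool:
    """True when the prediction is within a factor of ten of the measurement."""
    return abs(log10_error(predicted, measured)) <= 1.0


# Invented example values, e.g. a water solubility in mg/L.
print(within_order_of_magnitude(50.0, 10.0))
print(within_order_of_magnitude(1000.0, 10.0))
```

Working in log units also helps with error propagation: when a fate model
multiplies or divides estimated properties, their log10 errors add, so errors
in several inputs can compound in the fate model output.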

       Given the broad range of chemicals for which the Agency must prepare
environmental assessments together with the need  to ensure an equitable and transparent
evaluation of all chemical data submissions, the Panel encourages the Agency to furnish
stakeholders with a description of the process by which regulatory decisions are made for
chemicals when application of EPI Suite™ has been determined to be inappropriate.

       With respect to expanding the current set of chemical properties (and associated
functionality) for future upgrades  to EPI Suite™, the Panel recommends that the Agency
consider incorporating the following:

   •   pKa, the negative log of a chemical's dissociation constant
   •   Influence of pKa on other physical-chemical properties
   •   Temperature dependency of all physical-chemical properties
   •   KAW, the air-water partition coefficient
   •   KOA, the octanol-air partition coefficient
   •   Bioaccumulation factors for root plants, leaf plants, and aquatic wildlife
   •   Diffusion coefficients in various environmental media
   •   Metabolism and production of stable  chemical intermediates
   •   Neutral hydrolysis
   •   Activity coefficients
   •   Sub-cooled liquid vapor pressure and aqueous solubility
   •   Surface tension

-------
    •  Anaerobic biodegradation potential
    •  Ozone depletion potential, greenhouse gas potential, and maximum incremental
       reactivity (MIR) used to evaluate ozone formation potential.

Some of these endpoints and improved features (e.g., temperature-dependence of
physical-chemical properties) can already be predicted by another Agency supported
model (SPARC). The Panel, therefore, encourages the Agency to consolidate and build
upon existing work for future EPI Suite™ improvements.
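
       Several of the properties listed above are linked by standard textbook
relations, sketched below for orientation. None of this is taken from EPI
Suite™ or SPARC. The function names are hypothetical, the Henry's law constant
is assumed to be given in Pa*m^3/mol, and the KOA expression is the common
approximation KOA = KOW / KAW, which ignores the mutual solubility of octanol
and water.

```python
import math

R = 8.314462618  # molar gas constant, Pa*m^3/(mol*K)


def pka(ka: float) -> float:
    """pKa is the negative log10 of the acid dissociation constant Ka."""
    return -math.log10(ka)


def neutral_fraction_acid(pka_value: float, ph: float) -> float:
    """Fraction of a monoprotic acid present in neutral form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka_value))


def kaw_from_henry(h_pa_m3_per_mol: float, temperature_k: float = 298.15) -> float:
    """Dimensionless air-water partition coefficient, KAW = H / (R*T)."""
    return h_pa_m3_per_mol / (R * temperature_k)


def koa_from_kow_kaw(kow: float, kaw: float) -> float:
    """Approximate octanol-air partition coefficient, KOA = KOW / KAW."""
    return kow / kaw


# A weak acid with pKa 5 is half ionized at pH 5.
print(neutral_fraction_acid(pka(1.0e-5), 5.0))
```

The neutral-fraction function shows why pKa influences other properties: the
ionized and neutral forms of a chemical partition differently, so an estimate
made for the neutral form applies only to that fraction at a given pH.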

       The current EPI Suite™ has only limited utility in predicting parameters for the
important and large class of compounds known as polymers. Several Panel members
offered the following list of additional chemical properties specifically related to the
toxicity and fate of polymers that the Agency may consider in future upgrades to EPI
Suite™:

    •  Glass transition temperature
    •  Crystal melt transition temperature
    •  Elastic mechanical properties like bulk modulus
    •  Viscosity measures
    •  Heat capacity
    •  Cohesive energy
    •  Charge
    •  Water solubility
    •  Dispersibility
    •  Flammability
    •  Parameters (e.g., degradation rates) influencing environmental persistence

       Several commercial software packages estimate many of the environmentally
important physical-chemical properties of polymers.  The Panel encourages the Agency
to evaluate the scientific underpinnings of these software packages to determine if similar
functionality could be incorporated into EPI Suite™.

       For some classes of chemicals, the physical-chemical properties estimated by EPI
Suite™ are not sufficient to predict a chemical's behavior. The Panel encourages the
Agency to consider development of a systematic and longer-term plan to develop and
integrate additional EPI  Suite™ functionality to adequately model additional physical-
chemical properties as well as the fate and transport characteristics of these compounds.
Similarly, the Panel strongly recommends that the Agency establish and support technical
transfer symposia and associated activities (e.g., science workshops) that will help
facilitate Agency exposure to the latest scientific approaches to chemical property
modeling.

       Given the Agency's resource limitations, the Panel strongly recommends that the
Agency establish a set of objective and transparent criteria for identifying and prioritizing
the most important physical-chemical properties required for defensible regulatory
decision-making. Examples of possible ranking criteria, which are not listed in any sort
of priority, include the following:

   •   The property's potential use in future fate and transport modeling enhancements

   •   The accuracy and reliability of the property's currently available experimental
       data set

   •   The extent of the chemical domain covered by the modeled property

   •   The opportunity for increasing the scope and applicability of EPI Suite™ to a
       broader range of chemical classes and properties

   •   Determination of whether the new property could be easily modeled using the
       existing model chemical data set

   •   Relative importance of property value as input to other EPI Suite™ modules
       and/or Agency chemical assessments

   •   The relative magnitude between "model error" and "measurement error"

   •   Cost or other resource requirements associated with modeling the new property


       Greater use of MTCs can improve some applications in EPI Suite™. A recent
study comparing the outputs of five multimedia models demonstrated that model
homogenization was possible only when the dozen or so MTCs were numerically
equal (Cowan et al., 1995).  Where MTCs varied significantly, the
computed concentration levels, mass fractions in the media compartments and the
chemical residence time estimates differed, in many cases, by several orders of
magnitude. The peer-reviewed literature contains a significant quantity of data with
which to develop MTCs.  Therefore, the Panel encourages the Agency to support the
development of additional MTCs and, where possible, establish a systematic process for
evaluating and incorporating high quality MTC data within EPI Suite™.

       The highest priority fate models are those which are judged to be used most often
and/or to have the most impact on decision-making processes. The Panel has identified
these models to be:

   •   Fugacity Unit World
   •   STP
   •   BCF/BAF
   •   Long-range transport

       While there was consensus among the panelists that BAF is an important fate
parameter to model and the Panel encourages the EPA to develop this module, several
panelists strongly cautioned that BCF/BAF models still have an incomplete treatment of
certain factors important in predicting uptake and metabolism. For example, while the
Arnot and Gobas (2004) model includes a metabolism term, it is not clear, given
experimental difficulties, how accurately this term can be parameterized for different
compounds in different biota. For metabolizable chemicals (e.g., aliphatic alcohols or
acids that have predicted log Kow values greater than 5 but are readily metabolized), the
predictions of BCF and BAF from a model based solely on log Kow can be significantly
greater (e.g., one order of magnitude or more) than experimentally determined BCF
values.  Although this phenomenon has been recognized by researchers involved in
development of BCFWIN (Meylan et al., 1999), and the module in EPI Suite™
contains correction factors that attempt to account for metabolism, further work is
needed to improve its predictive capability.
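
The influence of metabolism on bioconcentration can be illustrated with a minimal sketch of a one-compartment, first-order kinetic model, in which the steady-state BCF equals the uptake rate constant divided by the sum of the loss rate constants. This is a generic simplification offered for illustration only; it is not the Arnot and Gobas (2004) parameterization or the BCFWIN regression, and the function name and all rate constants below are hypothetical.

```python
def steady_state_bcf(k_uptake, k_elimination, k_growth=0.0, k_metabolism=0.0):
    """Steady-state BCF (L/kg) for a one-compartment, first-order model.

    k_uptake is in L/kg/day; the loss rate constants are in 1/day.
    All values passed in are assumptions supplied by the user.
    """
    total_loss = k_elimination + k_growth + k_metabolism
    if total_loss <= 0:
        raise ValueError("the sum of loss rate constants must be positive")
    return k_uptake / total_loss


# Hypothetical rate constants for a hydrophobic chemical (illustration only).
k1 = 500.0   # gill uptake, L/kg/day
k2 = 0.01    # gill elimination, 1/day
kG = 0.002   # growth dilution, 1/day

bcf_not_metabolized = steady_state_bcf(k1, k2, kG)
bcf_metabolized = steady_state_bcf(k1, k2, kG, k_metabolism=0.5)

print(f"BCF without metabolism: {bcf_not_metabolized:.0f} L/kg")
print(f"BCF with metabolism:    {bcf_metabolized:.0f} L/kg")
```

With these assumed values, the metabolized case yields a BCF more than an order of magnitude lower, which is the kind of gap a log Kow-only model cannot capture.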

       Some panelists identified related concerns with the development of this module,
including:

    •  Conducting experimental studies for BAF to validate the model is difficult and
       expensive and such studies have been conducted only for a limited number of
       substances which are either slowly or not metabolized.

    •  Within the literature there are wide ranges reported in field measured BAFs (and
       even BCFs in laboratory studies) that have been obtained for a given chemical.

    •  Concern was expressed regarding the difficulty in appropriately parameterizing a
       BAF model for non-recalcitrant chemicals.  A correction factor approach alone
       (as is used in BCFWIN) may still lead to significant errors in prediction for
       certain substances (or potential errors where measurement data are not available),
       and novice users may not appreciate the limitations in these predictions.

    •  There is no widely accepted method for estimating whole body metabolism rates
       in fish either from first principles (i.e., structure or other properties) or otherwise,
       although there is considerable research on-going to develop and validate such
       methods. These efforts include the International Life Sciences Institute/Health
       and Environmental Sciences Institute (ILSI/HESI) project and recently initiated
       work by ECVAM. Therefore, even if the user were given the option to enter a
       metabolism rate, these estimates are not currently available.

    •  There is the potential for inconsistencies between the outputs of BCFWIN and the
       potential new BAF modules (e.g., Arnot and Gobas model) that may lead to
       confusion in the interpretation of the fate of some chemicals, in part because these
       two models are based on very different approaches. BCFWIN relies on a fitted
       equation to measured BCF data. The Arnot and Gobas model is based on first
       principles and, as such, includes hydrophobic partitioning, growth dilution, and
       metabolism. When differences between the model predictions represent the
       variability in BCF and BAF data, this is acceptable. However, in many cases, the
       differences will be due to problems in adequately parameterizing the BAF model
       (e.g., to account for metabolism) and it would be difficult to know a priori that
       this is the cause of the discrepancy.

       The training set used to calibrate the existing model, BCFWIN, includes studies
based on analysis of parent test substance as well as studies based on analysis of total
radioactivity. The total radioactivity based BCF cannot distinguish between parent
substance bioaccumulation and incorporation of metabolites into the organism as a result
of normal catabolic processes (although the Panel recognizes that some metabolites can
be of toxicological concern). As a result, the model is trained on data that lack a
consistent basis for (Q)SAR development and subsequent decision-making. The
BCFWIN database also fails to indicate whether the basis for the BCF is parent substance
or total radionuclide analysis.

       Given the increasing focus on the assessment of persistent, bioaccumulative and
toxic (PBT) chemicals in regulatory contexts, the current BCFWIN data set should be
critically reviewed, any data that do not meet acceptance standards (e.g.,
total radioactivity based BCF for metabolized substances) deleted, and new literature data
added to provide a consistent basis for an improved "next generation" (Q)SAR.

       The existing Japanese "MITI" BCF database provides perhaps the best single
source of aqueous fish BCF data that could be included in this effort.  The data in the
MITI database are based on the OECD 305 bioaccumulation test procedure, which is
currently considered by many to be the "gold standard" for these types of tests
(http://www.safe.nite.go.jp/english/kizon/KIZON start_hazkizon.html). Compilation of
such data also would support the development of (Q)SARs for estimating fish
biotransformation potential that could be used as input to BAF models or multimedia
exposure models that predict human intake fraction.

       The panelists encourage the Agency to participate in and follow the on-going
scientific developments in BAF determinations including:

   •   Additional efforts at experimentally determining bioaccumulation (including
       better understanding of metabolism)

   •   Improved databases for developing and verifying BAF models

   •   ILSI/HESI (International Life Sciences Institute/Health and Environmental
       Sciences Institute) Work Group on Bioaccumulation

   •   Ongoing modeling research published in the literature

       In light of the widespread application of EPI Suite™, before the decision is made
to add a new module, such as the BAF module, the Agency should assess, to the extent
practical, whether there is consensus in the scientific community that the model has been
or can be appropriately parameterized and has been sufficiently verified to be applicable
in screening assessments.

       More detailed information can be found in the APPENDIX for 1-A-i, and related
issues are discussed in section 1-C-ii below.

          ii.   Are there additional sets of existing measured data which should be
          included in upgrades to EPI Suite™?   Are there specific measurements
          with the potential to improve EPI Suite™ estimates so much that an effort
          should be made to collect them?

       Existing peer-reviewed measurement data sets are available for the following
parameters: octanol-water partition coefficients (Kow), Henry's law constants (He), air-
octanol partition coefficients (KAO), biodegradation rates, organic carbon partition
coefficient (Koc), aqueous solubility, and rates of aquatic hydrolysis. Several panelists
noted that updating the chemical training data set used in estimating Koc should be a
priority because of the limited amount of data that is currently used to estimate the value
of this parameter within EPI Suite™. The Panel encourages the Agency to expand the
functionality of the Koc module to capture the range of organic carbon types that could
affect a chemical's fate and transport, including natural vegetation-based carbon, soot, black
carbons, non-aqueous phase liquids (NAPL), etc. Appendix 1-B-i identifies additional
data sets the Agency might consider.

       Because of the Agency's limited resources, the Panel supports a strategic
approach to identifying those data sets that require refinement.  Criteria that the Agency
should consider in prioritizing the updates of chemical property data sets include the
following:

    •  The duration of time since the chemical property data set was last updated

    •  Level of uncertainty associated with the chemical property estimates

    •  The domain and quality of the chemical property training set

    •  Accuracy of chemical property prediction

       Several panelists identified scientific proceedings associated with certain highly
reputable international conferences and journals such as the J. Phys. Chem. Ref. Data
(http://ipcrd.aip.org) as excellent sources of peer reviewed chemical data sets that should
be considered for inclusion in upgrades to EPI Suite™.  There are additional sets of
measured data that the Agency could consider for inclusion in upgrades to EPI Suite™
pending the Agency's satisfaction with the quality of peer-review received.  Some of
these are:

    •  Additional sewage treatment plant (STP) chemical partitioning and fate data.
       Appropriate sources for this type of data would include, but are not limited to:  a)
       the National Association of Clean Water Agencies (formerly the Association of
       Metropolitan Sewerage Agencies), b) Water Environment Research Foundation
       (WERF), c) Water Environment Federation (WEF), and d) Journal of
       Environmental Engineering and related journals.

    •  The existing Japanese "MITI" data - While most databases aggregate data from a
       number of different studies using different methods, the MITI database uses a
       standard procedure to test a large number of chemicals, including direct
       measurement of the properties of interest for parent compounds. Some panelists
       familiar with the database say it provides an excellent source of aqueous fish BCF
       data.

    •  Additional sources of Polychlorinated Biphenyls (PCB) congener data sets that
       are available in the peer-reviewed literature (e.g., Frame et al., 1996a, 1996b).

   •   Reliable unpublished data reported as part of the High Production Volume
       (HPV) challenge program (http://www.epa.gov/HPV/) or other international
       regulatory initiatives such as the OECD Screening Information Data Set (SIDS)
       program (http://www.epa.gov/opptintr/chemtest/pubs/oecdsids.htm).

       The Panel agreed that the EPI Suite™ fate and transport modules are limited by
the paucity of chemical degradation (e.g., biodegradation and biotransformation
processes) data available.  Like mass transfer coefficients, chemical degradation
information is so important to understanding the fate and transport of chemicals in the
environment that, if necessary, the Agency should consider redirecting resources from
current programs to address this critical data need. Moreover, there have been a number
of recent scientific advances in understanding chemical degradation that merit Agency
consideration. For example, an innovative methodology termed the environmental
"reagents" approach has been developed for defining the reactive power of environmental
compartments. Understanding this reactivity has important implications for the fate of
chemicals and should be considered in future upgrades to the EPI Suite™ chemical
degradation modules (Green and Bergman 2005).
         iii. Are there other capabilities that should be included in upgrades to
         EPI Suite™? The Agency is especially interested in the SAB's views on
         uncertainty analysis and if/how information on how good the estimates are
         can be conveyed to users.

Uncertainty in Parameter Estimation, Routines, and Predictions

       When a PER is used to predict properties for chemicals lying outside the domain
of compounds used in the training set for that PER, confidence in the prediction will
generally be lower than if the chemical were within the existing domain. The Panel
recommends that results in such cases be flagged to highlight for the user the potential
uncertainties in the estimated value.
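
One way such a flag might be generated is sketched below. The sketch assumes, purely for illustration, that a training set domain can be summarized by the range of each descriptor and by the set of structural fragments it contains; the class, function, descriptor names, and fragment labels are hypothetical and do not describe how any EPI Suite™ module stores its training data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingDomain:
    """Hypothetical summary of a training set (illustration only)."""
    descriptor_ranges: dict      # descriptor name -> (minimum, maximum)
    known_fragments: frozenset   # structural fragments seen in training


def domain_flags(domain, descriptors, fragments):
    """Return warning strings; an empty list means no flag was raised."""
    flags = []
    for name, (low, high) in domain.descriptor_ranges.items():
        value = descriptors.get(name)
        if value is None:
            flags.append(f"{name}: not supplied")
        elif not low <= value <= high:
            flags.append(f"{name}={value} outside training range [{low}, {high}]")
    for fragment in sorted(set(fragments) - domain.known_fragments):
        flags.append(f"fragment '{fragment}' not represented in training set")
    return flags


# Assumed domain and test chemical, for illustration only.
domain = TrainingDomain(
    descriptor_ranges={"molecular_weight": (18.0, 1000.0), "log_kow": (-4.0, 8.0)},
    known_fragments=frozenset({"aromatic C", "aliphatic OH", "ester"}),
)
flags = domain_flags(
    domain,
    descriptors={"molecular_weight": 1250.0, "log_kow": 5.2},
    fragments=["ester", "organotin"],
)
for message in flags:
    print("WARNING:", message)
```

Returning the reasons for a flag, rather than a single yes/no value, lets the user see which aspect of the chemical falls outside the training set.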

       Although the Panel explored a range of views concerning how uncertainty should
be conveyed to the EPI Suite™ user, two approaches emerged as the preferred options.
Both approaches involve the development of appropriate statistical confidence intervals
surrounding a mean value of an estimated chemical property. In the first case, the
majority of the Panel recommended that the quantitative uncertainty information be
displayed only in HELP files while, in the other, several panel members preferred having
the data presented with the module output for each endpoint/test chemical. Advantages
and disadvantages of both approaches are summarized in the following:

    •   Provide information on the confidence range in HELP files:

          Advantage: This approach does not require that the Agency defend quantitative
          estimates, particularly for test chemicals that are outside of the model domain.
          Moreover, by limiting the availability of the uncertainty discussion to the help
          file, the Agency reduces the potential for misinterpretation or misapplication of
          the uncertainty results.

          Disadvantage: If not presented more explicitly, the novice user may overlook
          this information, increasing the potential for misinterpretation or misapplication
          of the model results.

    •   Provide the confidence interval in the module output:

          Advantage:  The Agency and the scientific community are moving toward
          more explicit acknowledgement and quantification of uncertainty.  This
          approach is consistent with such goals. Moreover, by including quantitative
          uncertainty estimates with module output, the EPI Suite™ user is compelled to
          recognize the potential of making decision errors.

          Disadvantage: While the complex nature of data uncertainties and modeling
          uncertainties needs to be communicated, more informative, but potentially
          more complex, quantitative uncertainty assessment methods present novice
          users and decision makers with new challenges.  Effective incorporation of
          uncertainty in decisions will not be accomplished with quantitative uncertainty
          analysis alone.

       The Panel encourages the Agency to explicitly acknowledge to the EPI Suite™
user the fact that the quantitative uncertainty estimate for each endpoint/test chemical
includes only the statistical error associated with the model prediction and neglects the
error in reported experimental measurement values that were used to calibrate the model.
To the extent practical, the Agency should provide guidance to the user on the expected
data error component for each modeled property.
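
The distinction between the model's statistical error and the measurement error can be made concrete with a minimal sketch. It assumes that both error components are expressed as standard deviations in log10 units, that they are independent and approximately normally distributed, and that they may therefore be combined in quadrature; these are illustrative assumptions, not a description of how EPI Suite™ computes or reports uncertainty. The function name and numerical values are hypothetical.

```python
import math


def log_scale_interval(log10_estimate, model_sd, measurement_sd=0.0, z=1.96):
    """Approximate interval (in linear units) around a log10-scale estimate.

    model_sd and measurement_sd are standard deviations in log10 units.
    They are combined in quadrature, which assumes independent errors.
    z = 1.96 corresponds to a two-sided 95% interval for a normal error.
    """
    total_sd = math.hypot(model_sd, measurement_sd)
    half_width = z * total_sd
    return 10 ** (log10_estimate - half_width), 10 ** (log10_estimate + half_width)


# Hypothetical property estimate of 10^3 with a model error of 0.5 log units.
model_only = log_scale_interval(3.0, model_sd=0.5)
with_measurement = log_scale_interval(3.0, model_sd=0.5, measurement_sd=0.3)

print("model error only:       %.0f to %.0f" % model_only)
print("with measurement error: %.0f to %.0f" % with_measurement)
```

Under these assumptions the interval that includes the measurement error component is wider, which illustrates why reporting the model's statistical error alone may understate the total uncertainty.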

Uncertainty in Environmental Fate Model Predictions

       The Panel endorses better conveying to the user the uncertainty associated with
the EPI Suite™ fate model (i.e., EFM).  The Panel identified the following sources of
EFM uncertainty:

   •  Model structure

   •  Model parameters (e.g., chemical properties, mass transfer coefficients, etc.)

   •  Media compartment(s) including type, size and distribution


       Panel deliberations included consideration of various approaches to effectively
convey uncertainty to the EPI Suite™ user.  The following list summarizes the range of
approaches discussed by the Panel together with their potential advantages and
disadvantages.

   •  Model output details could remain in its current form, while the documentation
       could more fully describe the input parameter range and limitations of the
       evaluative fate models.

       The EFM modules in EPI Suite™ are designed to produce "evaluative"
predictions. The media compartments reflect generic environmental scenarios such as the
"unit world". The term evaluative is used to describe an output that is interpreted to be of
relative significance and/or order-of-magnitude rather than a precise numerical result.
The major (i.e., 1st order) sources of output uncertainty are associated with the ascribed
media of chemical entry.  For example, significantly different media concentration
predictions will result if the chemical is "emitted" into the air compartment rather than
the water compartment. Clear data/information in the PMN as to the choice of
media for chemical entry is needed.  In addition, cautions/alerts as to the high level of
output variability resulting from media entry choice need to be placed in the documentation,
as understanding this variability is key to controlling this source of EFM output
uncertainty. Experience with such models indicates that input variations in chemical
properties and MTCs result in 2nd order levels of EFM output uncertainty (Webster et al.,
1998).

          Advantages: Simplicity and consistency in interpretation of fate model output.

          Disadvantages:  Only presenting uncertainty information in the help section
          assumes that the user will read this section.  Even if this section were read,
          there is no guarantee that the scientific or regulatory implications of
          uncertainty will be fully understood.

   •  Give qualitative information regarding the uncertainty associated with model
       results based on the range of the chemical property values.

       An example of such an approach is illustrated by describing a chemical's
distribution using a KOA versus KOW diagram.  Construction of such a plot will depict the
distribution of the chemical with respect to the various environmental phases, e.g., air,
water or soil/sediment. EPI Suite™ should provide explanatory text that clearly informs
the user that the relative media compartment sizes, inter-compartmental chemical mass
transfer rates and the media compartment into which the chemical is released will affect
the model predictions of the chemical's allocation between media compartments.
Moreover, if a chemical were associated exclusively with a single medium, uncertainty in
the partition coefficients would have a minimal impact on the chemical's allocation
between compartments (as compared to those chemicals that are distributed between
phases).
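
The point that single-medium chemicals are insensitive to partition coefficient uncertainty can be illustrated with a minimal equilibrium (Level I type) sketch in which air, water, and an octanol-equivalent organic phase share a fixed amount of chemical. The compartment volumes, function names, and example property values are hypothetical, and the air-water partition coefficient is taken as KAW = KOW/KOA; the sketch ignores emission medium and mass transfer rates, which the text notes also affect model predictions.

```python
def level1_mass_fractions(log_koa, log_kow, v_air=6e9, v_water=7e6, v_octanol=1e3):
    """Equilibrium mass fractions in air, water, and an octanol-like phase.

    Volumes (m^3) are assumed values for a generic evaluative environment.
    """
    k_ow = 10.0 ** log_kow
    k_aw = 10.0 ** (log_kow - log_koa)   # K_AW = K_OW / K_OA
    # Capacity of each phase relative to water (partition coefficient * volume).
    air = k_aw * v_air
    water = v_water
    octanol = k_ow * v_octanol
    total = air + water + octanol
    return {"air": air / total, "water": water / total, "octanol": octanol / total}


def largest_shift(log_koa, log_kow, delta_log_kow=0.5):
    """Largest change in any mass fraction when log KOW is perturbed."""
    base = level1_mass_fractions(log_koa, log_kow)
    moved = level1_mass_fractions(log_koa, log_kow + delta_log_kow)
    return max(abs(moved[m] - base[m]) for m in base)


# Hypothetical chemicals: one almost entirely in air, one split across media.
print("air-dominated chemical:", largest_shift(log_koa=2.0, log_kow=3.0))
print("multimedia chemical:   ", largest_shift(log_koa=7.0, log_kow=4.0))
```

With these assumed volumes, a 0.5 log unit error in KOW barely changes the allocation of the air-dominated chemical but shifts the multimedia chemical's allocation substantially.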

          Advantages: The user will receive qualitative information regarding the
          potential sensitivity of model output to physical-chemical properties as it
          relates to environmental fate. This approach provides yet another level of
          screening whereby a chemical that does not clearly lie exclusively within a
          specific environmental compartment may merit further investigation (based on
          environmental partitioning concerns alone).

          Disadvantages:  Development of a robust method for determining and
          presenting this information represents a considerable technical challenge.

    •   Calculate error propagated from estimates of physical-chemical properties and
       fate models, i.e., input 95% confidence limits or qualitative confidence factors
       from each estimated physical-chemical property to obtain a range of fate results
       (MacLeod et al., 2002).

       MacLeod et al. (2002) present a simple, semi-quantitative method for calculating
error propagated through environmental fate models.  Several panel members supported
this approach over the computationally demanding Monte Carlo simulation where the
required number of model iterations can be significant (e.g., > 2000 iterations).  The
semi-quantitative approach provides a simple view of the range of values that could be
expected based on user-defined uncertainties associated with input parameters, where
uncertainty is expressed as a multiplicative factor.

          Advantages: With this method, the user generates an estimate of the
          distribution of the model output for each chemical in the various media
          compartments. Use of this approach assumes  that the user will have an
          estimation of the uncertainty associated with the model inputs.

          Disadvantages:  The uncertainty associated with other factors (e.g., mass
          transfer coefficients and media of chemical emission) may be of more
          importance in interpreting modeling results, particularly given that these
          models are often intended to be evaluative (screening use) in nature.
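
A minimal sketch of how multiplicative confidence factors might be propagated is given below. It assumes a common form of this type of calculation: each input is log-normally distributed and independent, its confidence factor Cf bounds roughly 95% of values between median/Cf and median × Cf, and the output responds approximately linearly in log space with sensitivity S = d ln(output) / d ln(input). Readers should consult MacLeod et al. (2002) for the method as published; the function name and input values here are hypothetical.

```python
import math


def output_confidence_factor(inputs):
    """Combine input confidence factors into an output confidence factor.

    inputs: iterable of (sensitivity, confidence_factor) pairs, where
    sensitivity is the relative change in output per relative change in
    input and confidence_factor >= 1 is the multiplicative uncertainty.
    """
    variance_sum = 0.0
    for sensitivity, confidence_factor in inputs:
        if confidence_factor < 1.0:
            raise ValueError("confidence factors must be >= 1")
        variance_sum += (sensitivity * math.log(confidence_factor)) ** 2
    return math.exp(math.sqrt(variance_sum))


# Hypothetical inputs: (sensitivity, confidence factor), illustration only.
inputs = [
    (1.0, 3.0),    # e.g., a partition coefficient known within a factor of 3
    (0.5, 10.0),   # e.g., a degradation rate known within a factor of 10
    (0.1, 2.0),    # e.g., a weakly influential property
]
cf_out = output_confidence_factor(inputs)
print(f"output confidence factor: {cf_out:.2f}")
```

Unlike a Monte Carlo simulation, this calculation requires only one sensitivity per input, which is why it is computationally undemanding; the trade-off is its reliance on the log-linear and independence assumptions stated above.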

       Finally, the Panel supported a more explicit description and justification for the
Agency's selection of EFM parameter default values. This information, which should be
easily accessible to the EPI Suite™ user, must provide sufficient detail of the
environmental media that the default values purport to represent (e.g., temperate or arid
terrestrial system).
          iv. Are there other estimation methods that should be considered in
          upgrading EPI Suite™?

       The Panel was able to identify several innovative methodologies that have the
potential to enhance both the accuracy and scope of the EPI Suite™ modules. These
methodologies include the: a) least squares adjustment of chemical properties approach
(Schenker et al., 2006), b) polyparameter linear free energy relationship approach (Goss
et al., 2003; Nguyen et al., 2005), and c) the use of molecular polarizability to predict
vapor pressure and KOA (Staikova et al., 2004). In addition, the Panel encourages the
Agency to partner with other stakeholders to establish a forum (e.g., technical workshop,
interagency workgroup, etc.) to evaluate the various methodologies available for mapping
chemical domains in support of future (Q)SAR development and innovations in fate
modeling.
       B.    Method accuracy and validation

          i.  Is the accuracy of the modules in the EPI Suite™ sufficient for its
             various specified uses?

       EPI Suite™ is a screening tool that supports Agency risk-based decisions
regarding new and existing chemicals.  EPI Suite™ outputs are generally found to be
within an order of magnitude of measured values, an accuracy standard that has been
deemed sufficient by the Agency for defensible decision-making at the screening level.
Since many users may not recognize the range of accuracy associated with EPI Suite™
output, the Panel encourages the Agency to electronically post a detailed disclaimer that
clearly identifies the recommended uses of the current version of the EPI Suite™
software.
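
The order-of-magnitude criterion described above can be checked against a validation set with a short calculation. The sketch below is illustrative only; the function name and the paired predicted and measured values are hypothetical and are not EPI Suite™ results.

```python
import math


def fraction_within_factor(predicted, measured, factor=10.0):
    """Fraction of predictions within a multiplicative factor of measurements.

    Values must be positive; factor=10 corresponds to one order of magnitude.
    """
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    limit = math.log10(factor)
    hits = sum(
        1
        for p, m in zip(predicted, measured)
        if abs(math.log10(p) - math.log10(m)) <= limit
    )
    return hits / len(predicted)


# Hypothetical predicted and measured property values (same units).
predicted = [10.0, 100.0, 1.0e5]
measured = [20.0, 5000.0, 1.0e5]
share = fraction_within_factor(predicted, measured)
print(f"{share:.0%} of predictions fall within one order of magnitude")
```

Reporting this fraction per module and per chemical class would give users a concrete sense of the accuracy range that a disclaimer is meant to convey.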

        Although the accuracy of EPI Suite™ varies depending on endpoint, the Agency
staff described EPI Suite's design as intended to provide "best estimates," and in the view
of some panel members, the screening level models used for assessing exposure are
generally designed to be conservative.  The reason for this is that, for a screening level
assessment, the Agency generally develops estimates that are conservative (protective).
Such conservatism minimizes the probability of users making decision errors based on
module output.  While minimizing false positive decision errors improves the
effectiveness with which the Agency uses its scarce resources, minimizing false negative
decision errors also establishes greater confidence that Agency decisions based on EPI
Suite™ output will be sufficiently protective of the environment.

       Concerning application of EPI Suite™ output, greater transparency in describing
the process by which decision errors are considered in regulatory decision-making would
more effectively communicate environmental assessment decisions. By explicitly
defining the acceptable level of false negative and false positive decision error rates
within each regulatory program that uses EPI Suite™ module output, the Agency would
make the basis for its decisions more easily understood.
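
False positive and false negative decision error rates can be estimated for a given screening threshold when both predicted and measured values are available. The following sketch is offered only as an illustration; it assumes that a chemical is "of concern" when its value meets or exceeds the threshold, and the function name, threshold, and data are hypothetical rather than drawn from any Agency program.

```python
def decision_error_rates(predicted, measured, threshold):
    """Estimate decision error rates for a screening threshold.

    A false positive is a chemical flagged by the prediction whose measured
    value is below the threshold; a false negative is the reverse.
    Rates are None when the corresponding class is empty.
    """
    false_pos = false_neg = positives = negatives = 0
    for p, m in zip(predicted, measured):
        truly_above = m >= threshold
        flagged = p >= threshold
        if truly_above:
            positives += 1
            if not flagged:
                false_neg += 1
        else:
            negatives += 1
            if flagged:
                false_pos += 1
    return {
        "false_positive_rate": false_pos / negatives if negatives else None,
        "false_negative_rate": false_neg / positives if positives else None,
    }


# Hypothetical predicted and measured values with an assumed threshold of 1000.
rates = decision_error_rates(
    predicted=[500.0, 2000.0, 1500.0, 800.0],
    measured=[1200.0, 900.0, 3000.0, 100.0],
    threshold=1000.0,
)
print(rates)
```

Stating the acceptable value of each rate in advance, and then comparing it with rates estimated in this way, is one possible route to the transparency the Panel describes.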

       In describing EPI Suite™'s level of quality assurance, the Agency confirmed that
EPI Suite™ was in full compliance with the EPA's Information Quality Guidelines
(USEPA 2002)1. The Agency has stated that extensive software security precautions
have been fully integrated into EPI Suite™ to prevent the possibility of unauthorized
algorithm modification. Moreover, the use of scientifically defensible (Q)SARs within
the individual modules ensures that the software output is presented in a complete and
unbiased manner.  The three basic steps employed by the Agency in developing the EPI
Suite™ software include the following:

    •   Model Development:  This step includes: a) defining the Agency
       programmatic needs, b) scientific evaluation of the peer-reviewed
       literature, c) developing and testing the theoretical concept that supports
       the model, and d) developing and documenting the (Q)SAR(s).

    •   Model Evaluation:  This step includes: a) evaluating the (Q)SAR(s) and
       their intermediate output, b) evaluating the model results against peer-
       reviewed measurement data, c) providing basic quality assurance/quality
       control checks, d) alpha testing the model to ensure that it performs as
       designed, e) beta testing the model by independent users, and f)
       facilitating peer review of the (Q)SAR by the scientific community.

    •   Model Application: This step includes evaluating and documenting the
       data quality and model performance limitations to ensure that users will
       apply the model appropriately.

       At the present time, there are relatively few systematic evaluations of the training
data sets for EPI Suite™ modules.  The Panel strongly recommends that the Agency
establish a data quality oversight program that monitors, critically evaluates and
incorporates new peer-reviewed measurement data as well as new modeling approaches.
Several innovative methodologies offer potential opportunities to improve the accuracy
and broaden the scope of EPI Suite™ software. These include the:

   •   least squares adjustment of chemical properties approach (Schenker et al., 2006),

   •   polyparameter linear free energy relationship approach (Goss et al., 2003; Nguyen
       et al., 2005), and

   •   use of molecular polarizability as a predictor of physical-chemical properties
       (Staikova et al., 2004).

1 As described in the Council for Regulatory Environmental Models Guidelines (USEPA 2003), EPA's
Information Quality Guidelines (USEPA 2002) define quality as a broad term that includes the concepts of
integrity, utility, and objectivity. The Guidelines state that "integrity refers to the protection of information
from unauthorized access or revision to ensure that it is not compromised through corruption or
falsification. In the context of environmental models, often integrity is most relevant to protection of code
from unauthorized or inappropriate manipulation. Utility refers to the usefulness of the information to the
intended users. Objectivity involves two distinct elements, presentation and substance. Objectivity includes
whether disseminated information is being presented in an accurate, clear, complete and unbiased manner.
In addition, objectivity involves a focus on ascertaining accurate, reliable and unbiased information."

       EPI Suite™'s data quality should be evaluated at regular intervals (e.g., at least
annually). Updates to individual modules should be documented for technical comment
and use by the user community.  Currently, the Agency has other software packages (e.g.,
SPARC) at its disposal whose output may be compared to selected output from EPI
Suite™.

       For EPI Suite™ users, the following quality assurance information would be
helpful in evaluating and characterizing individual module output:

   •   Provide a detailed description of the module chemical training set domain.

   •   Flag output when the chemical and associated physical-chemical properties are
       outside the training set domain.

   •   Furnish the range of experimental data used in the module chemical training set in
       addition to the selected value used in calculations.

   •   Provide statistical comparison of results using estimated and experimental data.

   •   Identify any chemical fragments that are not captured by the Simplified Molecular
       Input Line Entry System (SMILES) algorithm within the module output.

   •   Identify those chemicals or class of chemicals that have been placed on the
       'potential problem' list under Toxic Substances Control Act (TSCA).

   •   Within the help files, module accuracy or method error should be fully discussed.

   •   A description of how default parameters or data were selected should be provided.

       The Panel recognizes the importance of the availability of high quality, peer-
reviewed measurement data as the basis for EPI Suite™ modules. Therefore, the Panel
encourages the Agency to upgrade the current set of EPI Suite™ modules to include as
much peer-reviewed measurement data of a credible and known quality as possible and
remove, where justified, data of lower or unknown quality.  Moreover, the Agency
should develop a programmatic framework that would facilitate the systematic evaluation
of data quality obtained from both intra-Agency and inter-Agency sources. The goal of
these activities is to develop improved chemical data training sets of known quality for
each of the properties estimated by EPI Suite™.  More detailed information can be found
in Appendix 1-B-i.
          ii.   Have the modules been adequately validated, and have they been
          published in the peer-reviewed technical literature or elsewhere?

       While no module is ever completely validated, the Panel agreed that the EPI
Suite™ modules have, for the most part, been satisfactorily evaluated.  The scientific
underpinnings of each of the compartment modules have been appropriately vetted in the
peer-reviewed scientific literature and the physical-chemical property (Q)SARs have
been found to satisfy the OECD principles for (Q)SAR validation. The five OECD
principles established for (Q)SAR validation (OECD 2004) are summarized as follows:

   •   Principle 1:    Defined endpoint
   •   Principle 2:    Unambiguous algorithm
   •   Principle 3:    Defined domain of applicability
   •   Principle 4:    Appropriate measures of goodness of fit (e.g., coefficient of
                     determination, R²)
   •   Principle 5:    Mechanistic interpretation

       OECD Principle 1 requires that (Q)SARs should have a defined endpoint.  Most
EPI Suite™ modules conform to this requirement.  The endpoint for the biodegradation
module (BIOWIN) is less clear because certain aspects of the module (e.g., primary
degradation) could range from a minor change in chemical structure (e.g., loss of one
halogen, change from one unsaturated to saturated bond in a complex structure) to full
mineralization of the chemical. The user should fully recognize that, because of the
inherent complexity of the degradation process, ascribing a consistent primary
degradation endpoint under all possible environmental conditions may not be feasible.
Some panelists commented on the inconsistency in the underlying training data used for
calibration of the BCFWIN module (e.g., inclusion of studies involving both parent
substance as well as non-parent specific radiotracer studies).

       OECD Principle 2 has been consistently achieved by the EPI Suite™ (Q)SARs.
Most EPI  Suite™ modules are relatively transparent in their design and construction. An
overview of their structure and development is provided in the user guide and in the
published peer-reviewed literature. The one notable exception to this finding is the
biodegradation module (BIOWIN), whose structure and parameterization are less
transparent. The Panel strongly recommends that the Agency better define the design,
structure and data quality implications of the BIOWIN module. Definition of the
environmental medium to which the BIOWIN module output results apply would be a
valuable first step. Furthermore, the scientific justification for the scaling rules used to
extrapolate results from BIOWIN estimates associated with aqueous environments to soil
and sediment should be fully described in the Help files. Finally, the Agency should
fully describe the sensitivity of module output when chemical removal through various
abiotic processes is prevalent (e.g., sorption, hydrolysis, chemical oxidation, etc.).

       EPI Suite™  modules are generally consistent with OECD Principle 3.  However,
the Panel noted that module predictions are less reliable for chemicals that are outside of
the chemical training set domain. Moreover, for modules that have multidimensional
interpolation domains (i.e., models that use atom/fragment components, e.g., KOWWIN),
determining the actual interpolation domain is not trivial.

       A recently peer-reviewed publication evaluated the domain of the chemical
training data set utilized by KOWWIN. This work proposes a novel approach for
defining the multi-dimensional space that describes the chemical data training set
(Nikolova-Jeliazkova, et al. 2005).  The Panel encourages the Agency to explore this and
other scientific approaches suitable  for defining the chemical training set domains for EPI
Suite™ modules. The ultimate goal, of course, is to develop a scientifically defensible
process by which chemicals are selected for inclusion in the chemical training set
domain. Moreover, based on the insight developed through this approach, priorities can
be established to target new data collection that efficiently expands the model domain for
substances of regulatory importance.

       In general, the EPI Suite™ modules are consistent with OECD Principle 4.
External evaluation  of an EPI Suite™ module using query chemicals with known
properties is the standard procedure for assessing (Q)SAR reliability.  External
evaluation has produced adjusted R² values of approximately 0.75, a value that is
considered satisfactory for regulatory screening level chemical evaluation. A few of the
EPI Suite™ modules (e.g., BCFWIN, HYDROWIN) do not appear to have had
external evaluation. The Panel strongly encourages the Agency to scan the peer-
reviewed literature to determine whether external evaluation of these modules has
occurred and, if so, whether the data quality is suitable for supporting upgrades to
EPI Suite™.
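
       For readers unfamiliar with the statistic cited above, the following sketch shows
one common way to compute R² and adjusted R² for an external evaluation set. The function
names and the numbers are illustrative assumptions; they are not EPI Suite™ data or code.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R2 for the number of fitted predictors."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("not enough samples for this many predictors")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Illustrative values only: measured versus estimated property values
# for five hypothetical query chemicals.
observed = [2.1, 3.4, 4.0, 5.2, 6.3]
predicted = [2.4, 3.1, 4.3, 5.0, 6.0]

r2 = r_squared(observed, predicted)
print(round(r2, 3), round(adjusted_r_squared(r2, len(observed), 1), 3))
```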

       Those EPI Suite™ modules  which are not regression-based routines do not
conform to OECD's Principle 5. However, the EFM modules are mechanistically based
and are adequately described in the  Help files.

          iii.  Are some modules more accurate/better validated than others, and if
          so, which need more work?

       Most of the EPI Suite™ modules have been evaluated sufficiently to support
regulatory decision-making.   However, all modules would benefit by improved domain
mapping, which would  allow, amongst other things, the ability of the user to determine a
priori the suitability of a particular module to reliably estimate a given physical-chemical
property for a specific chemical.

       Of the EPI Suite™ modules that require additional validation/evaluation beyond
that already discussed in the response to the preceding question, the organic carbon
partition coefficient model, PCKOCWIN, is a priority because this module was developed
twenty years ago (1986) and has yet to be revised. Presently, KOC estimation routines use
molecular connectivity indices (MCIs) and correction factors based on structural features
of the chemical.  MCIs are generally not widely used or accepted by (Q)SAR developers
because MCI mechanistic information is difficult to interpret. Finally, the database of
KOC values used to develop the present version of the PCKOCWIN module is not as large
and inclusive as for other EPI Suite™ modules.

          iv.   To the extent that modules work together to generate estimates, do
          they do so correctly?

       EPI Suite™ modules work together to generate scientifically defensible estimates
of the physical-chemical properties of chemicals. However, the transfer of data between
modules requires further refinement. The Panel encourages the Agency to explicitly
describe the protocol (and hierarchy) that govern the passing of physical-chemical
property module output to the chemical fate and transport modules. For example, the
user may want to know whether a measured physical-chemical property value is used
preferentially over a chemical property module prediction in fate and transport modules
and the implications of either choice (e.g., the advantage of using presumably more
accurate measured data versus the advantage of using an internally consistent set of
physical-chemical properties when estimating chemical fate; see Beyer et al. 2002).
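
       A documented hierarchy of the kind requested above might look like the following
sketch. The priority order shown (user-entered, then measured, then estimated) is an
assumption made for illustration only; the report asks the Agency to describe the actual
protocol, and the names used here are hypothetical.

```python
from typing import NamedTuple, Optional


class PropertyValue(NamedTuple):
    """A property value together with a record of where it came from."""
    value: float
    source: str  # "user", "measured", or "estimated"


def select_property(user: Optional[float] = None,
                    measured: Optional[float] = None,
                    estimated: Optional[float] = None) -> PropertyValue:
    """Pick one value using an assumed priority: user > measured > estimated."""
    candidates = (("user", user), ("measured", measured), ("estimated", estimated))
    for source, value in candidates:
        if value is not None:
            return PropertyValue(value, source)
    raise ValueError("no value available for this property")


# Illustrative log Kow inputs: one chemical with a measured value and a
# prediction, and one with a prediction only.
print(select_property(measured=5.18, estimated=4.27))
print(select_property(estimated=4.76))
```

Returning the source alongside the value is what would let a fate module report which
input it actually used.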

       To improve transparency in describing module interaction, module inputs as well
as outputs should be provided as part of the EPI Suite™ results. Moreover, the Panel
strongly supports separating the physical-chemical property estimation modules from the
fate modules, such that the fate modules can be executed independently. With respect to
module default values for certain parameters (e.g., mass transfer coefficients, media
compartment volumes, deposition parameters), the Panel endorses greater user-
customization capabilities including the option for batch mode processing with user-
defined inputs.

       The Panel found that for some modules, inconsistent results can be obtained for a
homologous series of compounds where predictions rely on values for other PER
parameters in EPI Suite™. For example, the estimated BCF values for five compounds in
the n-alkane series, based on either experimental or predicted log Kow values are given
below.

       Table X: Octanol-Water Partition Coefficients and Estimated
       Bioconcentration Factors for Several n-Alkanes Derived from EPI Suite

       Compound        Log Kow*          Log Kow*        BCF
                       (Experimental)    (Predicted)
       n-octane        5.18              4.27            1944
       n-nonane        NA                4.76              93
       n-decane        5.01              5.25             144
       n-undecane      NA                5.74             528
       n-dodecane      6.10              6.23             314

        *Bolded values used by EPI Suite to predict BCF

       As seen above, the predicted log KOW values show a predictable pattern of
increasing hydrophobicity with increasing chain length. However, BCF values do not
show this pattern - the shortest chain compound (n-octane) with the lowest predicted log
KOW (and an experimental value intermediate between two other experimental values for
higher molecular weight alkanes) produces the highest predicted BCF. This pattern is not
undone by manually entering an experimental value - for example, entering a log KOW of
5.18 for n-nonane gives a predicted BCF (based on that value) of 194, still an order of
magnitude lower than the predicted BCF for n-octane, with an identical experimental log
Kow. It appears further work may be needed in development and use of correction factors
employed to estimate BCF in EPI Suite.
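
       The inconsistency described above can be screened for automatically. The sketch
below uses the predicted log Kow and BCF values from Table X and reports adjacent members
of the series where a value falls as chain length increases. The function name is
hypothetical, and a reported break is a prompt for review rather than proof of an error.

```python
def trend_breaks(series):
    """Return adjacent (shorter, longer) pairs where the value decreases."""
    breaks = []
    for (name_a, val_a), (name_b, val_b) in zip(series, series[1:]):
        if val_b < val_a:
            breaks.append((name_a, name_b))
    return breaks


# Values from Table X, ordered by increasing chain length.
predicted_log_kow = [
    ("n-octane", 4.27), ("n-nonane", 4.76), ("n-decane", 5.25),
    ("n-undecane", 5.74), ("n-dodecane", 6.23),
]
estimated_bcf = [
    ("n-octane", 1944), ("n-nonane", 93), ("n-decane", 144),
    ("n-undecane", 528), ("n-dodecane", 314),
]

print(trend_breaks(predicted_log_kow))  # no breaks: log Kow rises steadily
print(trend_breaks(estimated_bcf))      # octane/nonane and undecane/dodecane
```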

       The common option that allows the user to enter the CAS number of a chemical
to obtain the corresponding SMILES string is a convenient feature of all EPI Suite™
modules. However, it appears that a number of commercial substances that are not
unique structures (i.e., Unknown, Variable Composition and Biologicals - UVCB) are
included in the database as single representative structures.  There are two principal
concerns with this approach.  First, it is unclear from the user guide how representative
structures have been selected.  Second, it is uncertain if predictions derived from unique
structures can be reliably extrapolated  to characterize the  actual complex substance.  To
illustrate this concern, the representative structure for CAS number 68526-86-3
(Alcohols, C11-14-iso-, C13-rich) is shown below.

       [Structure diagram: representative structure for CAS number 68526-86-3]

       This isomeric alcohol mixture is reacted with phthalic anhydride to produce CAS
number 68515-47-9 (1,2-Benzenedicarboxylic acid, di-C11-14-branched alkyl esters,
C13-rich).

       [Structure diagram: representative structure for CAS number 68515-47-9]

       The representative structures selected for these two chemicals are inconsistent
since they reflect different alkyl chain branching. Moreover, such arbitrary differences in
selection of representative structures can yield misleading predictions for some key
endpoints (e.g., biodegradation).

       To address this concern, the user could first be alerted by EPI Suite™ to the fact
that the chemical under consideration is complex and may not have a unique structure
and that physical-chemical property predictions may be less certain than for a unique
chemical.
       C.     Estimation Methods and Alternates

          i. Are the estimation methods in the EPI Suite™ up-to-date and generally
          accepted by the scientific community for its various uses?

       The Panel concluded that the current estimation methods used in the
EPI Suite™ modules are generally accepted by the scientific community. However, the
methods are at risk of becoming outdated as data and practice advance, particularly with
regard to the data included in the module training sets. For this reason, the Panel
encourages the Agency to evaluate whether the incorporation of newer statistical
approaches (e.g., logistical modeling) would increase the accuracy of module prediction.
A detailed summary of the relevance and general acceptability of EPI Suite™ estimation
methods is provided in the following bullets.

   •   Up-to-date: The underlying data and statistical models are generally not up to
       date. The Agency should consider incorporation of new data sets and newer
       statistical analysis tools to optimize the accuracy of the modules. Linear
       regression may not always be the optimal statistical model for physical-chemical
       property estimation.

   •   Acceptance by the scientific community: Those in the scientific community who
       understand the role and accuracy limitations of screening models used in
       regulatory decision-making generally accept the EPI Suite™ module results for
       many classes of organic chemicals. The EPI Suite™ modules are also generally
       accepted among regulators. EPI Suite™ modules have been accepted by the
       OECD and are being tested for implementation in relation to high production
       volume (HPV) chemicals and the Globally Harmonized System (GHS) for
       classification and labeling of chemicals by OECD. At the request of the United
       Nations Sub-Committee of Experts on the GHS, the OECD is developing
       proposals for classification criteria and labeling of chemicals according to the
       health and environmental hazards they may present. A Task Force on
       Harmonization of Classification and Labeling has been established to coordinate
       the technical work carried out by the experts.  OECD typically assigns a
       reliability code of 2 (valid with restrictions) to EPI Suite™ estimates. Moreover,
       the extensive peer-reviewed documentation that supports the use of EPI Suite™
       (Q)SARs as well as the large number of evaluation (validation) studies published
       demonstrate that EPI Suite™ complies with EPA information quality guidelines
       (USEPA 2002).

   •   Use in assessments: Within the wider scientific community there is some
       confusion about whether EPI Suite™ module output is appropriate for full risk
       assessment or hazard assessment.  However, in general, those experts who
       understand that the EPI Suite™ modules are evaluative by design, serving as
       hypothesis generators and first tier predictions of a chemical's fate when the
       alternative is no data at all, support the predictive functionality that the
       modules provide.  More detailed information can be found in the APPENDIX
       for ICi.
         ii.  Are there other estimation methods that should be considered in
         upgrading EPI Suite™?

       Owing to the breadth of this charge question, the Panel's response was two-fold.
The first part of the Panel's response is focused on estimation methods that are applicable
primarily to new physical-chemical properties (i.e., those that are not currently available
within EPI Suite™).  The second part of the Panel's response describes the development
of methods/approaches that could be used to more effectively estimate properties that are
currently available in EPI Suite™.

       With respect to new additional physical-chemical properties, the Panel identified
the following as important for expanding the accuracy and scope of EPI Suite™ for
organic compounds:

   •   pKa
   •   Influence of pKa on other physical-chemical properties
   •   Temperature dependency of all physical-chemical properties
   •   KOA
   •   Bioaccumulation factors for root plants, leaf plants, fish and terrestrial organisms
       (e.g., meat and milk transfer factors)
   •   Diffusion coefficients in various environmental media
   •   Metabolism and production of stable chemical intermediates
   •   Neutral hydrolysis
   •   Activity coefficients
   •   Sub-cooled liquid vapor pressure and aqueous solubility
   •   Surface tension
   •   Anaerobic biodegradation potential
   •   Ozone depletion potential, greenhouse gas potential, and maximum incremental
       reactivity (MIR) for assessing ozone formation potential.

       With respect to EFMs for wastewater treatment, EPI Suite™ currently includes
predictions for only a default conventional activated sludge system. Future
enhancements should provide options for user-defined treatment systems (e.g., tank
dimensions, fine versus coarse bubble diffusers) as well as alternate treatment designs
(e.g., aerobic lagoons).

       Several panel members offered the following list of additional chemical properties
specifically related to the toxicity and fate of polymers that the Agency may consider
adding to EPI Suite™:

   •   Glass transition temperature
   •   Crystal melt transition temperature
   •   Elastic mechanical properties like bulk modulus
   •   Viscosity measures
   •   Heat capacity
   •   Cohesive energy
   •   Flammability
   •   Parameters (e.g., degradation rates) influencing environmental persistence

       With regard to improving the accuracy of predictions of those physical-chemical
properties currently available within EPI Suite™, the Panel identified the following new
approaches:

   •   The Agency should consider the use of poly-parameter linear free energy
       relationships (poly-parameter LFERs) and neural networks in module
       optimization as well as partial least squares and support vector machine
       methodologies in data fitting.

   •   In those cases where multiple modules exist that are capable of predicting the
       value of the same physical-chemical property, consensus modeling should be
       conducted. If all modules for estimating a given property for a particular
       chemical agree, there is a high level of confidence associated with the property
       estimation.  Conversely, if the module results vary widely, the reliability of the
       property prediction is uncertain.

   •   To the extent that the Agency can document data quality, the Agency should
       consider moving from two dimensional to three dimensional chemical structure
       based methods.
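
       The consensus modeling idea in the list above can be sketched as follows. The
method labels, the estimates and the agreement tolerance are hypothetical; choosing a
defensible tolerance for a given property would be a matter for the Agency.

```python
from statistics import mean


def consensus(estimates, tolerance):
    """Summarize agreement among several estimates of the same property.

    `estimates` maps a method label to its estimate. The estimates are
    treated as agreeing when their total spread is within `tolerance`.
    """
    if len(estimates) < 2:
        raise ValueError("consensus needs at least two estimates")
    values = list(estimates.values())
    spread = max(values) - min(values)
    return {"mean": mean(values), "spread": spread, "agree": spread <= tolerance}


# Hypothetical log water-solubility estimates from three methods.
close = {"method_a": 3.0, "method_b": 3.2, "method_c": 3.1}
wide = {"method_a": 1.0, "method_b": 3.2, "method_c": 4.5}

print(consensus(close, tolerance=0.5))
print(consensus(wide, tolerance=0.5))
```

Reporting the spread alongside the mean keeps the disagreement visible to the user
instead of hiding it behind a single averaged value.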

       Additional comments relating to this topic can be found in section 1-A-i above.
2. Functionality

       A.    How convenient is the software and does it have all the necessary
       features?

       Although the software is convenient to use, significant improvements should be
made to enhance the appearance, navigability and quality of technical support provided
by the EPI Suite™ software. The following bullets summarize the technical
recommendations.

   •   Currently, the individual property estimation and fate modules cannot be launched
       from the EPI Suite™ interface.  The Panel supports greater program flexibility
       that would allow software users the ability to launch individual modules directly
       from the user interface, with appropriate indication of options for entering data or
       utilizing values generated by EPI Suite™ to run the modules.

   •   To ensure that software users are cognizant of the quality assurance limitations
       associated with module output, individual modules should alert the user when a
       chemical's physical-chemical properties are outside the chemical training set
       domain.

   •   Although EPI Suite™ operates on a Windows™ platform, the graphical user
       interface (GUI) has an archaic DOS appearance. The Panel encourages the
       Agency to upgrade EPI Suite™'s GUI to reflect a more typical Windows™
       operating system environment.

•  To minimize the loss of data when new versions of EPI Suite™ are released, the
   Panel recommends that the new version installation program not delete chemical
   data input by the user but, rather, only overwrite older versions of EPI Suite™
   software itself.

•  To address the myriad of data reporting requirements, the Panel recommends that
   users have the option of saving output files in various formats (e.g., Word™,
   WordPerfect™, Excel™, etc.).

•  Providing greater flexibility for inputting data files in batch mode (e.g., a
   screen that allows EPI Suite™ users to simply "cut and paste" Chemical
   Abstracts Service (CAS) Registry Numbers or SMILES notations) would
   increase efficiency.

•  EPI Suite™ EFM module users would benefit from having access to a simple
   flow chart that clearly describes the data processing steps that result in generating
   environmental fate model output.

•  To enable users to access various data sets simultaneously, the EPI Suite™
   program should allow minimization of all  screens.

•  To reduce confusion when saving a chemical name run (via Save User), it would
   be helpful if the program used as a default the full chemical name (or a truncated
   version), rather than  the most recently saved name.

•  To improve program navigability, all parameters should be located in a single
   location rather than having some parameters placed in the "Functions -  Other"
   category.

•  The default option for displaying module results should be the full output results
   category rather than  simply furnishing the summary output results.

•  Use of color-coded text to distinguish experimental values from predicted values
   or to alert users of chemicals whose properties were outside those contained in the
   module's chemical data training set would help  to minimize misinterpretation of
   results.

•  When inputting a chemical based on SMILES notation alone, the chemical name
   should be displayed  in both the data entry  screen and in the output file.

•  In the AOPWIN module, EPI Suite™ should specify the environmental
   conditions that are associated with the default concentrations of hydroxyl radical
   and ozone and allow user input of alternative hydroxyl radical and ozone
   concentrations.

•  Clarify the units used in the EPI Suite™ module PCKOCWIN.

•  BIOWIN Help information should clearly state the conditions which pertain to
   this program's estimates (e.g., aqueous slurry) as well as decision rules for
   extension of BIOWIN results to other media (e.g., sediment, soil).

•  More details regarding the structure, function and parameterization of the
   WVOLWIN module should be provided in the Help files. For example, it is
   unclear what default values are being used for air and water temperature, water
   advective flow, depth of water, etc.

•  For the sewage treatment plant module (STPWIN), the Help files fail to
   provide the default plant operating conditions.  Water temperature, whether
   the plant has only secondary treatment or includes tertiary treatment as well, solid
   retention time for the activated sludge systems, etc. should be provided in the
   Help files.

•  Since AOPWIN and the Level 3 fugacity module output is sensitive to mass
   transfer rates as well as degradation/transformation rates, the default values (and
   their associated temperature dependency) should be provided in the Help files or
   in an appendix in the user guide.

•  Experimental data that may be available for a specific structure are not provided for
   some endpoints (e.g., BIOWIN, BCFWIN).

•  Entering air advection times in hours is not intuitive. Users should have the
   option of entering wind speed instead.

•  On the KOC tab, it is impossible to determine whether the module uses the KOW
   method, as KOW is not a calculated property in the results.

•  In EPI Suite™ module results, it would be preferable to list experimental values
   in the same order as predicted values are given (i.e., boiling point, melting point,
   and vapor pressure).

•  For the example of lindane, there seems to be a problem with experimental results
   for melting point and boiling point (i.e., values in wrong order).

    •   In the half-life selection module (LEVEL3NT), the user is not allowed to specify a
       model estimate or a selected value for air, which is an option for the other
       environmental media.

    •   For those physical-chemical properties for which two or more methods are
       currently available, EPI Suite™ should provide to the user the ability to select
       which module they would prefer to use (e.g., water solubility).

    •   The reference feature should be enhanced by allowing the user to easily access
       individual references (including a brief abstract) through addition of a simple pop-
       up window.

    •   For key references, EPI Suite™ should provide links to web pages where pdf
       versions of the documents can be accessed, if available.

    •   The Help files should contain the list of all references used in developing the
       predictive models.

    •   When modeled property estimates are passed on to other modules (e.g., fate and
       transport modules), the EPI Suite™ program should  identify to the user the values
       that are passed as well as provide clear documentation in the user guide of the
       protocol used to establish data transmission priority. This is especially important
       when there is more than one method available for estimating a particular property,
       e.g., Henry's law constant.

    •   Where the EPI Suite™ Help files explicitly indicate  that certain chemicals have
       been excluded in the database (e.g., CAS Number database), supporting
       explanation should be provided.

    •   Help files and other documentation should be regularly checked by the Agency
       for typographical errors.

    •   The Agency should consider adding a "comments" facility to the EPI Suite™ to
       enable receipt and  incorporation of feedback from users such as identification of
       errors and recommendations.
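
       The batch-entry recommendation in the list above could be paired with a basic
screen of pasted CAS numbers before any module is run. The sketch below applies the
standard CAS check-digit rule (each digit before the check digit is weighted by its
position counted from the right, and the sum modulo 10 must equal the check digit). The
function names are hypothetical, and the sketch handles CAS numbers only, not SMILES.

```python
import re

# CAS Registry Number layout: 2 to 7 digits, 2 digits, 1 check digit.
_CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas):
    """Return True if the string has CAS layout and a correct check digit."""
    match = _CAS_PATTERN.match(cas.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight digits 1, 2, 3, ... starting from the rightmost digit.
    total = sum(int(d) * weight
                for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == check


def split_batch(text):
    """Split pasted whitespace-separated tokens into valid and rejected lists."""
    valid, rejected = [], []
    for token in text.split():
        (valid if is_valid_cas(token) else rejected).append(token)
    return valid, rejected


# The two CAS numbers cited in this report, plus one mistyped entry.
pasted = "68526-86-3\n68515-47-9\n68526-86-4"
print(split_batch(pasted))
```

A check of this kind catches transcription errors only; it cannot confirm that a valid
number is paired with the intended SMILES string in the database.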
       B.     Are there places where EPI Suite™'s user guide (and other program
       documentation) does not clearly explain EPI's design and use? How can
       these be improved?

       The user guide should more clearly identify the modules which can be executed
independently and the features available for a particular module when executed alone.
The stand alone modules could be identified in a separate highlighted section.  Some
features are unavailable when executed as part of EPI Suite™ yet can be accessed in
stand alone operations. The "Experimental Value Adjusted" option in KOWWIN and
HENRYWIN is an example.  In addition, separate sections in the user guide should
incorporate increased discussion of training set domains and uncertainty in predictions, as
noted elsewhere in this report.

       The Panel agreed that the EPI Suite™ user guide provided a clear and succinct
description of the design and use of the software. However, the Panel noted that the
documentation quality was uneven with many sections supported by detailed references
while others were noticeably devoid of such support.  Moreover, the Panel was
unanimous in its recommendation that the EPI Suite™ software should allow users the
ability to easily download and print a copy of the user manual as a stand-alone document.

       With respect to general improvements for the user guide, the Panel recommends
that the Agency develop a separate detailed guide for activities or functions common
among the various modules (e.g., how to import chemicals through the SMILES notation,
function keys and buttons, use of results and structure windows, etc.) as well as a quick
start guide for experienced users.  Finally, the guide should clearly describe those
modules that predict chemical properties based on the output from other modules (e.g.,
use of Kow output to predict bioconcentration factors through BCFWIN).
       C.    Are there aspects of the user interface (i.e., the initial, structure/data
       entry screen; and the results screens) that need to be corrected, redesigned,
       or otherwise improved? Do the results screens display all the desired
       information?

       The Panel applauds the multi-faceted functionality of the EPI Suite™ user
interface. However, the Panel is of the unanimous opinion that EPI Suite™ does not take
full advantage of the opportunities provided by a Windows™ environment.  Moreover,
while there are many positive features associated with the EPI Suite™ user interface
including its documentation and HELP file availability, there are also opportunities for
substantial improvement. Recommendations for improving the overall functionality of
the user interface could include the following:

   •   The format for module output should be user defined and include the following
       Windows™-based display options: Excel™, WordPerfect™ and/or Word™ file.

   •   When multiple measured values are available within a module, the user should
       have the option to select which measured value is applied in the calculations.

   •   Under the fugacity tab, the input screen should identify the source of module
       input(s) as well as what algorithms are being executed.

   •   Because a user can enter data through either a SMILES string or the chemical
       name, the screen could more clearly indicate that both options are possible.

   •   The "previous" button has limited functionality and does not seem to work in all
       scenarios. The "previous" option should allow the user to return to a chemical
       when evaluating multiple chemicals and ideally recall more than simply the most
       recent chemical evaluated.
       D.     Currently one enters EPI Suite™ using SMILES and CAS; are there
       other ways to describe the structure (e.g., ability to input a structure by
       drawing it), that should be added?

       The SMILES structure and Chemical Abstracts Service (CAS) registry number
input options are adequate to describe and query chemical structures.  However, the
addition of an input drawing program would  extend the utility of the EPI Suite™ to users
who are unfamiliar with the SMILES notation. Alternatively, the user could be directed
towards commercial packages to assist in the derivation of SMILES structures.

       When CAS registry numbers are unavailable, users typically prefer to draw their
structures rather than use a string language input.  Moreover, the use of SMILES may
limit the users of EPI Suite™ to those with basic knowledge of organic chemistry.

       Inclusion of a two-dimensional [2D] structure drawing program in addition to
SMILES will be valuable to users with limited knowledge of organic chemistry.  It is
also useful to highlight to current users that a structure drawn in commercial software
packages (e.g., Cambridge Soft's Chemdraw™) can be copied and pasted directly into
EPI Suite™.

       The Panel does not recommend that the Agency attempt to develop its own
structure drawing program but, rather, purchase or license one of the many commercially
available software packages. There are several computer-based chemical drawing
packages that generate SMILES or other 2D [and 3D] structure tables.  The Panel noted
the following observations that support utilizing commercially available software
packages:

   •   Most commercially available software packages are  generally accepted by the
       scientific user community.

   •   Programs like those offered by Elsevier's MDL and Chemdraw™ have options to
       execute batch mode operations, to read and write structure files, and to
       interface with other commercial software.

   •   The MDL software package has a module that effectively models chemical
       properties of linear polymers.

   •   Both the MDL (1) and Chemdraw™ (2) software packages can effectively draw
       isomeric structures and have the ability to interface to three dimensional [3-D]
       chemical structure generating programs.
       E.    EPI Suite™ has many convenience features, such as the ability to
       accept batch mode entry of chemical structures, and automatic display of
       measured values for some (but not all) properties.  Are there other features
       that could enhance convenience and overall utility for users?

       While there are a number of features in EPI Suite™ that increase the convenience
of the program (including multiple modes of identifying a chemical of interest in the
input, and allowing for user-specified input parameters), the Panel has recognized that the
program interface should balance convenience with transparency including
characterization of the uncertainty associated with model output. In other words, while
the Panel is cognizant of the importance of usability in executing the EPI Suite™
programs, convenience should not come at the expense of providing users with a better
sense of how estimated values are derived.

       The following bullets summarize the specific Panel recommendations with
respect to additional features to enhance software convenience.

    •  To encourage examination of the sensitivity of user input on module output, the
       Panel recommends that the user have the ability to execute the fate modules
       separately from the  physical-chemical parameter prediction models (with the
       caveat noted previously in Section 2A).

    •  The CAS number database should  be validated in the  current version (in
       particular, discrepancies between CAS numbers and SMILES notation), and
       regularly updated with new information.

       There is a discrepancy between the number of chemicals for which SMILES
       notation exists and the number of chemicals in the TSCA inventory.  It would be
       valuable to document within the SMILES HELP files the reason for the difference
       in chemical coverage.   The documentation on SMILES refers to approximately
       20,000 discrete organic chemicals  in the original TSCA inventory that are in the
       SRC database, while the June 2005 U.S. Government Accountability Office
       (GAO) report references 62,000 organic chemicals in the original inventory
       (GAO, 2005).

    •  For the batch mode  entry  feature, the system should allow user-specified inputs of
       physical-chemical properties.

    •  Rather than having the module output written to the directory containing the
       program for the batch mode entry feature, it would be preferable to give the user
       the option of naming the output file and identifying the location where it will be
       saved.

    •  The output data in batch mode should include CAS numbers for each chemical as
       well as the names and  SMILES notation.

    •  The Name Lookup feature should be added to each individual module.

    •  For chemicals that have isomers, the module output should explicitly state that
       fact. The output display should include identification of the other isomers that
       exist, by name and CAS number.

    •  For results displayed in summary format, measured values should be given for
       several isomers of the chemical assessed, if available.

    •  For both the summary and full output options, repeating the listing of
       experimental values in the results screen for a given parameter is confusing and
       should be avoided (e.g., experimental aqueous solubility in both fragment
       approach and log KOW approach).
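
       As one illustration of the kind of automated check that could support validation
of the CAS number database recommended above, the sketch below verifies the check digit
that ends every CAS Registry Number. The function name is hypothetical and is not part
of EPI Suite™. The check detects mistyped numbers only; it does not confirm that a CAS
number corresponds to its stored SMILES notation.

```python
import re


def cas_check_digit_ok(cas: str) -> bool:
    """Return True if a CAS Registry Number is well formed and its check digit is valid."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit of the body.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check


if __name__ == "__main__":
    for number in ("7732-18-5", "71-43-2", "7732-18-4"):
        print(number, cas_check_digit_ok(number))
```

       A check of this kind could be run over the whole database before the more
demanding comparison of CAS numbers against SMILES notation is attempted.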
       F.     Are property estimates expressed in correct/appropriate units?

       In general, the Panel found no specific concerns regarding the units used to
express the property estimates. However, the Panel has made the following
recommendations that should improve the overall utility of module output.

   •   Output data should be presented in International System of Units (SI) units.

   •   There should  be consistency with the use of significant figures.

   •   For BCFWIN, the units are L/kg (wet weight) for fish and should be included in
       the output.

   •   Units should be specified for log Koc.

   •   The Agency should provide a unit conversion program to allow the user the
       option to convert from one set of output units to another.
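
       A unit conversion utility of the kind recommended above could be quite small.
The following sketch is an illustration only and is not part of EPI Suite™: the function
names and the choice of units are assumptions, and the factors are the standard
definitions of the atmosphere and the conventional millimeter of mercury.

```python
# Pascals per unit of pressure (standard defined values).
PA_PER_UNIT = {
    "Pa": 1.0,
    "mmHg": 133.322387415,
    "atm": 101325.0,
}


def convert_pressure(value, from_unit, to_unit):
    """Convert a pressure (e.g., a vapor pressure) between the supported units."""
    return value * PA_PER_UNIT[from_unit] / PA_PER_UNIT[to_unit]


def henry_atm_to_pa(h_atm_m3_per_mol):
    """Convert a Henry's Law constant from atm-m3/mol to the SI form Pa-m3/mol."""
    return h_atm_m3_per_mol * PA_PER_UNIT["atm"]


if __name__ == "__main__":
    print(convert_pressure(1.0, "mmHg", "Pa"))
    print(henry_atm_to_pa(1.0e-5))
```

       Keeping every factor relative to a single SI unit means that a new unit needs
only one table entry rather than a conversion to every other unit.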
       G.     Is adequate information on accuracy/validation conveyed to the user
       by the program documentation and/or the program itself?

       In general, the Panel found that the information on module accuracy and
validation was conveyed adequately to the user, but not in a consistent and transparent
manner.  For the sake of clarity, the Panel has addressed accuracy/validation issues
pertaining to (Q)SARs (i.e., algorithms) and the actual property estimation outputs
separately.
          i)  Is adequate information on accuracy/validation conveyed with respect
          to the QSAR predictive model itself?

       The Panel found that, while regression statistics are provided in assessing model
performance and residual error, for most modules, this information is not generally
transparent to the user.  Moreover, when such information is available, it is uncertain as
to which version of the  module the reported analysis applies.  It should be straightforward
to determine a common set of statistical significance measures valid across all modules
that would provide common and comparative measures of accuracy and validity.  A
suggested set of module performance metrics for consideration includes:

V.x.y   =    model version
N(T)    =    number of compounds in the training set
N(O)    =    number of outliers removed in developing the module
R2      =    standard coefficient of determination
Q2      =    leave-one-out cross-validation coefficient
SD      =    standard deviation of fit
R(l)    =    lower value of the range of the property in the training set
R(u)    =    upper value of the range of the property in the training set
MRE     =    mean residual error
SRE     =    standard deviation of residual error
RX      =    average correlation coefficient for models built from X random values of
             the dependent variables contained in the training set
R(t)x2  =    correlation coefficient of an external validation set of X compounds
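
       To suggest how several of these metrics might be reported together, the sketch
below computes them for a hypothetical one-descriptor linear model. It is not the
procedure used to develop any EPI Suite™ module. The function names are illustrative,
Q2 is taken here as 1 - PRESS/SS(total) from leave-one-out refitting, and SD is taken as
the residual standard deviation with n - 2 degrees of freedom; both are assumptions.

```python
import math
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(observed, predicted):
    """1 - SS(residual) / SS(total), with SS(total) about the observed mean."""
    mo = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mo) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(x, y):
    """Leave-one-out Q2: refit without each compound, then predict it."""
    x, y = list(x), list(y)
    predictions = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(slope * x[i] + intercept)
    return r_squared(y, predictions)


def summarize(x, y, version="x.y"):
    """Collect a subset of the suggested performance metrics in one record."""
    slope, intercept = fit_line(x, y)
    fitted = [slope * xi + intercept for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    ss_res = sum(r ** 2 for r in residuals)
    return {
        "V": version,
        "N(T)": len(x),
        "R2": r_squared(y, fitted),
        "Q2": q_squared_loo(x, y),
        "SD": math.sqrt(ss_res / (len(x) - 2)),
        "R(l)": min(y),
        "R(u)": max(y),
        "MRE": mean(residuals),
        "SRE": stdev(residuals),
    }


if __name__ == "__main__":
    # Hypothetical descriptor values and measured property values.
    print(summarize([1, 2, 3, 4, 5, 6], [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]))
```

       Reporting the same record for every module, tagged with the model version,
would address the concern that users cannot tell which version a published statistic
describes.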

       In addition, for each of the endpoints predicted  by QSAR, a brief discussion on
the measured error associated with current test protocols would be valuable.  Any insights
regarding trends in measurement error (e.g., measurement error of Log KOW and water
solubility tends  to increase with increasing Log KOW or decreasing  water solubility,
respectively) should be summarized.
         ii)  Is adequate information on accuracy/validation conveyed with respect
         to making the property estimation of a particular test chemical?

       The Panel concluded that a major shortcoming of EPI Suite™ is that the user is
given no indication as to whether the domain of module applicability is appropriate for
the test chemical. Currently, the decision to use a module for a specific chemical appears
to be based on past experience and/or professional judgment. This approach is not
transparent and could lead to inconsistency and error in assessments among chemicals.
How the Agency uses (or does not use) or  interprets EPI Suite™ results in making
decisions is an important consideration in determining if the software provides the degree
of accuracy that supports its intended use.

       Upgrades to EPI Suite™ provide the Agency with an opportunity to better
understand the accuracy of the software's estimates.  For example,  the Agency could
compare earlier estimates to any new measured values that have since been published for
the chemical of interest. After an EPI Suite™ PER or EFM has been upgraded, the new
estimate can be compared with the earlier estimate for the same chemical. Assuming the
more recent upgrade is producing more accurate estimates in general, results from the
new predictions can indicate the degree of over- or under-prediction for the parameter of
concern in the original assessment.  Comparing either new experimental or estimated data
to decision-making criteria can be used to assess the performance of EPI Suite™ in
supporting regulatory decision-making.

       The Panel also endorses an independent "model domain" analysis to  improve
accuracy and reliability estimates of chemical property values.  In this analysis, the
degree of molecular similarity of the test chemical to that of the chemicals used in the
module training set establishes the reliability of a property estimate. The Panel did not
have the necessary expertise to provide specific advice on preferred domain  analysis
methods but encourages the Agency to seek experts for technical guidance so that this
functionality can be included  in future  EPI Suite™ upgrades.
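
       The Panel offers no advice on a preferred method. Purely to illustrate the idea
of comparing a test chemical with a module's training set, the sketch below flags
structural fragments that no training chemical contains and reports the Tanimoto
similarity to the nearest training chemical. The fragment labels, the function names
and the 0.5 threshold are all hypothetical, and a real analysis would derive fragments
from the chemical structure rather than from hand-entered sets.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural fragments."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def domain_check(test_fragments, training_set, threshold=0.5):
    """Compare a test chemical's fragments with those of the training chemicals.

    training_set maps a chemical name to its set of fragments (must be non-empty).
    """
    seen = set().union(*training_set.values())
    unknown = test_fragments - seen  # fragments the module was never trained on
    nearest, similarity = max(
        ((name, tanimoto(test_fragments, frags))
         for name, frags in training_set.items()),
        key=lambda pair: pair[1],
    )
    return {
        "unknown_fragments": unknown,
        "nearest_training_chemical": nearest,
        "similarity": similarity,
        "in_domain": not unknown and similarity >= threshold,
    }


if __name__ == "__main__":
    # Hypothetical two-chemical training set.
    training = {
        "toluene": {"aromatic C", "CH3"},
        "ethanol": {"CH3", "CH2", "OH"},
    }
    print(domain_check({"aromatic C", "CH3", "OH"}, training))
    print(domain_check({"Sn", "CH3"}, training))
```

       Output of this kind, displayed alongside a property estimate, would give the
user at least a first indication of whether the module is being applied outside the
chemical classes on which it was built.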
3. Appropriate Use

       A.     Currently Identified Uses

          i.   Is the science incorporated into EPI Suite™ adequate for each of
          these current uses?

       All of the modules in EPI Suite™ are generally accepted for use in risk-based
priority setting, screening level risk assessment and prioritization for chemical testing, for
the chemical classes to which the modules apply. Given the large number of chemicals
that the Agency must screen in a short period of time, reliance on (Q)SAR module output
is justified. The modules are expected to provide order of magnitude estimates, an
accuracy standard that is generally acceptable by the Agency for screening level
assessments.  This level of accuracy should be clearly conveyed to users outside the
Agency.
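
       One way to make the order of magnitude standard concrete is to express the
difference between an estimated and a measured value on a log10 scale. The sketch below
assumes that "order of magnitude" means agreement within a factor of ten; that
interpretation, the function names and the example values are assumptions rather than
Agency criteria.

```python
import math


def log10_error(estimated, measured):
    """Signed error in log10 units; positive values indicate over-prediction."""
    return math.log10(estimated / measured)


def within_order_of_magnitude(estimated, measured):
    """True if the estimate is within a factor of ten of the measured value."""
    return abs(log10_error(estimated, measured)) <= 1.0


if __name__ == "__main__":
    # Hypothetical estimated and measured values in the same units.
    print(within_order_of_magnitude(50.0, 10.0))
    print(within_order_of_magnitude(2000.0, 10.0))
```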

       The Agency should continue to validate, update, and investigate the uncertainty
associated with the modules in  various regulatory programs. A  more extensive analysis
and explanation of the limitations of the PERs  and EFMs would help clarify appropriate
use.

          ii.   If not, what improvements are needed to make EPI Suite™ adequate
          and what alternative approach could be used in the interim?

       There are specific uses of EPI  Suite™ that are not entirely appropriate for
supporting the PMN and pollution prevention (P2) programs. At present, the chemical
domains that are  used by (Q)SARs do not provide adequate coverage of nanoparticles,
inorganic compounds, organo-metallic and some polymeric chemicals (as well as other
classes of chemicals). Application of (Q)SARs to chemicals outside the domain of the
training set is likely to result in unreliable estimates. The Panel recommends that the
Agency collect more peer-reviewed measurement data on the physical and chemical
properties for these chemicals with the intent of either expanding the domain of the
existing (Q)SARs or for creating new (Q)SARs specifically for these classes of
chemicals.
       B.      Potential Additional Uses
       Given the Agency's global leadership in the field of chemical screening, it
should come as no surprise that emerging industrial economies in Asia, South America,
Eastern Europe, and Africa are adopting EPI Suite™ in their regulatory programs as
well.

       EPI Suite™, if translated into major foreign languages (e.g., Arabic, Spanish,
Portuguese, French, Russian, Standard Chinese and Mandarin, Bahasa Indonesia, and
Hindi), is a practical and scientifically-credible risk management technology transfer that
will allow countries with emerging industries to establish sustainable chemicals
management systems. The United Nations (UN) Strategic Approach to International
Chemicals Management (SAICM) project represents an ideal forum in which the benefits
of EPI Suite™ application can be shared with the international regulatory community.

       In addition to the direct uses of EPI Suite™ by the Agency, the following
additional potential uses have been identified.

   •   EPA and other Federal Agencies:

             EPI Suite™ is clearly seen as an important tool in any regulatory program
             that evaluates chemicals for public health and environmental safety.
             Agency programs that benefit from EPI Suite™ include: a) EPA Office of
             Pesticide Programs (OPP), b) EPA Office of Water (OW), c) EPA Office
             of Solid Waste and Emergency Response (OSWER), and d) the US Food
             and Drug Administration.

   •   Private Industry:

             Industrial applications where  EPI Suite™ software can be valuable
             include the development of more environmentally friendly products or
             "green" engineering processes.

             EPI Suite™ can be used to support the issuance of chemical exposure-
             based waivers that reduce the use of animal testing under programs such
             as TSCA and HPV Challenges world-wide.

             EPI Suite™ output can inform and guide environmental exposure
             monitoring programs.

   •   International Regulatory and other Programs:

             EPI Suite™ output can be used to support hazard classification when
             experimental data are not available.

             EPI Suite™ output can be used as part of the process to conduct Persistent,
             Bioaccumulative and Toxic (PBT) identification/categorization.

             EPI Suite™ output can be used to support chemical assessment and
             management programs, especially for High Production Volume (HPV)
             chemicals.

             EPI Suite™ output can be used to support global initiatives such as the
             Stockholm Convention to control the long-range transport of Persistent
             Organic Pollutants (POPs) or other assessments of the potential for
             long-range transport of chemicals and other greenhouse gas assessments.

             EPI Suite™ may play a significant role in the OECD (Q)SAR ToolBox.

-------
       REFERENCES FOR BOTH REPORT AND APPENDICES

Altschuh, J., R. Bruggemann, H. Santl, G. Eichinger and O.G. Piringer. Henry's law
constants for a diverse set of organic chemicals: experimental determination and
comparison of estimation methods.  Chemosphere. Volume 39, Number 11, pp. 1871-1887,
November 1999.

Arnot, Jon A. and Frank A.P.C. Gobas. A Food Web Bioaccumulation Model for
Organic Chemicals in Aquatic Ecosystems. Environmental Toxicology and Chemistry:
Vol. 23, No. 10, pp. 2343-2355, 2004.

Beyer A, Wania F, Gouin T, Mackay D, Matthies M. 2002. Selecting internally
consistent physicochemical properties of organic compounds. Environ. Toxicol. Chem.
21:941-953.

Cambridge Soft, Inc. 100 Cambridge Park Drive, Cambridge, MA 02140 USA
info@cambridgesoft.com; TEL 1 (617) 588-9100; FAX 1 (617) 588-9190

Cowan, C E, D Mackay, TCJ  Feijtel, D van de Meent, A Di Guardo, J Davies, and N
Mackay. 1995. "The multi-media fate model: a vital tool for predicting the fate of
chemicals". SETAC Press,  Pensacola, FL. USA.

DiToro, DM. 2005. "Sediment flux modeling", John Wiley, NY, USA.

Elsevier MDL, 2440 Camino Ramon, Suite 300; San Ramon, Ca 94583;
http://www.mdl.com/

Frame, G.M., Cochran, J.W., and Boewadt, S.S. 1996a. "Complete PCB congener
distributions for 17 Aroclor mixtures determined by 3 HRGC systems optimized for
comprehensive, quantitative, congener-specific analysis," HRC-J. High-Resolut.
Chromatogr., 19(12):657-668.

Frame, G., Wagner,  R., Carnahan, J., Brown, J., May, R., Smullen, L., and Bedard, D.
1996b. "Comprehensive, quantitative, congener-specific analyses of eight Aroclors and
complete PCB congener assignments on DB-1 capillary GC columns," Chemosphere,
33(4):603-623.

Goss,  K.-U.;  Buschmann, J.; Schwarzenbach, R. P. Determination of the Surface
Sorption Properties of Talc, Different Salts, and Clay Minerals at Various Relative
Humidities Using Adsorption  Data of a Diverse set of Organic Vapors. Environ Toxicol.
Chem  2003,  22, 2667-2672.

Government Accountability Office (GAO), 2005,  Chemical Regulation: Options Exist to
Improve EPA's Ability to Assess Health Risks and Manage Its Chemical Review
Program, GAO-05-458, June 2005.

Green, Nicholas and Ake Bergman, Chemical Reactivity as a Tool for Estimating
Persistence:  A proposed experimental approach for measuring this key environmental
factor, Environmental Toxicology and Chemistry: Vol. 39, Iss. 23, pp 480A-486A, 2005.

Hilal, et al. QSAR Comb. Sci. 2003, 22, pp. 565-573.

Hilal, et al. QSAR Comb. Sci. 2003, 23, pp. 709-720.

Hilal, personal communication, 2005; Long Chained Aliphatic Alcohols SIAR, 2006.

Leo, A.J. 1992. 30 years of calculating Log Poct. QSAR Meeting, Duluth, MN, July 23,
1992.

MacLeod M, Fraser AJ, Mackay D. 2002. Evaluating and expressing the propagation of
uncertainty in chemical fate and bioaccumulation models. Environ. Toxicol. Chem.
21:700-709.

Meylan, WM; Howard, PH.  Bond Contribution Method for Estimating Henry's Law
Constants. Environmental Toxicology and Chemistry ETOCDK, Vol. 10, No. 10, p
1283-1293, October 1991.

Meylan, W.M., Howard, P.H., Boethling, R.S., Aronson, D., Printup, H., Gouchie, S.,
1999, Improved Method for Estimating Bioconcentration/Bioaccumulation Factor from
Octanol-Water Partition Coefficient, Environ Toxicol Chem, 18(4):664-672.

Nguyen, T. H.; Goss, K.-U.; Ball, W. P. Polyparameter linear free energy relationships
for estimating the equilibrium partition of organic compounds between water and the
natural organic matter in soils and sediments. Environ Sci  Technol 2005, 39, 913-924.

Nikolova-Jeliazkova, N. and Jaworska, J. (2005). An Approach to Determining
Applicability Domains for QSAR Group Contribution Models: An Analysis of SRC
KOWWIN. ATLA 33, 461-470.

OECD, Principles for the Validation, for Regulatory Purposes, of (Quantitative)
Structure-Activity Relationship Models, November 2004.
http://www.oecd.org/document/23/0,2340,en_2649_34365_33957015_1_1_1_1,00.html

Peijnenburg, Pure & Appl. Chem. 1994, Vol. 66, No 9, 1931-1941

Schenker U, MacLeod M, Scheringer M, Hungerbuhler K. 2005. Improving Data Quality
for Environmental Fate Models: A Least-Squares Adjustment Procedure for Harmonizing
Physicochemical Properties of Organic Compounds. Env. Sci. Technol. 39:8434-8441.

Staikova M,  Wania F, Donaldson DJ. 2004. Molecular polarizability as a single-
parameter predictor of vapour pressures and  octanol-air partition coefficients of non-polar
compounds:  a priori approach and results. Atmos. Environ.  38:213-225.

Thibodeaux, LJ. 1996. "Environmental Chemodynamics". John Wiley, NY, USA.

Trapp, S and M Matthies. 1998. "Chemodynamics and environmental modeling-an
introduction". Springer-Verlag, Berlin, Germany.

USEPA. 2002. Information Quality Guidelines. Office of Environmental Information.
(EPA/260R-02-008) Washington DC.

USEPA. 2003. Draft Guidance on the Development, Evaluation, and Application of
Regulatory Environmental Models. The Council for Regulatory Environmental
Modeling. November 2003.

-------
                                GLOSSARY
AOPWIN          Atmospheric Oxidation Estimation Program
BAF             Bioaccumulation factor
BCF             Bioconcentration factor
BCFWIN          Bioconcentration factor estimation program
BIOWIN          Biodegradation factor estimation program
CAS Number      Chemical Abstract Services Registry Number
Chemdraw™       Chemical Drawing Program - CambridgeSoft Corporation
DERMWIN         Dermal Permeability Coefficient Program
DOS             Disk operating system
DSL             Domestic Substances List - Environment Canada
ECOSAR          Ecological Structure Activity Relationship Program
EFM             Environmental Fate Models
EPI             Estimation Program Interface
GAO             Government Accountability Office
GHS             Globally Harmonized System for Classification of Chemicals
GUI             Graphic User Interface
Hc              Henry's Law Constant
HENRYWIN        Henry's Law Constant Estimation Program
HPV             High Production Volume Chemicals
HYDROWIN        Hydrolysis Factor Estimation Program
IUPAC           International Union of Pure and Applied Chemistry
KAW             Air-water Partitioning Coefficient
KOA             Octanol-Air Partitioning Coefficient
KOC             Organic Carbon Partitioning Coefficient
KOW             Octanol-Water Partitioning Coefficient
KOWWIN          Octanol-Water Partitioning Coefficient Estimation Program
LEVEL3NT        Level 3 Fugacity Estimation Program
MCI             Molecular Connectivity Indices
MDL             Elsevier Molecular Design Limited (MDL) Information Systems
MTC             Mass Transfer Coefficient
MPBPWIN         Melting Point-Boiling Point Chemical Estimation Program
NAPL            Non-aqueous Phase liquid
OECD            Organization of Economic Cooperation and Development
OPP             Office of Pesticide Programs
OPPT            Office of Pollution Prevention and Toxics
OPPTS           Office of Prevention, Pesticides and Toxic Substances
OSWER           Office of Solid Waste and Emergency Response
PCKOCWIN        Organic Carbon Partitioning Coefficient Estimation Program
PER             Property Estimation Routine
pKa             Negative Log of a Chemical's Dissociation Constant
PMN             Premanufacture Notice
POP             Persistent Organic Pollutants
PP-LFER         Polyparameter Linear Free Energy Relationships
QSAR            Quantitative Structure Activity Relationship
QSPR            Quantitative Structure Property Relationship
REACH           European Union's Registration, Evaluation and Authorisation
                of Chemicals Policy
SAICM           United Nations (UN) Strategic Approach to International
                Chemicals Management
SPARC           SPARC Performs Automated Reasoning in Chemistry -
                http://ibmlc2.chem.uga.edu/sparc/
SMILES          Simplified Molecular Input Line Entry System
SPC             Structure Property Correlation
STPWIN          Sewage Treatment Plant Chemical Fate Estimation Program
TSCA            Toxic Substances Control Act
UVCB            Unknown or Variable Composition, Complex Reaction
                Products and Biological Materials [per HS]
WATERNT         Organic Compound Water Solubility Program
WSKOWIN         Water Solubility Estimation Program
WVOLWIN         Volatilization Rate from Water Estimation Program
VOC             Volatile Organic Compound

-------
                 Summary Assessment of EPI Suite™ Core Models

Model        Assessment

AOPWIN       Atmospheric oxidation/ozone reaction rates are predicted by AOPWIN using the
             Atkinson fragment and functional approach method. It is the generally accepted
             approach for estimating these properties. It has been validated on a relatively
             small dataset of 77-79 chemicals. EPA should consider more validations for this
             method. R2 = 0.93.

BCFWIN       BCFWIN is generally accepted as the best fit to existing bioconcentration data.
             BCFWIN does not appear to have been externally validated, or the information is
             not available in the user guides. If these models have been externally validated
             in the literature by various investigators, EPA should include this data in the
             user's manuals. No R2.

BIOWIN       The (Q)SPR estimation of biodegradation has inherent problems, one of which is
             the lack of reproducibility of measured biodegradation data. The BIOWIN model is
             reasonably well accepted and generally performs as well as or better than the
             available models. EPA should summarize all available validation data for BIOWIN
             in the user's manual so that this information is readily available. Also, EPA
             should consider giving more advice on which of the 3 BIOWIN model approaches is
             most appropriate in a given situation. R2 = 0.5-0.97.

HYDROWIN     Hydrolysis rates for a specific set of functional groups are predicted by
             HYDROWIN and are a generally accepted approach. HYDROWIN does not appear to have
             been externally validated, or the information is not available in the user
             guides. If these models have been externally validated in the literature by
             various investigators, EPA should include this data in the user's manuals.
             No R2.

KOWWIN       The KOWWIN model is well accepted, uses an accepted fragment-based technique and
             is an important (Q)SPR for regulatory use. It generally performs better than
             most existing (Q)SPR Kow prediction methods. The external validation data for
             this method is good and the summary information is available to the user.
             R2 = 0.94.

MPBPVP       The MPBPVP (Q)SPR is accepted as a good estimator of BP, MP and VP. The melting
             point (Q)SPR is the weakest of this group because the external validation
             coefficient of determination was reported as 0.66. The standard deviation of
             63 K is also indicative of some prediction error. It is not likely that a
             significantly more accurate melting point determination is necessary for EPA
             regulatory programs, and this method should be satisfactory for most regulatory
             uses. R2 = 0.92-0.95.

HENRYWIN     Uses two different methods and produces two different estimates (bond and group
             contribution) for the air-to-water partition coefficient. The models are
             generally accepted, with R2 = 0.94-0.96.

PCKOCWIN     PCKOCWIN is accepted as a good estimation tool for soil sorption coefficients
             (Koc) based on the first order molecular connectivity index (MCI). It is
             satisfactory for most regulatory uses. R2 = 0.86-0.96.

WATERNT      WATERNT uses the atom fragment contribution (AFC) method to predict water
             solubility, building upon the KOWWIN methods. Water solubility of organic
             compounds at 25°C is predicted. R2 = 0.87-0.98.

WSKOWIN      WSKOWIN is a good model for prediction of water solubility. It has been
             validated with a large dataset with a high coefficient of determination,
             R2 = 0.9.

WVOLWIN      Estimates volatilization half-lives from a model river and lake. The program's
             default parameters for a model river will yield a half-life that is indicative
             of the fastest volatilization that may be expected in environmental waters (a
             shallow, rapidly moving river with strong surface wind). The default parameters
             for the lake yield a much slower rate. The EPI interface program executes the
             WVOLNT (Volatilization Rate from Water) program by transferring the Molecular
             Weight, the Henry's Law Constant, and various volatilization parameters to
             WVOLNT. No R2.

LEVEL3NT     Half-lives are required for air, soil, sediment and water; the fugacity model
             cannot run
             without them. If the half-lives in air, water, soil and sediment are known, the
             "Use Half-Lives Entered Below" option should be selected and the known values
             should be entered in the appropriate fields. Often, however, these data are not
             available and require estimation. The BIOWIN and AOPWIN programs are used to
             make these estimates. The AOPWIN air estimate is based upon estimated hydroxyl
             radical and ozone rate constants. AOPWIN does have an experimental database
             containing more than 700 compounds. If an entered structure has a database
             match, the database value is used instead of the program estimate. The half-life
             for degradation of a chemical in water, soil, and sediment is determined using
             the ultimate biodegradation expert survey model of the BIOWIN estimation
             program. This estimation program provides an indication of a chemical's
             environmental biodegradation rate in relative terms such as hours, hours to
             days, days, days to weeks, and so on; the terms represent the approximate amount
             of time needed for degradation to be "complete". This output cannot be used
             directly by the Level III multimedia mass balance model. The mean value within
             the estimated time range returned by Biowin3 is converted to a half-life using a
             set of conversion factors. These conversion factors consider that 6 half-lives
             constitute "complete" degradation of a chemical substance, assuming first-order
             kinetics. The resulting conversion factors for water are provided below. The
             Fugacity Model cannot run without a vapor pressure. If the vapor pressure is not
             user-entered, the model uses the vapor pressure estimate by the MPBPWIN Program.
             If the MPBPWIN Program estimates a vapor pressure of zero (which can occur if an
             estimate is less than 1.00e-40 mm Hg), the fugacity model uses an assumed value
             of 1.00e-15 mm Hg (this value is low enough to have no sensitivity effect in the
             fugacity estimates). The model also requires a log Kow value. If the log Kow is
             not user-entered, the model uses the value from the KOWWIN Program (an
             experimental database value is used if available instead of the estimate). The
             Fugacity model in EPIWIN has limited user-access to many parameters in the
             Mackay Level III Model. For example, parameters such as rain rate, aerosol
             deposition, soil water runoff, and diffusion mass transfer coefficients cannot
             be changed by the EPIWIN user. For these parameters, EPIWIN relies solely upon
             the default values as determined by Mackay and co-workers. This greatly
             simplifies application of a Level III model for most users. No R2.

STPWIN       The STPWIN program is a version of the Toronto Model originally developed by
             Donald Mackay and colleagues at the University of Toronto. Includes outputs on:
             Bio P: the biodegradation half-life (in units of hours) in the primary clarifier
             of a sewage treatment plant (STP); Bio A: the biodegradation half-life (in units
             of hours) in the aeration vessel of an STP; Bio S: the biodegradation half-life
             (in units of hours) in the final settling tank of an STP. All STP parameters are
             now accessed from the main menu bar by selecting "STP". The STP program uses
             only default operating conditions of a model sewage treatment plant operating at
             25°C. No R2.
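
       Two of the rules described for LEVEL3NT in the table above can be written out
directly. The sketch below is an illustration of those rules as stated and is not the
EPI Suite™ code: the function names are hypothetical, and the 360-hour time range in the
example is an invented value, since the actual time ranges and conversion factors are
not reproduced here.

```python
COMPLETE_HALF_LIVES = 6          # "complete" degradation, as stated in the table
VP_FLOOR_MM_HG = 1.00e-15        # assumed value used when the estimate is zero


def half_life_from_complete_time(mean_time_hours):
    """Half-life implied by the mean time for 'complete' degradation."""
    return mean_time_hours / COMPLETE_HALF_LIVES


def fraction_remaining(n_half_lives):
    """Fraction of chemical remaining after n half-lives of first-order decay."""
    return 0.5 ** n_half_lives


def vapor_pressure_for_fugacity(user_vp, estimated_vp):
    """Pick the vapor pressure (mm Hg) passed to the fugacity model.

    A user-entered value takes precedence; an estimate of zero is replaced
    by the assumed floor value.
    """
    if user_vp is not None:
        return user_vp
    return VP_FLOOR_MM_HG if estimated_vp == 0 else estimated_vp


if __name__ == "__main__":
    print(half_life_from_complete_time(360.0))      # hypothetical 360-hour range mean
    print(fraction_remaining(COMPLETE_HALF_LIVES))
    print(vapor_pressure_for_fugacity(None, 0.0))
```

       Under first-order kinetics, six half-lives leave about 1.6 percent of the
original amount, which is the sense in which degradation is treated as "complete".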

-------
                             APPENDIX for 1Ai

       The upgrades to EPI Suite™ could include a module containing algorithms for
estimating the mass-transfer coefficients (MTCs) used in the EFM Category as well as
allowing for user-entered values. A recent study comparing the outputs of five
multimedia fate models demonstrated that model homogenization was possible only
when the numerical values of the dozen or so MTCs were numerically equal (Cowan, et
al., 1995). Otherwise, the computed concentration levels, mass fractions in the
compartments and the chemical residence time estimates were dramatically different,
many by orders of magnitude. Typically the numerical values of these MTCs vary by a
factor of ten at a particular environmental interface and sometimes much more
(Thibodeaux, 1996).

       The LEV3EPI module, for example, contains twelve default MTC values; these
were likely chosen by  the model developers and are embedded within the code. In
addition to having chemical species and physical property dependence, the MTCs are
also functions of parameters that characterize the sizes, fluid dynamics, etc., of the
environmental compartments.  In the future, as EFMs develop in sophistication the users
will need the option of having algorithms for estimating MTCs, including those that are
most representative of the environmental compartments into which the chemicals are
entering.

       It is possible and appropriate for EPA, with only a modest expenditure of
resources, to develop estimating algorithms for these MTCs. A sizable quantity of data
and accompanying theoretical models exists in diverse types of published literature. In
general, the tasks required in the algorithm development efforts will include the collection
and evaluation of the existing data followed by producing the appropriate theory-directed
statistical correlations needed for their estimation. These final algorithms should be
similar to those in the PER Category of the EPI Suite™. Some limited compilations of
these MTC algorithms are available in textbooks and other documents (Thibodeaux,
1996; DiToro, 2005; Trapp and Matthies, 1998). Many are embedded within existing
Agency software, EXAMS for example. However, there is no single location for
accessing such parameters for direct use by the Agency or others. By having such an EPI
Suite™ module (e.g., MTCWIN), a major input parameter for the LEV3EPI could be
definitively selected by the user, thereby eliminating one level of uncertainty that
presently exists by relying on unknown embedded default values.

-------
                             APPENDIX for IBi

The following descriptions are edited versions of the accuracy statements in the EPI
Suite™ HELP Files.
       Estimation Accuracy of WATERNT: The statistical accuracy of the current 1000
       compound training set is excellent; the correlation coefficient (R2) is 0.975, the
       standard deviation is 0.336 and the absolute mean error is 0.28. However, to be
       effective, an estimation method must be capable of making accurate predictions
       for chemicals not included in the training set. Currently, WATERNT has been
       tested on a validation dataset of 3,923 compounds. The validation set includes a
       diverse selection of chemical structures that rigorously test the predictive
       accuracy of any model. It contains many chemicals that are similar in structure to
       chemicals in the training set, but also many chemicals that are different from and
       structurally more complex than chemicals in the training set. Statistical
       performance for estimated vs. experimental log WatSol (moles/L) is: n = 3923;
       R2 = 0.86; sd = 0.869; me = 0.70.
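
The three statistics quoted above can be illustrated with a minimal sketch. It assumes that R2 is the squared Pearson correlation between estimated and experimental values, that sd is the sample standard deviation of the residuals, and that me is the mean of the absolute residuals; the HELP files excerpted here do not spell out these definitions, and the function name and inputs are hypothetical.

```python
import math


def accuracy_stats(estimated, experimental):
    """Return (n, r2, sd, me) for paired estimated and experimental values.

    Assumed definitions (not confirmed by the source text):
      r2 = squared Pearson correlation coefficient
      sd = sample standard deviation of the residuals (n - 1 denominator)
      me = mean absolute residual
    """
    n = len(estimated)
    if n != len(experimental) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")

    mean_est = sum(estimated) / n
    mean_exp = sum(experimental) / n

    # Sums of squares and cross-products for the Pearson correlation.
    sxy = sum((e - mean_est) * (x - mean_exp)
              for e, x in zip(estimated, experimental))
    sxx = sum((e - mean_est) ** 2 for e in estimated)
    syy = sum((x - mean_exp) ** 2 for x in experimental)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    r2 = (sxy * sxy) / (sxx * syy)

    # Residuals: estimated minus experimental.
    residuals = [e - x for e, x in zip(estimated, experimental)]
    mean_res = sum(residuals) / n
    sd = math.sqrt(sum((r - mean_res) ** 2 for r in residuals) / (n - 1))
    me = sum(abs(r) for r in residuals) / n
    return n, r2, sd, me
```

Applied to a validation set, such a function would take the estimated and experimental log WatSol values as its two arguments.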

       Accuracy of AOPWIN: The accuracy of the estimation methods used by the
       Atmospheric Oxidation Program can be examined by comparing a list of more
       than 640 experimentally determined hydroxyl radical rate constants to the
       program's estimated rate constants. Over 90 percent of the estimated rate
       constants for the 647 different chemicals are within a factor of two of the
       experimental value. Over 95 percent of the estimates are within a factor of three of
       the experimental value. This can be compared to the PCFAP program (Fate of Atmospheric
       Pollutants) of the USEPA GEMS software, which estimates the same rate
       constants as AOPWIN. For 617 compounds (PCFAP cannot estimate or produces
       program errors for the remaining experimental values), PCFAP is within a factor
       of two for about 49 percent of the experimental values and within a factor of 3 for
       about 65 percent. PCFAP is particularly inaccurate for many compounds
       containing nitrogen, sulfur or phosphorus. The document "Estimation Accuracy of
       the Atmospheric Oxidation Program" contains a compilation of the experimental
       rate constants used to determine the accuracy of AOPWIN and PCFAP. Each
       chemical in the compilation includes the experimental rate constant, the AOPWIN
       estimate, the PCFAP estimate, and the SMILES notation for that chemical. For
       Aromatic Compounds, one of the advantages of the SMILES interpreter used by
       AOPWIN is the ability to identify individual aromatic rings and ring structures.
       This allows the overall rate constant estimation of many aromatic compounds to
       begin with an experimentally measured value for the basic ring structure. For
       example, if 1-methylnaphthalene is entered into AOPWIN, AOPWIN finds the
       naphthalene ring and assigns it the experimentally measured value for
       naphthalene (21.6 x 10^-12 cm3/molecule-sec). It then adjusts the experimental
       naphthalene value for one methyl group attachment to an aromatic ring to yield an
       overall estimate of 56.9 x 10^-12 cm3/molecule-sec (the experimental value for
       1-methylnaphthalene is 53.0 x 10^-12). AOPWIN identifies and uses the aromatic
       rings (15) that have experimental values (x 10^-12 cm3/molecule-sec) and 7 rings are
       assigned a value based primarily upon experimentally measured ionization
       potentials (x 10^-12 cm3/molecule-sec):

Accuracy of BIOWIN: BIOWIN produces two separate MITI probability
estimates for each chemical. The first estimate is based upon the fragments
derived through linear regression. The second estimate is based upon the
fragments derived through non-linear regression. Prediction accuracies for the
training and validation sets are listed below. The validation set is completely
independent of the training set. Chemicals in the validation set were not used to
derive any fragment values. The numbers correspond to correct predictions (either
"readily degradable" or "not readily degradable"):
       Training Set: Critically Evaluated as "Readily Degradable"
              Linear Model: 201/254 (79.1%)
              Non-Linear Model: 204/254 (80.3%)
       Training Set: Critically Evaluated as "Not Readily Degradable"
              Linear Model: 284/335 (84.8%)
              Non-Linear Model: 284/335 (84.8%)
       Training Set: TOTAL
              Linear Model: 485/589 (82.3%)
              Non-Linear Model: 488/589 (82.9%)
       Validation Set: Critically Evaluated as "Readily Degradable"
              Linear Model: 105/131 (80.2%)
              Non-Linear Model: 103/131 (78.6%)
       Validation Set: Critically Evaluated as "Not Readily Degradable"
              Linear Model: 135/164 (82.3%)
              Non-Linear Model: 135/164 (82.3%)
       Validation Set: TOTAL
              Linear Model: 240/295 (81.3%)
              Non-Linear Model: 238/295 (80.7%)
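
The TOTAL rows in the listing above follow from the per-class counts. The short sketch below shows that bookkeeping; the counts are copied from the listing, while the dictionary layout and function names are illustrative assumptions, not part of BIOWIN.

```python
# (correct predictions, chemicals evaluated), copied from the listing above.
# Keys are (dataset, evaluated class, model); the layout is hypothetical.
COUNTS = {
    ("training", "readily", "linear"): (201, 254),
    ("training", "readily", "nonlinear"): (204, 254),
    ("training", "not_readily", "linear"): (284, 335),
    ("training", "not_readily", "nonlinear"): (284, 335),
    ("validation", "readily", "linear"): (105, 131),
    ("validation", "readily", "nonlinear"): (103, 131),
    ("validation", "not_readily", "linear"): (135, 164),
    ("validation", "not_readily", "nonlinear"): (135, 164),
}


def percent_correct(correct, total):
    """Percentage of chemicals whose class was predicted correctly."""
    return 100.0 * correct / total


def combined(dataset, model):
    """Sum correct predictions and totals over both evaluated classes."""
    correct = 0
    total = 0
    for (ds, _cls, mdl), (c, t) in COUNTS.items():
        if ds == dataset and mdl == model:
            correct += c
            total += t
    return correct, total


if __name__ == "__main__":
    for dataset in ("training", "validation"):
        for model in ("linear", "nonlinear"):
            c, t = combined(dataset, model)
            print(f"{dataset:10s} {model:9s} {c}/{t} = {percent_correct(c, t):.2f}%")
```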
Accuracy of HENRYWIN: The accuracy of the bond contribution method is
discussed in detail in Meylan and Howard (1991). Briefly, a correlation
coefficient (R2) of 0.97, a standard deviation (sd) of 0.34 and a mean error (me) of
0.21 were found for a 345-compound training set (all statistics apply to LWAPC
values). A 74-compound validation dataset had respective R2, sd and me statistics
of 0.96, 0.46 and 0.31. SRC's current experimental database contains 1650
compounds. Since publication of the Meylan and Howard (1991) article, the
methodology was updated (HENRYWIN version 2) by adding new bond
contribution values and new correction factors, especially for various classes of
pesticides.

At times, the bond estimate and the group estimate made by HENRYWIN may
vary significantly. Experience with HENRYWIN has shown that the difference
between bond and group methods can vary by as much as 2 orders of magnitude
for some compounds with many functional groups. The estimation from the group
method is sometimes preferred unless the bond method uses a correction factor
from Table D-3 (Appendix D) or Appendix F. A recent independent evaluation
(Altschuh et al., 1999) for a diverse set of organic chemicals found the bond
method more accurate than the group method. The group method generates
inaccurate estimates for certain types of structures, such as
hexachlorocyclohexanes (Altschuh et al., 1999). At times, averaging two widely
divergent values is appropriate. For some compounds, both methods can yield a
Henry's Law constant of 1.0 x 10^-12 atm-m3/mole or smaller. Numbers smaller than
this value may be unrealistically low. However, any organic compound with a
Henry's Law constant less than 3.0 x 10^-7 atm-m3/mole is considered essentially
non-volatile from water (Thomas, 1990). The Exposure Evaluation Branch of the U.S. EPA
(OPPT) uses a cut-off of 1.0 x 10^-8 atm-m3/mole for HLC estimates; any estimate
less than the cut-off is considered 1.0 x 10^-8 atm-m3/mole.
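
The screening rules in this paragraph can be sketched as follows. This is an illustration only: the function and constant names are hypothetical, and the sketch assumes that an estimate below the cut-off is reported as the cut-off value itself (1.0 x 10^-8 atm-m3/mole).

```python
import math

# Values in atm-m3/mole, taken from the paragraph above.
NONVOLATILE_THRESHOLD = 3.0e-7   # Thomas (1990): essentially non-volatile below this
SCREENING_CUTOFF = 1.0e-8        # cut-off quoted for HLC estimates (assumed floor value)


def apply_cutoff(hlc, cutoff=SCREENING_CUTOFF):
    """Report estimates below the cut-off as the cut-off itself (assumption)."""
    return max(hlc, cutoff)


def is_essentially_nonvolatile(hlc, threshold=NONVOLATILE_THRESHOLD):
    """True when the Henry's Law constant is below the volatility threshold."""
    return hlc < threshold


def orders_of_magnitude_apart(bond_estimate, group_estimate):
    """Absolute log10 difference between the bond and group estimates."""
    return abs(math.log10(bond_estimate) - math.log10(group_estimate))


if __name__ == "__main__":
    bond, group = 2.0e-11, 4.0e-9  # hypothetical estimates for one compound
    print(apply_cutoff(bond), apply_cutoff(group))
    print(is_essentially_nonvolatile(group))
    print(round(orders_of_magnitude_apart(bond, group), 2))
```

A check such as `orders_of_magnitude_apart` is one way to flag compounds for which the bond and group methods diverge widely and a choice between them has to be made.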

Estimation Accuracy of KOWWIN: The figures in this Help file (not shown)
illustrate KOWWIN's ability to estimate accurate log P values. The listing
compares the accuracy of KOWWIN to the ClogP™ Program (Daylight, 1995;
BioByte, 1995) statistics using SRC's Experimental Log P Database: (n = number
of compounds; R2 = correlation coefficient; sd = standard deviation; me = absolute
mean error)
      KOWWIN v1.63
              Total: n=12805; R2=0.95; sd=0.435; me=0.316
              Training: n=2474; R2=0.981; sd=0.22; me=0.16
              Validation: n=10331; R2=0.94; sd=0.47; me=0.35

      CLOGP for Windows (v1.0)
              Total: n=11735 (a); R2=0.91; sd=0.59; me=0.384
      CLOGP (UNIX version as reported by Leo, 1992)
              Total: n=7250; R2=0.96; sd=0.3
              (using equation: Log P = 0.914 CLOGP + 0.184) (b)
      (a) Taken from the current database; the difference between the entire
      database (12686) and the number used (11616) is primarily due to
      "missing fragments" in the CLOGP program. BioByte's Internet website
      reports the following statistics for its starlist: n=8942, R2=0.917, sd=0.482
      using the equation: Log P = 0.876CLOGP + 0.307.
      (b) These statistics were determined after removing large systemic deviant
      compounds and other large deviant structures where the underlying
      difficulty is conformational (Leo, A.J. 1992. 30 years of calculating Log
      Poct. QSAR Meeting. July 23, 1992).

-------
                             APPENDIX for ICi

The primary regulatory obligation in a tiered approach to risk assessment is conservatism
of the prediction at the lowest tier (model based screening), not accuracy; tolerance
towards false negatives differs between countries. The OECD HPV group is currently
running an assessment of member countries' appreciation and application of (Q)SAR
estimates. The results of this effort would be useful to the EPA in reviewing
EPI Suite™. The EPA should consider all the listed criteria in the appendix when
upgrades to the models are made.

Case study I: Water Solubility. Estimation of the water solubility of long-chain aliphatic
alcohols: comparative analysis between EPI Suite™ (WSKOWWIN), SPARC, and
measured values

In relation to an HPV submission, a comparison of water solubility estimations for
aliphatic alcohols found that, for shorter-chain alcohols (C6-C10), the modeled and
measured values were comparable. For mid-chain (C10-C14) alcohols, the EPI Suite™
model moderately overestimated the water solubility. For the longer-chain alcohols (C14-
C18), EPI Suite™ overestimated water solubility by approximately one log unit,
which could have an impact on the need for further toxicity assessment. This case
illustrates that empirical regression-driven models are more susceptible to error when
screening complex compounds with few empirical data, or data of questionable quality
close to the limit of solubility, than thermal and quantum energy driven models such as
SPARC, which are less dependent on measured values (Hilal, et al., 2003 a and b); see
the following figure.
[Figure: Long Alcohols Water Solubility. Water solubility (log scale) plotted against chain length (C6 to C18) for EPIWIN, SPARC, and experimental (mean +/- SD) values.]

-------
Case study II: Hydrolysis. Comparative analysis between EPI Suite™ (HYDROWIN),
SPARC, and measured values

For hydrolysis, SPARC, at this time, calculates only carboxylic acid ester hydrolysis rate
constants in any single or mixed solvent at any temperature. EPI Suite™ calculates
hydrolysis rate constants for esters, carbamates, epoxides, halomethanes, and alkyl
chlorides, but only in water. The SPARC residual mean square deviation of the calculated
versus observed values in water is better than 0.37, with R2 equal to 0.98, while the EPI
Suite™ R2 for 124 ester compounds is 0.965 (see appendix for list of compounds). Below
are graphs comparing the SPARC versus EPI Suite™ calculations for carboxylic acid ester
hydrolysis rate. The end result for the 61 compounds is that SPARC does slightly better
in the mean unsigned error and has a lower frequency of potential significant outliers
than EPI Suite™ does; see graphs (Hilal, personal communication, 2005; Long Chained
Aliphatic Alcohols SIAR, 2006).
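
The two comparison measures named above, mean unsigned error and frequency of large outliers, can be sketched as follows. The function names are hypothetical, the outlier threshold of 1.0 log unit is an assumption chosen for illustration, and the sample numbers are invented; none of them are taken from the SPARC or EPI Suite™ comparison itself.

```python
def mean_unsigned_error(predicted, observed):
    """Mean of |predicted - observed| over paired log rate constants."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def outlier_count(predicted, observed, threshold=1.0):
    """Number of compounds whose unsigned error exceeds the threshold.

    The default threshold (1.0 log unit) is an illustrative assumption.
    """
    return sum(1 for p, o in zip(predicted, observed) if abs(p - o) > threshold)


if __name__ == "__main__":
    # Hypothetical log hydrolysis rate constants for four compounds.
    observed = [-3.0, -4.5, -5.2, -6.1]
    model_a = [-3.2, -4.4, -5.0, -6.3]
    model_b = [-3.1, -4.6, -5.1, -7.6]
    for name, pred in (("model A", model_a), ("model B", model_b)):
        print(name,
              round(mean_unsigned_error(pred, observed), 3),
              outlier_count(pred, observed))
```

Reporting both measures side by side is useful because a model can have a similar average error to another while producing more large individual misses.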

[Figure: Hydrolysis Log K plotted against compound number (0 to 65).]

[Figure: Error (experimental +/- predicted), on a scale of 0.00 to 2.50, plotted against compound number.]

Biodegradation (BIOWIN) and other fate models are more complicated than most of the
other algorithm-driven (Q)SARs under EPI Suite™. Modeling of biodegradation has been
challenged by IUPAC due to lack of data and variability in soils (e.g., microorganism
communities) (Peijnenburg, 1994). For the Kow (octanol/water) partition coefficient, EPI
Suite™ estimations are slightly better than SPARC, especially when the log Kow < 7 (which
is borderline for any model and experimental test). At higher Kow, SPARC calculations are
better than EPI Suite™. At higher Kow values, EPI Suite™ is bound to measurements that
were later (through the slow stir method) shown to be inaccurate. SPARC models Kow as a
ratio of activity coefficient calculations. Originally, SPARC versus EPI Suite™
calculations indicated that SPARC values were too high. After the slow stir method was
applied, the experimental values changed, and the SPARC values, though they
did not change, were now in better agreement with experimental values. For boiling
point, solubility and Henry's constant, SPARC performed well, and out-performed EPI
Suite™ (R2 = 0.999) (Hilal, et al., 2003 a and b). Total mean across all EPI Suite™ R2
values: best case (using the highest R2) = 0.95 ± 0.03 SD, and worst case (lowest R2) =
0.86 ± 0.15 SD.

-------
Table 1: Compounds in graphs, in numerical order
SMILES (1-62)
O=C(OCC=C)C
c(ccc1COC(=O)C)cc1
O=C(OC(C)C=C)C
O=C(OC(C)C#C)C
O=C(OC=C)C
O=C(OC(C)(CC)C=C)C
O=C(OCC)CSC
O=C(OCC)CS(=O)C
O=C(OCC)CS(=O)(=O)C
O=C(OC)C
O=C(OC)C=CC
O=C(OCC)
O=C(OCC)CCC
O=C(OCC)C=CC
O=C(OC)C(Cl)Cl
O=C(OC)C=C(C)C
O=C(OC)C(C)=C
O=C(OC)C
O=C(OCC)C=CC
O=C(OC)
O=C(OC)C=CC
O=C(OCC)C(Cl)
O=C(OCC)C(Cl)Cl
O=C(OCC)C(F)F
O=C(OCC)C=C
O=C(OCC)C#CC
O=C(OCC)C#C
c12c(C(=O)OCC)cccc1cccc2
c12cc(C(=O)OCC)ccc1cccc2
O=C(OCC)c(cccc1)c1
O=C(OCC)C=CC(=O)OCC
O=C(OCC)C=CC(=O)OCC
O=C(OC(C)C(C))C=CC
O=C(OC(C)C)C=CC
c12c(C(=O)OC(C)C)cccc1cccc2
c12cc(C(=O)OC(C)C)ccc1cccc2
O=C(OC(C)C)C
O=C(OC(C)C)
O=C(OC(C)C)c(cccc1)c1
O=C(Oc(cc(N(=O)=O)c1)cc1)C=C
O=C(OC)C=C
c12c(C(=O)OC)cccc1cccc2
c12cc(C(=O)OC)ccc1cccc2
O=C(OCCCC)C=CC
O=C(OCCCC)C=C
O=C(OCCCC)
O=C(OCCC)C=CC
O=C(OCCC)C
O=C(OCCC)
O=C(Oc(ccc(Cl)c1)c1)C=C
O=C(Oc(ccc(C(=O)C)c1)c1)C=C
O=C(Oc(ccc(N(=O)=O)c1)c1)C(Cl)
O=C(Oc(ccc(N(=O)=O)c1)c1)C=C
c12c(C(=O)(Oc(ccc(N(=O)=O)c3)c3))cccc1cccc2
c12c(C(=O)(Oc(ccc(OC)c3)c3))cccc1cccc2
O=C(Oc(cccc1)c1)C(Cl)
O=C(Oc(cccc1)c1)C=C
O=C(Oc(cccc1)c1)C
O=C(OC(C)CC)C=CC
O=C(OC(C)CC)C

-------